Insights into factors affecting synonymous codon usage in apple mosaic virus and its host adaptability
J Plant Biotechnol 2022;49:46-60
Published online March 31, 2022
© 2022 The Korean Society for Plant Biotechnology.

R. Pourrahim ・Sh. Farzadfar

Plant Virus Research Department, Iranian Research Institute of Plant Protection (IRIPP), Agricultural Research, Education and Extension Organization (AREEO), Tehran, Iran
Correspondence to: e-mail:
Received December 13, 2021; Revised January 24, 2022; Accepted January 24, 2022.
This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
The genetic variability and population structure of apple mosaic virus (ApMV) have been studied; however, the synonymous codon usage patterns that influence the survival and fitness of ApMV have not been reported. Phylogenetic analyses of 52 ApMV coat protein (CP) sequences obtained from apple, pear, and hazelnut clustered the isolates into two groups (GI and GII). High molecular diversity in GII may indicate their recent expansion. A constant and conserved genomic composition of the CP sequences was inferred from the low codon usage bias. Nucleotide composition and relative synonymous codon usage (RSCU) analysis indicated that the ApMV CP gene is AU-rich, but G- and U-ending codons are favored when encoding amino acids. This unequal use of nucleotides, together with parity rule 2 and effective number of codons (ENC) plots, indicates that mutation pressure and natural selection jointly drive codon usage patterns in the CP gene. Of the two, selection pressure plays the more important role. Based on principal component analysis plots, ApMV seems to have originated from apple trees in Europe. However, according to the relative codon deoptimization index and codon adaptation index (CAI) analyses, ApMV exhibited the greatest fitness to hazelnut. As inferred from the similarity index analysis, hazelnut has a major role in shaping ApMV RSCU patterns, which is consistent with the CAI analysis results. This study contributes to the understanding of plant virus evolution, reveals novel information about ApMV evolutionary fitness, and helps find better ApMV management strategies.
Keywords : ApMV, codon usage patterns, mutation pressure, natural selection, host adaptation

Introduction

Understanding the evolution of virus-host interactions is important because viruses evolve rapidly through genetic recombination and mutation, can adapt to new or resistant hosts (Davino et al. 2017; Garcia-Arenal et al. 2001) and to different environmental conditions, and mostly cannot be controlled with effective chemical compounds (Elena et al. 2014). Because virus translation depends on the host cellular machinery, the interaction of a virus with a particular host can be studied through its codon usage pattern. A remarkable role of codon usage bias (CUB) in the evolution of viruses has been reported (Angellotti et al. 2007). The codon usage pattern of a virus reflects the evolutionary changes that allow it to optimize its survival and adapt to the external environment and, most importantly, to its host (Butt et al. 2014). Two major models explain codon usage bias: the natural (translational) selection model and the mutational (neutral) model (Bulmer 1991; Hershberg and Petrov 2008). The natural selection model suggests that synonymous codon usage and transfer RNA (tRNA) abundance co-adapt to optimize translational efficiency (Zhou et al. 1999). Codon usage adaptation therefore allows efficient use of ribosomes and maximizes the growth rate of fast-growing organisms (Hershberg and Petrov 2008). The mutational model hypothesizes that genetic compositional constraints affect the probability of mutational fixation, as observed in numerous RNA viruses (Adams and Antoniw 2003). The GC content is probably determined mostly by genome-wide mutation bias rather than by selective forces acting specifically on coding regions. Studies on CUB and its role in the evolution of plant viruses remain limited (Adams and Antoniw 2003). Recent advances in sequencing technologies allow the codon usage behavior of viruses to be studied (He et al. 2019; He et al. 2017; Liu et al. 2012; Xu et al. 2008). Viral coat proteins (CP) are presumed to have evolved more rapidly than proteins involved in replication and expression of virus genomes (Callaway et al. 2001), which provides a strong incentive to study the diversity of viruses based on CP genes.

Apple mosaic virus (ApMV) is a key species of subgroup III in the genus Ilarvirus (family Bromoviridae) (Bujarski et al. 2012). ApMV causes economic yield losses in pome fruits worldwide. More than 65 species of woody or herbaceous plants belonging to 19 families have been reported as natural or experimental hosts of ApMV (Brunt et al. 1996; Cieslinska and Valasevich 2016; Tzanetakis and Martin 2005). The virus is transmissible by grafting and mechanically, persists in infected propagative materials such as scions, rootstocks, and buds, and has no known vector (Fulton 1972). The genome of ApMV is divided into three single-stranded RNA segments, of which RNA3 encodes the CP and the movement protein (MP) (Bujarski et al. 2012). Phylogenetic analysis using complete CP sequences divided ApMV isolates into two major clusters. One cluster includes isolates from Maloideae and Trebouxia lichen algae, while the second includes isolates from Prunus, hop, and other woody trees (Grimova et al. 2013). No relationship has been shown between the geographic origin and the clustering of ApMV isolates (Crowle et al. 2003; Petrzik 2005).

The genetic variability and population structure of ApMV have already been studied. However, synonymous codon usage patterns and selection pressure, which provide significant information about virus evolution as well as gene expression and function, have not been reported. In this study, patterns of codon usage bias were investigated using 52 complete CP nucleotide sequences of isolates from apple (Malus domestica) and pear (Pyrus sp.), both in the family Rosaceae, and from hazelnut (Corylus sp.), in the family Betulaceae. These analyses reveal novel information about the evolutionary fitness of ApMV.

Material and Methods

Viral isolates and phylogenetic analysis

Fifty-two complete ApMV CP sequences from apple (n = 36), pear (n = 7), and hazelnut (n = 9) were retrieved from NCBI GenBank. Data on the ApMV isolates, including geographical location, host, and time of collection, are shown in Table S1. To assess the genetic diversity of ApMV, the CP sequences were aligned using CLUSTALX2 (Kumar et al. 2018). A maximum likelihood (ML) tree was reconstructed in MEGAX (Kumar et al. 2018) using the K2 + G + I model (Kimura two-parameter with gamma-distributed rates and invariant sites) with 1000 bootstrap replicates. Nucleotide diversity was estimated using the Kimura two-parameter model implemented in MEGAX (Kumar et al. 2018). Pairwise sequence identity was calculated using the SDTv1.2 program. The pairwise nucleotide diversity and identity are shown as color plots.
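The Kimura two-parameter distance used for the diversity estimates can be illustrated with a minimal Python sketch. It assumes two pre-aligned sequences of equal length and applies the plain two-parameter formula, without the gamma or invariant-site corrections; it is not the MEGAX implementation, and the function name `k2p_distance` is hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences."""
    s1 = seq1.upper().replace("U", "T")
    s2 = seq2.upper().replace("U", "T")
    # Keep only columns where both sequences carry an unambiguous base.
    pairs = [(a, b) for a, b in zip(s1, s2) if a in "ACGT" and b in "ACGT"]
    transitions = transversions = 0
    for a, b in pairs:
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    p = transitions / len(pairs)    # proportion of transitional differences
    q = transversions / len(pairs)  # proportion of transversional differences
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For example, one transition in ten aligned sites gives a distance slightly larger than the raw difference of 0.1, because the model corrects for multiple substitutions at the same site.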

Analysis of nucleotide composition

Five codons that cannot show usage bias were excluded before the compositional parameters of the ApMV CP sequences were calculated: AUG (the start codon and the only codon for Met), UGG (the only codon for Trp), and the three termination codons UAA, UGA, and UAG. The overall percent nucleotide composition and the overall GC and AU contents were estimated in MEGAX (Kumar et al. 2018). Using the CodonW 1.4.2 package, the following were calculated for the CP gene sequence of each ApMV isolate: the overall nucleotide frequencies (A%, U%, C%, and G%); the nucleotide frequencies at the third position of synonymous codons (A3%, U3%, C3%, and G3%); the G+C content at the first (GC1), second (GC2), and third (GC3) codon positions; and the mean G+C content at the first and second positions (GC1,2). Codon usage data for the different hosts were obtained from the codon usage database (Athey et al. 2017).
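The compositional parameters described above can be sketched in a few lines of Python. This is a simplified illustration, not the CodonW or MEGAX implementation: it assumes an in-frame coding sequence, drops the five excluded codons, and reports percentages. The function name `composition` and the dictionary keys are hypothetical.

```python
from collections import Counter

# Codons excluded in the text: Met, Trp, and the three stop codons.
EXCLUDED = {"AUG", "UGG", "UAA", "UAG", "UGA"}

def composition(seq):
    """Return base percentages and GC content by codon position."""
    seq = seq.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    codons = [c for c in codons if c not in EXCLUDED]
    kept = "".join(codons)
    counts = Counter(kept)
    result = {base: 100.0 * counts[base] / len(kept) for base in "AUCG"}
    for pos in range(3):
        gc = sum(c[pos] in "GC" for c in codons)
        result[f"GC{pos + 1}"] = 100.0 * gc / len(codons)
    # GC1,2 is the mean of the first- and second-position GC contents.
    result["GC12"] = (result["GC1"] + result["GC2"]) / 2
    return result
```

Calling `composition("ATGGCTGGATAA")` keeps only the codons GCU and GGA, so both GC1 and GC2 are 100% and GC3 is 0%.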

Analysis of relative synonymous codon usage (RSCU)

The RSCU value of a codon is the ratio of its observed frequency to the frequency expected if all synonymous codons for the same amino acid were used equally (Sharp and Li 1986). An RSCU value of 1.0 indicates no bias for that codon; values above 1.0 indicate that the codon is used more often than expected, and values below 1.0 less often. Codons with RSCU values < 0.6 and > 1.6 are regarded as "underrepresented" and "overrepresented", respectively (Sharp et al. 1986).
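The RSCU definition can be made concrete with a short Python sketch. It assumes the standard genetic code and an in-frame coding sequence, and it excludes Met, Trp, and stop codons so that 59 codons remain; the function name `rscu` is illustrative and the code is not the CodonW implementation.

```python
from collections import Counter
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def rscu(seq):
    """Return {codon: RSCU} for the synonymous codons present in seq."""
    seq = seq.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa not in "*MW":  # skip stop, Met, and Trp codons
            families.setdefault(aa, []).append(codon)
    values = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from this sequence
        expected = total / len(codons)  # count under equal usage
        for c in codons:
            values[c] = counts[c] / expected
    return values
```

For a sequence with two UUU codons and one UUC codon, the expected count per Phe codon is 1.5, so UUU receives an RSCU of about 1.33 and UUC about 0.67. Within each amino acid, the RSCU values sum to the number of synonymous codons.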

Analysis of the effective number of codons (ENC)

The magnitude of synonymous codon usage bias in the ApMV CP gene was estimated by ENC analysis. ENC values range from 20 (extreme bias, with only one codon used per amino acid) to 61 (no bias). Generally, highly expressed genes have lower ENC values and a stronger preference for particular codons, termed optimal codons, whereas weakly expressed genes have higher ENC values, indicating that synonymous codons are used more equally (Wright 1990). ENC values were determined using CodonW v1.4.2.

ENC-GC3s and Neutrality plot analyses

The relative effects of mutation pressure and natural selection on codon usage bias were analyzed by plotting ENC against GC3s (ENC-plot). Points that lie on the standard curve indicate that mutation pressure is the sole factor driving codon usage bias. If selection is the main force, the ENC values lie below the standard curve (Wright 1990). In addition, a neutrality analysis was performed to determine the relative influence of natural selection and mutation pressure on codon usage patterns of the ApMV CP gene by plotting the GC1,2s values of the synonymous codons against the GC3s values. GC3s is the G+C content at the third codon position, and GC1,2s is the average of GC1 and GC2. The slope of the regression line between GC3s and GC1,2s measures the contribution of mutation pressure. A regression line close to the diagonal (slope = 1.0) indicates weak or no external selection pressure. Conversely, deviation of the regression line from the diagonal indicates a considerable effect of natural selection on codon usage bias.
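The standard curve of the ENC-plot can be sketched in Python using the expected-ENC expression of Wright (1990), ENC = 2 + s + 29 / (s^2 + (1 - s)^2), where s is GC3s expressed as a fraction. The function names are illustrative, and the comparison helper is only a convenience assumed here, not part of CodonW.

```python
def expected_enc(gc3s):
    """Expected ENC when codon usage is shaped by GC3s composition alone.

    gc3s is a fraction between 0 and 1 (for example 0.51 for 51%).
    """
    s = gc3s
    return 2.0 + s + 29.0 / (s ** 2 + (1.0 - s) ** 2)

def lies_below_curve(enc_observed, gc3s):
    """True when an observed ENC value falls under the standard curve."""
    return enc_observed < expected_enc(gc3s)
```

With the mean values reported in this study (GC3s of 51.02% and ENC of 54.46), the expected ENC is roughly 60.5, so the observed point falls below the curve.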

Parity rule 2 (PR2) and Principal component (PCA) analysis

The parity rule 2 (PR2) plot shows the influence of natural selection and mutation pressure on the codon usage of each gene by plotting A3/(A3 + U3) against G3/(G3 + C3). The center of the PR2 plot, where both values are 0.5, corresponds to A = U and G = C (Sueoka 1999). Points at the center indicate no bias between mutation pressure and selection pressure, whereas points away from the center indicate bias. Furthermore, the major trends in codon usage variation among the ApMV CP sequences were examined by principal component analysis (PCA) (Zhou et al. 1999). The first and second principal axes were plotted for the isolates according to phylogroup.
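The PR2 coordinates described above reduce to two ratios, sketched here in Python. The function name `pr2_coordinates` is illustrative. The inputs in the usage note are the mean third-position compositions reported in the Results, used only as an example; a full PR2 analysis would compute the ratios per sequence.

```python
def pr2_coordinates(a3, u3, g3, c3):
    """Return (x, y) for a PR2 plot.

    x = G3 / (G3 + C3) is the GC bias on the abscissa.
    y = A3 / (A3 + U3) is the AU bias on the ordinate.
    The point (0.5, 0.5) corresponds to A = U and G = C.
    """
    x = g3 / (g3 + c3)
    y = a3 / (a3 + u3)
    return x, y
```

Using the mean values A3s = 18.49, U3s = 28.59, G3s = 31.90, and C3s = 21.01 gives a point with x above 0.5 and y below 0.5, that is, G used more than C and U used more than A at the third position.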

Analysis of CAI, RCDI, and similarity index (SiD) indexes

The codon adaptation index (CAI) values for the ApMV CP sequences were determined using the CAIcal server. CAI values range from 0.0 to 1.0 and indicate the degree of adaptation to the host: a higher CAI value indicates stronger adaptation of a sequence to the host, and a lower value weaker adaptation (Puigbò et al. 2010). In addition, a relative codon deoptimization index (RCDI) value of 1.0 indicates that the virus follows the host codon usage pattern, whereas RCDI values greater than 1.0 indicate lower compatibility. The RCDI values were determined using the RCDI/eRCDI server. The influence of the codon usage bias of the hosts was measured by the SiD value. The SiD was determined as follows:

In this formula, ai is the RSCU value of each of the 59 synonymous codons of the ApMV CP gene, and bi is the RSCU value of the same codon in the potential host. SiD values range from 0.0 to 1.0, and higher values indicate that the host has a stronger effect on codon usage.
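A minimal Python sketch of the SiD calculation follows. It assumes the widely used definition R(A,B) = sum(ai * bi) / sqrt(sum(ai^2) * sum(bi^2)) and D(A,B) = (1 - R(A,B)) / 2, which matches the symbols ai and bi described here; this definition is an assumption, as are the function name `similarity_index` and the use of dictionaries keyed by codon.

```python
import math

def similarity_index(virus_rscu, host_rscu):
    """SiD between virus and host RSCU vectors (assumed definition).

    Both arguments map codons to RSCU values. Only codons present
    in both dictionaries are compared.
    """
    codons = sorted(set(virus_rscu) & set(host_rscu))
    a = [virus_rscu[c] for c in codons]
    b = [host_rscu[c] for c in codons]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    r = dot / norm          # cosine similarity of the two RSCU vectors
    return (1.0 - r) / 2.0  # D(A, B)
```

In practice the two dictionaries would hold the 59 RSCU values of the ApMV CP gene and of one host, for example the output of an RSCU calculation on each.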


Results

Phylogenetic analyses clustered the 52 ApMV isolates into two main groups: apple and pear isolates fell into one group (GI), whereas isolates from hazelnut clustered in the other (GII) (Figure 1a). Nucleotide identity ranged from 88 to 100%, with higher identity (Figure 1b) and lower diversity (Figure 1c) in GI. Nucleotide distances ranged from 0.0 to 13.5% in GI and from 13.5 to 19.6% in GII (Figure 1c).

Fig. 1. (a) Maximum likelihood (ML) tree, (b) two-dimensional nucleotide identity plot, and (c) two-dimensional nucleotide diversity plot showing the relationship among 52 apple mosaic virus (ApMV) isolates. The accession number, name of each isolate, and the country of its origin are shown

Coat Protein Nucleotide Composition Analysis

G and A were the most frequent nucleotides in the ApMV CP sequences, with mean compositions of 28.84 ± 0.61% and 26.85 ± 0.54%, respectively, compared with T (U) (24.48 ± 0.84%) and C (19.80 ± 0.72%) (Table S2). The composition at the third position of synonymous codons was markedly different. The most frequent nucleotide was G3s (31.90 ± 1.64%), followed by T3s (28.59 ± 1.76%), C3s (21.01 ± 1.57%), and A3s (18.49 ± 1.68%). The AU and GC contents of the CP coding sequences were 51.34 ± 1.22% and 48.65 ± 1.22%, respectively, indicating an AU-biased composition in the ApMV CP gene. The mean GC contents at the first and second positions (GC1,2s) and at the third position (GC3s) were 46.53 ± 0.53% and 51.02 ± 0.02%, respectively.

Substantial variation in codon usage by ApMV CP gene

RSCU analysis was performed to estimate the codon usage patterns of the ApMV CP sequences (Table 1). Twelve of the 18 most frequently used codons were G- or U-ending (six ending in G and six in U), while the remaining six ended in A or C (Table 1). This result indicates that U- and G-ending codons are favored in the ApMV CP gene. Regardless of host, RSCU values > 1.6 were detected for nine of the preferred synonymous codons (UUG, GUG, AGU, CCG, ACG, GCU, CAA, AGG, and GGU), with the highest value for AGU (2.57). The RSCU of each codon was also calculated for each ApMV isolate to examine the variation in codon usage bias across the CP gene, and the results indicated three main clusters of codons (Figure 2). The first cluster generally included overrepresented codons (RSCU > 1) and contained A/U-ending codons (19 of 59 codons) and G/C-ending codons (14 of 59 codons). The second cluster consisted mostly of G/C-ending codons (11 of 59 codons), together with six A/U-ending codons, and was generally underrepresented (RSCU < 1). The last and smallest cluster consisted of five A/U-ending codons (UCA, GUA, GCA, CUU, and UAU) and four G/C-ending codons (CGG, CUC, UGC, and CAG) that were underrepresented. Among the underrepresented codons, the two serine codons UCA and UCG were found in most of the hazelnut isolates.

Table 1. Relative synonymous codon usage (RSCU) values of the 59 codons encoding 18 amino acids in the coat protein gene of apple mosaic virus, by host

Codon aa Apple Pear Hazelnut All
UUU F 1.04* 1.05 1.11 1.07
UUC F 0.96 0.95 0.89 0.93
UUA L 1.29 2.03 1.13 1.48
UUG L 2.26 2.03 2.38 2.22
CUU L 0.74 0.44 0.62 0.60
CUC L 0.09 0.04 0.06 0.06
CUA L 0.51 0.75 0.62 0.63
CUG L 1.12 0.71 1.19 1.01
AUU I 1.00 1.01 0.89 0.97
AUC I 0.90 0.89 1.00 0.93
AUA I 1.10 1.09 1.11 1.10
GUU V 0.86 1.24 0.81 0.97
GUC V 1.10 0.63 1.06 0.93
GUA V 0.17 0.36 0.22 0.25
GUG V 1.87 1.77 1.92 1.85
UCU S 1.16 0.85 1.20 1.07
UCC S 1.51 1.25 1.63 1.46
UCA S 0.03 0.23 0.09 0.12
UCG S 0.11 0.45 0.00 0.19
AGU S 2.55 2.66 2.49 2.57
AGC S 0.64 0.57 0.60 0.60
CCU P 0.79 0.91 0.85 0.85
CCC P 0.45 0.32 0.46 0.41
CCA P 0.64 1.03 0.64 0.77
CCG P 2.13 1.74 2.05 1.97
ACU T 1.13 0.86 1.20 1.06
ACC T 0.46 0.52 0.46 0.48
ACA T 0.73 0.67 0.80 0.73
ACG T 1.69 1.95 1.54 1.73
GCU A 1.42 1.82 1.69 1.64
GCC A 1.26 1.27 1.16 1.23
GCA A 0.47 0.22 0.31 0.33
GCG A 0.84 0.69 0.84 0.79
UAU Y 0.77 0.37 0.73 0.62
UAC Y 1.23 1.63 1.27 1.38
CAU H 0.49 0.39 0.43 0.44
CAC H 1.51 1.61 1.57 1.56
CAA Q 1.48 1.89 1.52 1.63
CAG Q 0.52 0.11 0.48 0.37
AAU N 1.34 1.38 1.42 1.38
AAC N 0.66 0.63 0.58 0.62
AAA K 0.54 0.59 0.56 0.56
AAG K 1.46 1.41 1.44 1.44
GAU D 1.21 1.52 1.24 1.32
GAC D 0.79 0.48 0.76 0.68
GAA E 0.94 1.15 0.98 1.02
GAG E 1.06 0.85 1.02 0.98
UGU C 0.57 1.04 0.50 0.70
UGC C 1.43 0.96 1.50 1.30
CGU R 0.47 0.62 0.43 0.51
CGC R 0.28 0.62 0.38 0.43
CGA R 1.49 0.98 1.39 1.29
CGG R 0.01 0.00 0.00 0.00
AGA R 1.37 1.73 1.39 1.50
AGG R 2.37 2.04 2.41 2.27
GGU G 1.88 1.97 2.13 1.99
GGC G 0.79 0.54 0.84 0.72
GGA G 0.92 1.11 0.80 0.94
GGG G 0.41 0.37 0.23 0.34

*The most frequently used codons are shown in bold.

Fig. 2. The relative synonymous codon usage (RSCU) value of each codon in the coat protein (CP) gene of each ApMV isolate is shown. Rows indicate the 59 non-degenerate, non-stop codons. Columns represent the 52 ApMV isolates. Over-represented codons (RSCU > 1) are shown by blue cells and under-represented codons (RSCU < 1) by red cells. Codons were grouped into three main clusters on the basis of RSCU values: under-represented A/U- and G/C-ending codons (cluster I, grey bar), under-represented codons mostly ending in G/C (cluster II, orange bar), and over-represented codons mostly ending in A/U (cluster III, green bar)

High genomic stability and low CUB of the ApMV CP gene

The magnitude of codon usage bias in the ApMV CP gene was measured by the ENC value. The mean ENC value of 54.46 ± 2.04 (Table S2) indicates low codon usage bias in all ApMV CP coding sequences and an approximately constant and conserved genomic composition. Among hosts, the highest ENC values were found for isolates from apple and the lowest for isolates from hazelnut (Figure 3).

Fig. 3. Effective number of codons (ENC) values of the CP gene of ApMV isolates from different hosts. Apple, pear, and hazelnut hosts are shown as orange, green, and blue dots, respectively

Trends in codon usage variation

The major trends in codon usage variation of the ApMV CP gene were examined by PCA (Figure 4a). Among the three hosts, apple and pear isolates overlapped in several places, suggesting that the main codon usage trend is similar in these two hosts (Figure 4a). In addition, the principal axes were plotted according to the geographical locations of the ApMV isolates (Figure 4b). In this analysis, the isolates did not cluster according to the geographical location from which they were collected (Table S2). The clustering of most apple isolates near the origin of the PCA plot (Figure 4a) suggests that the virus may have originated in the apple host.

Fig. 4. Principal component analysis (PCA) based on the RSCU values of all synonymous codons of the ApMV CP gene. Correspondence analysis of codon usage patterns in the ApMV CP gene based on (a) hosts and (b) geographical locations from where the isolates were obtained

ENC and Neutrality Plots analysis

When ENC values were plotted against GC3s values, the data points for all three hosts clustered together below the standard ENC curve (Figure 5). Data points that fall below the standard curve indicate that codon usage is affected more by natural selection than by mutation pressure. In addition, the relative contributions of mutation pressure and natural selection to codon usage in the ApMV CP gene were determined by neutrality analysis of GC1,2s against GC3s for all sequences, with the results grouped by ApMV host (Figure 6). A significant positive correlation (r2 = 0.5022, p = 4.153 × 10^-9) was found between the GC1,2s and GC3s values for the CP gene. The slope of the linear regression indicated a relative neutrality of 13.02%; that is, mutation pressure accounted for 13.02% of the codon usage bias, whereas natural selection accounted for 86.98%. In other words, the relative constraint on GC3s is 86.98% (a slope of 1 would correspond to 0% constraint, or 100% neutrality), indicating that selection pressure is dominant over mutation in shaping the codon usage bias of the ApMV CP gene.

Fig. 5. ENC plot of the ApMV CP sequences, with ENC values plotted against GC3s for the three ApMV hosts. The standard curve, which gives the expected ENC when codon usage bias is determined by GC3s composition only, is indicated by purple points. Hazelnut, apple, and pear hosts are shown by blue, red, and green dots, respectively
Fig. 6. Neutrality plot analyses of GC3s versus GC1,2s are shown for all ApMV CP sequences. The solid blue line is the linear regression of GC3s versus GC1,2s
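The neutrality analysis can be sketched as an ordinary least-squares regression of GC1,2s on GC3s, followed by the conversion of the slope into relative contributions. This is a minimal Python illustration that assumes the reported 13.02% corresponds to a regression slope of 0.1302; the function names and the toy data in the usage note are hypothetical.

```python
def neutrality_regression(gc3s, gc12s):
    """Least-squares slope and intercept of GC1,2s regressed on GC3s."""
    n = len(gc3s)
    mean_x = sum(gc3s) / n
    mean_y = sum(gc12s) / n
    sxx = sum((x - mean_x) ** 2 for x in gc3s)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gc3s, gc12s))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def pressure_split(slope):
    """Convert a neutrality-plot slope into percentage contributions.

    Returns (mutation pressure %, natural selection %). A slope of 1
    means complete neutrality, and a slope of 0 means complete constraint.
    """
    return 100.0 * slope, 100.0 * (1.0 - slope)
```

For example, `pressure_split(0.1302)` returns approximately 13.02 for mutation pressure and 86.98 for natural selection, the split reported for the ApMV CP gene.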

Analysis of Parity Rule 2

The PR2-bias plot of the ApMV CP gene is shown in Figure 7. All ApMV CP genes showed a similar distribution along the ordinate, and all were located in the lower right area of the plot (the G > C side). The PR2-bias plot indicates unequal use of G versus C and of A versus U at the third codon position. This unequal nucleotide use indicates that a combination of mutation pressure and natural selection drives the codon usage patterns in the CP gene, with selection pressure playing the more important role (Figure 7).

Fig. 7. The AU [A3%/(A3% + U3%)] and GC [G3%/(G3% + C3%)] bias of the ApMV CP gene is shown. Hazelnut, apple, and pear hosts are shown by blue, red, and green dots, respectively

Codon usage and host adaptation

CAI and RCDI analyses were performed to assess the codon usage optimization and host adaptation of ApMV. The mean CAI values of the CP coding sequences were 0.693, 0.678, and 0.630 for hazelnut, apple, and pear, respectively (Figure 8). These results show that ApMV adaptation was highest for hazelnut and lowest for pear. In addition, the mean RCDI values were highest for pear (1.975), followed by apple (1.792) and hazelnut (1.715), which shows that codon usage deoptimization was greatest for pear (Figure 8). SiD values were also calculated to investigate how the hosts' codon usage patterns influence the ApMV CP codon usage pattern (Figure 9). The SiD value for hazelnut was greater than those for apple and pear, suggesting that hazelnut had a stronger influence on the ApMV CP gene than apple and pear.

Fig. 8. The codon adaptation index (CAI) and relative codon deoptimization index (RCDI) analyses of the ApMV CP gene sequence with respect to natural hosts. The x-axis shows sequences determined in various hosts
Fig. 9. The similarity index (SiD) of the ApMV CP gene for the three natural hosts, hazelnut, apple, and pear, is shown in blue, orange, and green colors, respectively. The x-axis shows sequences identified in distinct hosts
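The CAI values reported above were obtained with the CAIcal server. For orientation only, the underlying index can be sketched in Python as the geometric mean of the relative adaptiveness of each codon, where the relative adaptiveness is the host's usage of a codon divided by the usage of the most frequent synonymous codon for the same amino acid. The function names, the input structures, and the choice to skip zero-weight codons are assumptions of this sketch, not CAIcal behavior.

```python
import math

def relative_adaptiveness(host_counts, families):
    """Weight w for each codon from a host codon usage table.

    host_counts maps codons to counts in the host reference set.
    families is an iterable of tuples of synonymous codons.
    """
    weights = {}
    for codons in families:
        best = max(host_counts.get(c, 0) for c in codons)
        if best == 0:
            continue  # amino acid absent from the host table
        for c in codons:
            weights[c] = host_counts.get(c, 0) / best
    return weights

def cai(seq, weights):
    """Geometric mean of codon weights over an in-frame sequence."""
    seq = seq.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3
    logs = []
    for i in range(0, usable, 3):
        w = weights.get(seq[i:i + 3])
        if w:  # skips codons without a weight and zero-weight codons
            logs.append(math.log(w))
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))
```

With a toy host table in which UUC is used twice as often as UUU, the sequence UUUUUC scores the square root of 0.5, about 0.71, because one of its two codons is the host-preferred one.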

Discussion

Identifying codon usage patterns provides important information about host-pathogen co-evolution, such as the adaptation of pathogens to hosts and the molecular evolution of genes (Butt et al. 2016; He et al. 2019; Pandit and Sinha 2011; Zhang et al. 2019). Compared with eukaryotic and prokaryotic organisms, the importance of CUB in the evolution of plant viruses has received less attention. In this study, we analyzed synonymous codon usage in CP sequences from 52 ApMV isolates to understand the molecular evolution of the virus under the influence of multiple viral and host factors. Codon usage bias, the preference for one codon over its synonyms, can be significantly influenced by overall genomic composition (Jenkins and Holmes 2003). Nucleotide composition analysis indicated that the ApMV CP gene was AU-rich. However, codons with U or G in the third position appear to be preferred in the ApMV CP gene, which indicates possible codon usage bias (Table 1 and Figure 1). The uneven usage of A3/U3 and G3/C3 nucleotides in the AU-rich CP genes in this study shows that the compositional patterns of the ApMV CP sequences are more complex than the GC- and/or AU-rich compositions commonly observed in most virus genes. This unequal use of nucleotides indicates overlapping influences of mutational pressure and natural selection on the codon preferences in the present CP gene sequences, as previously reported for Citrus tristeza virus (CTV) (Biswas et al. 2019). The RSCU analysis also indicated that ApMV coding sequences exhibit a strong preference for G- and U-ending codons (Table 1).

Given the bias toward G- and U-ending codons in the ApMV CP gene sequences, we compared this bias among the different hosts of ApMV using ENC analysis. Generally, a smaller ENC value indicates stronger codon usage bias, and ENC values below 35 indicate genes with considerable codon bias. Here, the mean ENC value was 54.46 (Table S2), which indicates a slightly biased, relatively conserved, and stable coding sequence composition among the different isolates. In addition, among the three hosts, isolates from apple had higher mean ENC values and therefore lower codon usage bias than isolates from hazelnut and pear (Figure 3). Low codon usage bias has previously been reported for other plant viruses, including begomoviruses (Xu et al. 2008), Papaya ringspot virus (PRSV) (Chakraborty et al. 2015), Citrus tristeza virus (CTV) (Biswas et al. 2019), Rice stripe virus (RSV) (He et al. 2017), and Potato virus M (PVM) (He et al. 2019). Low CUB might help ApMV maintain fitness in host species with potentially distinct codon preferences. In RNA virus populations, faster replicators are favored because the virus shares the translational machinery with the host (Elena and Sanjuán 2005). Because the RNA-dependent RNA polymerase (RdRP) lacks 3'-5' proofreading activity, a high replication rate sometimes decreases population fitness by introducing deleterious mutations into the viral genome (Elena and Sanjuán 2005). A lower replication rate increases fidelity, which leads to better fitness of the virus population. Thus, low CUB gives RNA viruses an advantage for efficient replication in host cells by reducing competition between the virus and the host for the synthesis machinery (Jenkins and Holmes 2003).

The major trends in codon usage variation of the ApMV CP gene were examined by PCA of RSCU values (Figure 4a). Among the three hosts, apple and pear isolates overlapped in several places, which suggests that the main codon usage trend is similar in these two hosts. In addition, the principal axes were plotted by host and by geographical location. The clustering of most apple isolates near the origin of the PCA plot (Figure 4a) suggests that the virus may have originated in the apple host. In contrast, hazelnut isolates might have evolved independently owing to biological variation and/or differences in dissemination. ApMV was first isolated and described from apple in the early 1930s (Bradford and Joly 1933) and later from other hosts, including pear and hazelnut (Brunt et al. 1996). Based on the PCA plots, it was inferred that ApMV originated from apple trees in Europe (Figure 4b, Figure S1). Apple trees are cultivated worldwide and are the most widely grown species in the genus Malus. The apple tree originated in Central Asia, where its wild ancestor, Malus sieversii, is still found today. Apples have been grown for thousands of years in Asia and Europe and were brought to North America by European colonists. The grouping within a single cluster of ApMV isolates collected thousands of miles apart (Figure 1a and 1b) indicates an important role for the movement of the virus's natural hosts. This analysis shows that ApMV isolates might have evolved independently in two clusters after diverging from a common ancestor. In addition, the natural hosts present within the area of infection and the susceptibility of those hosts may have affected codon usage patterns in the ApMV CP gene. Besides nucleotide composition, the ENC plot is used to identify differences in codon usage among genes in various organisms (Comeron and Aguadé 1998).
When the ENC and GC3s values of the ApMV CP gene were plotted, none of the isolates fell on the standard curve (Figure 5), indicating that selection pressure is the major factor driving codon usage bias in the CP gene of ApMV. The neutrality plot (GC3s versus GC1,2s) was used to determine the effects of mutation pressure and natural selection on codon usage patterns, and it showed that the influence of natural selection dominates over mutation pressure (Figure 6). Mutational pressure has been reported to have a major role in the CUB of plant viruses (Adams and Antoniw 2003). However, the present study shows that both natural selection and mutational pressure influence the CUB in plant viruses (Chakraborty et al. 2015).

It has been proposed that, if mutation pressure alone shaped synonymous codon usage bias, the frequencies of A and U (T) should be equal, as should those of G and C, at the third position of synonymous codons (Wang et al. 2016). PR2 plot analysis indicated that these frequencies were not equal at the third codon position (Figure 7). The AU bias in the ApMV CP gene demonstrates the potential influence of natural selection on codon usage patterns. Pathogen-host interactions can affect the dynamics, emergence, genetic divergence, and evolution of infectious diseases (Wang et al. 2016; Zhang et al. 2013). CAI is considered an index of gene expression and can be used to evaluate the adaptation of viral genes to their hosts. The highest CAI value was calculated for hazelnut, indicating that natural selection from this host has influenced the codon usage patterns (Figure 8). As inferred from the SiD analysis (Figure 9), hazelnut has a greater effect on shaping ApMV RSCU patterns, which is in accord with the CAI analysis.

Although apple has long been considered the primary ApMV host, a strong link between ApMV and hazelnut was observed in this study. Our findings show that overall codon usage within the ApMV CP gene is slightly biased. The evolution of ApMV perhaps reflects a dynamic process of mutation and natural selection that adapts its codon usage to different environments and hosts. This study contributes to the understanding of plant virus evolution, reveals novel information about the evolutionary fitness of ApMV, and may help to find better management strategies for ApMV.

Supplemental Materials

Acknowledgements

We gratefully thank Dr. Hassan Ebrahimi, Department of Advanced Technology Fusion, Graduate School of Science and Engineering, Saga University, 1 Honjo-Machi, Saga 840-8502, Japan, for kindly supporting the CodonW analysis and mathematical equations in this research.


The Iranian Research Institute of Plant Protection-IRIPP (Project No. 961463) partly supported this work.

Conflicts of interest/Competing interests

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Availability of data and material

The data sets analyzed during the present study are available in the GenBank repository.

Code availability

No custom or special code or mathematical algorithm was used in this study.

Authors’ contributions

All authors contributed equally.

Ethics approval

The authors confirm that the ethical policy of the journal, as mentioned on the journal’s author guidelines page, was ensured, and no ethical approval was required for this study as no samples or questionnaires were collected from animals or humans.

Consent to participate

Written informed consent for study participation was obtained from all individual participants.

Consent for publication

Written informed consent for study publication was obtained from all individual participants.

  1. Adams MJ, Antoniw JF (2003) Codon usage bias amongst plant viruses. Archives of Virology 149: 113-135
  2. Angellotti MC, Bhuiyan SB, Chen G, Wan XF (2007) CodonO: Codon usage bias analysis within and across genomes. Nucleic Acids Research 35: 132-136
  3. Athey J, Alexaki A, Osipova E, Rostovtsev A, Santana-Quintero LV, Katneni U, Simonyan V, Kimchi-Sarfaty C (2017) A new and updated resource for codon usage tables. BMC Bioinformatics 18: 391
  4. Biswas K, Palchoudhury S, Chakraborty P, Bhattacharyya U, Ghosh D, Debnath P et al (2019) Codon usage bias analysis of Citrus tristeza virus: Higher codon adaptation to Citrus reticulata host. Viruses 11: E331
  5. Bradford FC, Joly L (1933) Infectious variegation in the apple. Journal of Agricultural Research 46: 901-908
  6. Brunt AA, Crabtree K, Dallwitz MJ, Gibbs AJ, Watson L (1996) Viruses of plants: Description and lists from the VIDE database. CAB International, Wallingford, UK, pp 1484
  7. Bujarski J, Figlerowicz M, Gallitelli D, Roossinck MJ, Scott SW (2012) Family Bromoviridae. In: King AMQ, Adams MJ, Carstens EB, Lefkowitz EJ (eds) Virus Taxonomy: Classification and Nomenclature of Viruses. Ninth Report of the International Committee on Taxonomy of Viruses. Elsevier, San Diego, pp 965-976
  8. Bulmer M (1991) The selection-mutation-drift theory of synonymous codon usage. Genetics 12: 897-907
  9. Butt AM, Nasrullah I, Qamar R, Tong Y (2016) Evolution of codon usage in Zika virus genomes is host and vector specific. Emerging Microbes & Infections 5: 1-14
  10. Butt AM, Nasrullah I, Tong Y (2014) Genome-wide analysis of codon usage and influencing factors in chikungunya viruses. PLoS ONE 9
  11. Callaway A, Giesman-Cookmeyer D, Gillock ET, Sit TL, Lommel SA (2001) The multifunctional capsid proteins of plant RNA viruses. Annual Review of Phytopathology 39: 419-460
  12. Chakraborty P, Das S, Saha B, Sarkar P, Karmakar A, Saha A et al (2015) Phylogeny and synonymous codon usage pattern of Papaya ringspot virus coat protein gene in the sub-Himalayan region of north-east India. Canadian Journal of Microbiology 61: 555-564
  13. Cieslinska M, Valasevich N (2016) Characterization of Apple mosaic virus isolates detected in hazelnut in Poland. Journal of Plant Diseases and Protection 123: 187-192
  14. Comeron JM, Aguadé M (1998) An evaluation of measures of synonymous codon usage bias. Journal of Molecular Evolution 47: 268-274
  15. Crowle DR, Pethybridge SJ, Leggett GW, Sherriff LJ, Wilson CR (2003) Diversity of the coat protein-coding region among Ilarvirus isolates infecting hop in Australia. Plant Pathology 52: 655-662
  16. Davino S, Panno S, Arrigo M, La Rocca M, Caruso AG, Lo Bosco G (2017) Planthology: an application system for plant diseases management. Chemical Engineering Transactions 58: 619-624
  17. Elena SF, Fraile A, Garcia-Arenal F (2014) Evolution and emergence of plant viruses. Advances in Virus Research 88: 161-191
  18. Elena SF, Sanjuán R (2005) Adaptive value of high mutation rates of RNA viruses: Separating causes from consequences. Journal of Virology 79: 11555-11558
  19. Fulton RW (1972) Apple mosaic virus. CMI/AAB Descriptions of plant viruses. No. 83
  20. Garcia-Arenal F, Fraile A, Malpica JM (2001) Variability and genetic structure of plant virus populations. Annual Review of Phytopathology 39: 157-186
  21. Grimova L, Winkowska L, Rynek P, Svoboda P, Petrzik K (2013) Reflects the coat protein variability of apple mosaic virus host preference? Virus Genes 47: 119-125
  22. He M, Guan SY, He CQ (2017) Evolution of rice stripe virus. Molecular Phylogenetics and Evolution 109: 343-350
  23. He Z, Gan H, Liang X (2019) Analysis of synonymous codon usage bias in Potato virus M and its adaption to hosts. Viruses 11: E752
  24. Hershberg R, Petrov DA (2008) Selection on codon bias. Annual Review of Genetics 42: 287-299
  25. Jenkins GM, Holmes EC (2003) The extent of codon usage bias in human RNA viruses and its evolutionary origin. Virus Research 92: 1-7
  26. Kumar S, Stecher G, Li M, Knyaz C, Tamura K (2018) MEGA X: Molecular evolutionary genetics analysis across computing platforms. Molecular Biology and Evolution 35: 1547-1549
  27. Liu XS, Zhang YG, Fang YZ, Wang Y-Lu (2012) Patterns and influencing factor of synonymous codon usage in porcine circovirus. Virology Journal 9: 1-9
  28. Pandit A, Sinha S (2011) Differential trends in the codon usage patterns in HIV-1 genes. PLoS ONE 6
  29. Petrzik K (2005) Capsid protein sequence gene analysis of Apple mosaic virus infecting pears. European Journal of Plant Pathology 111: 355-360
  30. Puigbò P, Aragonès L, Garcia-Vallvé S (2010) RCDI/eRCDI: A web-server to estimate codon usage deoptimization. BMC Research Notes 3: 87
  31. Sharp PM, Li WH (1986) An evolutionary perspective on synonymous codon usage in unicellular organisms. Journal of Molecular Evolution 24: 28-38
  32. Sharp PM, Tuohy TMF, Mosurski KR (1986) Codon usage in yeast: Cluster analysis clearly differentiates highly and lowly expressed genes. Nucleic Acids Research 14: 5125-5143
  33. Sueoka N (1999) Translation-coupled violation of Parity Rule 2 in human genes is not the cause of heterogeneity of the DNA G+C content of third codon position. Gene 238: 53-58
  34. Tzanetakis IE, Martin RR (2005) First report of strawberry as a natural host of Apple mosaic virus. Plant Disease 89: 431
  35. Wang H, Liu S, Zhang B, Wei W (2016) Analysis of synonymous codon usage bias of Zika virus and its adaption to the hosts. PLoS ONE 11
  36. Wright F (1990) The 'effective number of codons' used in a gene. Gene 87: 23-29
  37. Xu X, Liu Q, Fan L, Cui X, Zhou X (2008) Analysis of synonymous codon usage and evolution of begomoviruses. Journal of Zhejiang University of Science and Technology 9: 667-674
  38. Zhang W, Zhang L, He W, Zhang X, Wen B, Wang C et al (2019) Genetic evolution and molecular selection of the HE gene of Influenza C virus. Viruses 11: E167
  39. Zhang Z, Dai W, Wang Y, Lu C, Fan H (2013) Analysis of synonymous codon usage patterns in torque teno sus virus 1 (TTSuV1). Archives of Virology 158: 145-154
  40. Zhou J, Liu WJ, Peng SW, Sun XY, Frazer I (1999) Papillomavirus capsid protein expression level depends on the match between codon usage and tRNA availability. Journal of Virology 73: 4972-4982
