Chinese Journal of Oceanology and Limnology, 2016, Vol. 34, Issue 6: 1258-1268
http://dx.doi.org/10.1007/s00343-016-5129-7
Institute of Oceanology, Chinese Academy of Sciences

Article Information

YANG Feifei(杨菲菲), XU Donghui(徐东会), ZHUANG Yunyun(庄昀筠), HUANG Yousong(黄有松), YI Xiaoyan(衣晓燕), CHEN Hongju(陈洪举), LIU Guangxing(刘光兴), ZHANG Huan(张寰)
Characterization and analysis of ribosomal proteins in two marine calanoid copepods
Chinese Journal of Oceanology and Limnology, 34(6): 1258-1268
http://dx.doi.org/10.1007/s00343-016-5129-7

Article History

Received Apr. 20, 2015
accepted for publication Jul. 20, 2015
accepted in principle Aug. 20, 2015
Characterization and analysis of ribosomal proteins in two marine calanoid copepods
YANG Feifei(杨菲菲)1,2, XU Donghui(徐东会)2, ZHUANG Yunyun(庄昀筠)2, HUANG Yousong(黄有松)1, YI Xiaoyan(衣晓燕)1,2, CHEN Hongju(陈洪举)1,2, LIU Guangxing(刘光兴)1,2, ZHANG Huan(张寰)1,2,3        
1 Key Laboratory of Marine Environment and Ecology(Ocean University of China), Ministry of Education, Qingdao 266100, China;
2 College of Environmental Science and Engineering, Ocean University of China, Qingdao 266100, China;
3 Department of Marine Sciences, University of Connecticut, Groton, Connecticut 06340, USA
ABSTRACT: Copepods are among the most abundant and successful metazoans in the marine ecosystem. However, genomic resources related to fundamental cellular processes are still limited in this particular group of crustaceans. Ribosomal proteins are the building blocks of ribosomes, the primary site for protein synthesis. In this study, we characterized and analyzed the cDNAs of cytoplasmic ribosomal proteins (cRPs) of two calanoid copepods, Pseudodiaptomus poplesia and Acartia pacifica. We obtained 79 cRP cDNAs from P. poplesia and 67 from A. pacifica by cDNA library construction/sequencing and rapid amplification of cDNA ends. Analysis of the nucleic acid composition showed that the copepod cRP-encoding genes had higher GC content in the protein-coding regions (CDSs) than in the untranslated regions (UTRs), and single nucleotide repeats (>3 repeats) were common, with "A" repeats being the most frequent, especially in the CDSs. The 3'-UTRs of the cRP genes were significantly longer than the 5'-UTRs. Codon usage analysis showed that the third positions of the codons were dominated by C or G. The deduced amino acid sequences of the cRPs contained high proportions of positively charged residues and had high pI values. This is the first report of a complete set of cRP-encoding genes from copepods. Our results shed light on the characteristics of cRPs in copepods, and provide fundamental data for further studies of protein synthesis in copepods. The copepod cRP information revealed in this study indicates that additional comparisons and analysis should be performed on different taxonomic categories such as orders and families.
Key words: amino acid composition     codon usage     copepod     nucleotide composition     ribosomal protein    
1 INTRODUCTION

Copepods are among the most abundant metazoans in the marine ecosystem (Humes, 1994; Kiørboe, 2011). They are important links in aquatic food webs because they transfer material and energy from primary producers to higher trophic levels (Miller and Wheeler, 2004). Copepods have small body sizes, short life cycles, and diverse life strategies, and they are sensitive to environmental changes. These characteristics make them excellent candidates for studying marine biological processes and for use as indicators of climate change and anthropogenic activities. Despite the large quantity of information obtained from extensive studies of their morphology, physiology, and ecology (Humes, 1994; Rhee et al., 2009; Blanco-Bercial et al., 2011), the molecular machinery of fundamental cellular processes in copepods remains elusive.

Protein biosynthesis is the basis of growth, development, reproduction, and other physiological activities. This process occurs in ribosomes, which are among the most ancient and conserved structures in all living organisms (Harris et al., 2003). The eukaryotic cytoplasmic ribosome typically consists of 79 ribosomal proteins (RPs) and four ribosomal RNAs (rRNAs) (Wilson and Cate, 2012), with 32 RPs distributed on the small subunit (40S) and 47 on the large subunit (60S) of the ribosome (Lecompte et al., 2002). Cytoplasmic RPs (cRPs) carry net positive charges, possess a high proportion of basic residues (e.g., lysine and arginine), and interact with negatively charged rRNA (Ishii et al., 2006; Lott et al., 2013). The cRPs perform diverse functions in ribosome assembly, including bridging between the two subunits, interacting with transfer RNA (tRNA), and enclosing polypeptide exit channels (Lecompte et al., 2002). The cRPs also have "extra-ribosomal" functions in regulating transcription, cell proliferation, and apoptosis (Lindström, 2009; Warner and McIntosh, 2009). Differential expression of cRP-encoding genes is considered a physiological response to environmental stress. Upregulation of cRP genes has been observed in the yeast Saccharomyces cerevisiae under nutrient-replete conditions, while downregulation of cRP genes has been associated with multiple physical and/or chemical stressors (e.g., osmotic stress, hydrogen peroxide) and the stationary phase (Powers and Walter, 1999; Warner, 1999; Causton et al., 2001).

Despite the fundamental and diverse functions of the cRPs, few studies have examined the cRPs of copepods. Barreto and Burton (2013) carried out a systematic comparison of evolutionary rates between cRPs and mitochondrial RPs for two populations of Tigriopus californicus. However, that study focused on phylogenetic aspects, and basic information about the cRPs (e.g., primary structure and characteristics) was not reported. The lack of integrated fundamental data on copepod cRPs limits our understanding of the mechanism of protein synthesis and the potential use of cRPs as indicators of environmental changes. In this study, we obtained the full-length cDNAs of 79 cRP-encoding genes from the copepod Pseudodiaptomus poplesia and 67 from Acartia pacifica, indicating that copepods possess all the typical eukaryotic cRPs. To systematically characterize the copepod cRPs, we analyzed their nucleotide and deduced amino acid compositions, as well as codon usage in the cRP genes. This is the first report of a complete set of cRP full-length cDNAs for a single copepod species. Our study lays the foundation for further studies of RPs, ribosomes, protein synthesis, and other basic molecular mechanisms in copepods.

2 MATERIAL AND METHOD

2.1 Copepod collection and preservation

Copepods were collected using a 500-μm-mesh plankton net from Dingzi Bay, Shandong Province (36.58°N, 120.91°E), China in May 2012. The net contents were transferred into a 10-L plastic bucket with natural seawater from the sampling location and transported immediately to the laboratory, where the harvested copepod species were sorted as reported previously (Zhang et al., 2013). Because P. poplesia (Calanoida, Pseudodiaptomidae) and A. pacifica (Calanoida, Acartiidae) are common copepods found in high abundance in the coastal waters of China, the two species were identified in the laboratory and isolated from the collection using a dissecting microscope. Fifty to one hundred individuals were immersed immediately in 1.5-mL microcentrifuge tubes containing 1 mL TRIzol reagent (Invitrogen, San Diego, CA, USA) and stored at -80℃ until analysis.

2.2 RNA isolation, cDNA library construction, and sequencing

Total RNA was extracted using a Direct-zol™ RNA MiniPrep Kit (Zymo Research, Orange, CA, USA) following Zhang et al. (2013). RNA quality and quantity were measured with a NanoDrop 2000 (Thermo Fisher Scientific, Pittsburgh, PA, USA). First-strand cDNA was synthesized from 0.5 to 1.0 μg of total RNA using modified oligo dT (Table 1) and Improm II reverse transcriptase (Promega, Madison, WI, USA). The resultant first-strand cDNA was then purified using a Zymo DNA Clean and Concentrator Kit (Zymo Research).

Table 1 Primers used in this study

Second-strand cDNA was synthesized using the primer set CopepodSL (forward) and Racer 3 (reverse) (Table 1) and Hot Start Ex Taq polymerase (TaKaRa, Kyoto, Japan) in a 25-μL reaction. The reaction conditions were as follows: 95℃ for 2 min; 10 cycles of 95℃ for 15 sec, 68℃ for 4 min; 20 cycles of 95℃ for 15 sec, 60℃ for 30 sec, 72℃ for 3 min; with an additional extension at 72℃ for 7 min. The final products were purified and cloned into a T-vector as reported previously (Zhang et al., 2007). About 900 clones of each species were picked randomly and sequenced (Shenzhen Huada Company, Shenzhen, China).

2.3 Sequence analysis

The raw sequence reads were checked manually to remove vector and low-quality sequences. Full-length cDNA sequences were obtained by identifying both the forward and reverse primers.

The unique sequences were compared to sequences in the GenBank nonredundant protein sequence database (nr) using BLASTX with an E value <10⁻⁶. The cDNAs were annotated according to the gene description of the most significant hit. Open reading frames (ORFs) were identified using ORF Finder (http://www.ncbi.nlm.nih.gov/gorf/) based on the frame in the aligned sequences in the BLASTX results. The obtained nucleotide sequences were then translated into amino acid sequences. The length and GC content analysis of the nucleotide sequences excluded the CopepodSL sequence and the poly(A) tail.
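The trimming and GC content step described above can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' pipeline: the function names, the `min_tail` threshold, and the `leader` string are all assumptions, and the leader shown is a hypothetical placeholder rather than the real CopepodSL sequence.

```python
import re

def trim_cdna(seq, leader="", min_tail=10):
    """Remove an optional 5' leader and a terminal poly(A) run from a cDNA."""
    seq = seq.upper()
    if leader and seq.startswith(leader.upper()):
        seq = seq[len(leader):]
    # Drop a terminal run of at least `min_tail` adenines (assumed threshold).
    return re.sub(r"A{%d,}$" % min_tail, "", seq)

def gc_content(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    counted = [b for b in seq.upper() if b in "ACGT"]
    if not counted:
        return 0.0
    return sum(b in "GC" for b in counted) / len(counted)

leader = "GGTTCC"  # hypothetical placeholder, NOT the real CopepodSL sequence
cdna = leader + "ATGGCTGCCAAGTAGTTC" + "A" * 15
core = trim_cdna(cdna, leader)
print(len(core), gc_content(core))
```

One caveat of this simple approach is that adenines belonging to the transcript itself are also removed if they sit directly in front of the tail, so lengths near the 3′ end are approximate.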

2.4 PCR amplification of P. poplesia cRP cDNAs missed in the sequenced clones

By querying the Ribosomal Protein Gene Database (http://ribosome.med.miyazaki-u.ac.jp/), we found that nine P. poplesia cRPs (RPSA, RPS2, RPS4, RPS27, RPL3, RPL4, RPL7A, RPL10A, and RPL31) were absent from the corresponding cDNA library. To obtain these cRP cDNAs of P. poplesia, homologs of these cRPs from other copepods or closely related species were retrieved by querying the GenBank nr database. Degenerate gene-specific primers (GSP, Table 1) were designed based on their conserved regions. These primers were used in nested PCR amplifications with the first-strand cDNAs of P. poplesia as templates. For 5′-rapid amplification of cDNA ends (5′-RACE), the first PCR was run using the CopepodSL and GSPR1 primer set, and the second PCR was run using the CopepodSL and GSPR2 primer set. For 3′-RACE, the first PCR was run using GSPF1 with Racer 3, and the second PCR was run using GSPF2 with Racer 3. The PCR reactions were carried out under the following conditions: 95℃ for 1 min; 5 cycles of 95℃ for 15 sec, 50℃ for 30 sec, 72℃ for 30 sec; 25 cycles of 95℃ for 15 sec, 55℃ for 30 sec, 72℃ for 30 sec; and 72℃ for 7 min. The PCR products were purified and sequenced. Then, the cDNA of each cRP gene was assembled manually according to the overlapping regions of the sequences.

2.5 Codon usage bias analysis

As a general practice to minimize sampling error, only genes with complete protein-coding sequences (CDSs) of at least 100 codons are used for codon usage analysis. However, because some cRPs contain fewer than 100 amino acids, we set the length limit to at least 50 amino acids (Grocock and Sharp, 2002). The codon usage bias of the CDSs of the cRP genes from A. pacifica and P. poplesia was analyzed with CodonW version 1.4.4 (http://mobyle.pasteur.fr/cgibin/portal.py?#forms::CodonW).

For each gene, indices of codon bias were calculated as G+C composition in the third position of the codons (GC3s), effective number of codons (Nc) (Wright, 1990), and relative synonymous codon usage (RSCU) (Sharp and Li, 1986). GC3s is the frequency of guanine and cytosine at the third coding position. Nc measures the bias of synonymous codons and ranges from 20 (when each amino acid uses only one codon) to 61 (when all sense codons of all the amino acids are used randomly). An Nc versus GC3s plot was used to discover codon usage differences among the cRP genes in the two copepods (Wright, 1990). The expected Nc value was calculated as follows: $N_c = 2 + s + \frac{29}{s^{2} + (1 - s)^{2}}$, where $s$ is the GC3s value (Peden, 1999). RSCU is the ratio of the observed frequency of a codon to the frequency expected when all codons for the same amino acid are used equally (Sharp and Li, 1986). If the RSCU value is greater than 1.0, the codon is used more often than expected. Significance tests and correlation analyses were performed with the SPSS statistical software (version 17.0, SPSS Inc., USA).
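As a worked illustration of the two quantities that define the Nc plot, the sketch below computes GC3s for a coding sequence and the expected Nc from the formula given above. It is not CodonW itself. The exclusion of Met, Trp, and stop codons from GC3s is an assumption, intended to follow the usual convention of counting only synonymous third positions, and the function names are illustrative.

```python
# Codons with no synonymous alternative (Met, Trp) and the stop codons are
# skipped when computing GC3s (assumed convention, see lead-in).
NON_SYNONYMOUS = {"ATG", "TGG", "TAA", "TAG", "TGA"}

def gc3s(cds):
    """Fraction of G or C at the third position of synonymous codons."""
    cds = cds.upper().replace("U", "T")
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    used = [c for c in codons
            if c not in NON_SYNONYMOUS and set(c) <= set("ACGT")]
    if not used:
        return 0.0
    return sum(c[2] in "GC" for c in used) / len(used)

def expected_nc(s):
    """Expected Nc when codon bias is due to GC3s alone."""
    return 2.0 + s + 29.0 / (s ** 2 + (1.0 - s) ** 2)

toy_cds = "ATGGCCAAGTTTTAA"  # toy sequence, not a real cRP gene
s = gc3s(toy_cds)
print(round(s, 3), round(expected_nc(s), 2))
print(expected_nc(0.5))  # 60.5, the maximum of the curve
```

The curve peaks at s = 0.5 and falls to about 31 to 32 at the extremes, which is why genes with GC3s above 0.7 have an expected Nc well below 61 even without selection.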

3 RESULT

3.1 Identification of cRP genes from two copepods

The unique full-length cDNA sequences were queried against the GenBank nr database with BLASTX, and 70 and 67 cRP cDNAs were identified for P. poplesia and A. pacifica, respectively. The nine cRP cDNAs not found in sequenced clones of P. poplesia were acquired by RACE (Zhang et al., 2007). The complete set of 79 cRP cDNAs from P. poplesia consisted of 32 40S subunit RPs and 47 60S subunit RPs (Fig. 1). The cDNA sequences of cRPs from P. poplesia and A. pacifica were deposited in GenBank under accession numbers KT754746-KT755453 and KT754171-KT754709, respectively. The annotations of each cRP were included in the cDNA library annotation provided in Table S2 of Yang et al. (2015).

Figure 1 Complete set of cytoplasmic ribosomal protein genes (79 in total) in Pseudodiaptomus poplesia. a. P. poplesia cRPs (70) identified from the cDNA library; b. cRPs obtained by PCR (9). *RPL44 is homologous to RPL36A in the Ribosomal Protein Gene Database.

Because cRP genes have fairly high transcription rates, random cloning of cDNAs will always result in a high number of sequences encoding cRPs. In our cDNA library of P. poplesia, 24 clones of RPL21, 20 of RPS23, 16 of RPL27a, 16 of RPL44, 14 of RPS28, 14 of RPL37A, 12 of RPL24, 12 of RPL35, 12 of RPS18, 11 of RPS15A, 11 of RPS21, 11 of RPS30, and 11 of ubiquitin/40S ribosomal protein RPS27A fusion protein were found. We also found that multiple sequences could encode the same ribosomal protein with different 3′-untranslated region (UTR) lengths.

3.2 Characteristics of cRP cDNA sequences from two copepods

We analyzed the size distributions of the full-length cDNAs, the CDSs, and the 5′- and 3′-UTRs. For A. pacifica and P. poplesia, the average lengths of the cRP cDNAs were 621±181 bp and 698±210 bp, CDSs 412±185 bp and 463±196 bp, 5′-UTRs 12±13 bp and 14±15 bp, and 3′-UTRs 186±109 bp and 151±98 bp, respectively (Fig. 2). The cRP cDNAs of both copepods had significantly longer 3′-UTRs than 5′-UTRs (t-test; P<0.01), whereas the lengths of the CDSs from A. pacifica and P. poplesia did not differ significantly (t-test; P>0.05).

Figure 2 Size distributions of regions of the cRP cDNAs of P. poplesia (a) and A. pacifica (b). UTR: untranslated region; CDS: protein-coding region.

The mean GC content in the CDSs was 54.5% for P. poplesia and 55.9% for A. pacifica, while the UTRs were AT-rich and had a mean GC content of 38.7%-43.7% for both P. poplesia and A. pacifica. Single nucleotide repeats (>3 repeats) were common in copepod cRP cDNAs; “A” repeats were most common, followed by “T”, “C”, and “G”. For example, the “A” repeats in RPL27a cDNA occurred up to nine times.
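A scan for single nucleotide repeats of this kind can be sketched with a regular expression. The sketch assumes that ">3 repeats" means a run of four or more identical bases; the function name and the toy sequence are illustrative and are not taken from the study.

```python
import re
from collections import Counter

def homopolymer_runs(seq, min_len=4):
    """Return (base, run_length, start) for every run of >= min_len identical bases."""
    seq = seq.upper().replace("U", "T")
    # ([ACGT]) captures one base; \1{n,} requires n or more further copies of it.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.group(1), m.end() - m.start(), m.start())
            for m in pattern.finditer(seq)]

toy = "ATGAAAAAGCCCCTTTGGGGGA"  # toy sequence, not a real cRP cDNA
runs = homopolymer_runs(toy)
print(runs)  # [('A', 5, 3), ('C', 4, 9), ('G', 5, 16)]
print(Counter(base for base, _, _ in runs))
```

Tallying the runs per base across all cDNAs, separately for CDSs and UTRs, would give the kind of ranking reported above.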

For P. poplesia cRP cDNAs, the upstream sequence flanking the putative start codon AUG was investigated for Kozak-like sequences. This consensus sequence plays an important role in the initiation of the translation process. However, we found a different consensus sequence, AAAAUGGCU (Fig. 3), in the copepod cRP cDNAs rather than the typical ACCAUGGCG found in many other eukaryotes (Kozak, 1986).

Figure 3 Sequence logo showing the probability of nucleotides flanking the start codon AUG in RP cDNAs of P. poplesia
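The consensus around the start codon can be derived by aligning a fixed window on each AUG and taking the most frequent base per column. The sketch below is one simple way to do this and is not the procedure used to draw the sequence logo; the window size, function names, and toy transcripts are assumptions. Because the 5′-UTRs are short, transcripts without enough upstream sequence are skipped.

```python
from collections import Counter

def start_context(cdna, cds_start, upstream=3, downstream=3):
    """Window around the start codon, or None if the cDNA is too short on either side."""
    lo, hi = cds_start - upstream, cds_start + 3 + downstream
    if lo < 0 or hi > len(cdna):
        return None
    return cdna[lo:hi].upper().replace("T", "U")

def consensus(windows):
    """Most frequent base at each aligned position (windows must share one length)."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*windows))

# Toy transcripts as (sequence, 0-based index of the A in ATG); illustrative only.
toy = [("CCAAAATGGCTGG", 5), ("GACAATGGCTAA", 4), ("TTAAAATGGCACC", 5), ("ATGGCT", 0)]
windows = [w for w in (start_context(seq, pos) for seq, pos in toy) if w]
print(windows)
print(consensus(windows))
```

A full position frequency table, rather than only the top base per column, would be needed to reproduce the information content shown in a logo.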

Only about half of the copepod cRP cDNAs contained the consensus polyadenylation signal AAUAAA/AUUAAA, the binding site recognized by the RNA cleavage complex. Similar to the mRNAs of other eukaryotes (Liu et al., 2007), in most cRP transcripts with a polyadenylation signal, poly (A) was added 5-29 nucleotides downstream of the binding site.
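A search for the polyadenylation signal and its distance from the poly(A) addition site might look like the following sketch. It assumes that the input is the 3′-UTR with the poly(A) tail already removed and that distance is counted as the number of nucleotides between the end of the hexamer and the end of the UTR; both conventions, and the function name, are assumptions rather than details given in the study.

```python
import re

# AATAAA or ATTAAA in DNA letters (AAUAAA / AUUAAA in the mRNA).
SIGNAL = re.compile(r"A[AT]TAAA")

def polya_signal_distance(utr3):
    """Return (hexamer, nucleotides between hexamer and UTR end), or None if absent."""
    utr3 = utr3.upper().replace("U", "T")
    matches = list(SIGNAL.finditer(utr3))
    if not matches:
        return None
    last = matches[-1]  # the signal closest to the 3' end
    return last.group(), len(utr3) - last.end()

toy_utr = "GCTTGAATAAAGCTTGCATGCTAGC"  # toy 3'-UTR without its poly(A) tail
print(polya_signal_distance(toy_utr))      # ('AATAAA', 14)
print(polya_signal_distance("GCGCGCGC"))   # None
```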

3.3 Analysis of the deduced amino acid sequences of the cRPs

We calculated the percentage of each amino acid across all deduced cRP sequences, as well as the physicochemical properties of the deduced proteins. No significant difference was detected in the amino acid composition between P. poplesia and A. pacifica (t-test; P>0.05). As shown in Fig. 4, the positive residues arginine (Arg) and lysine (Lys) made up the largest proportions of the amino acids in the cRP sequences, at about 9% and 13%, respectively, followed by the aliphatic amino acids glycine (Gly) and alanine (Ala), at about 7% and 8%, respectively. The frequencies of tryptophan (Trp) and cysteine (Cys) were the lowest among all the amino acids for both P. poplesia and A. pacifica. Clearly, the positively charged residues (Arg+Lys) were much more abundant than the negatively charged residues (glutamic acid and aspartic acid, Glu+Asp) in cRPs of these two copepods. As shown in Fig. 5, the percentage of basic (positively charged) residues increased with declining protein size (Spearman's rank correlation, P<0.05); in other words, smaller proteins tended to have higher percentages of basic residues. In addition, we identified a positive correlation between net protein charge and protein size (Spearman's rank correlation, P<0.05), as well as a positive correlation between acidic charged residues and protein size (Spearman's rank correlation, P<0.05).

Figure 4 Percentage of amino acids in deduced cRP sequences from P. poplesia and A. pacifica
Figure 5 Relationship between percentage of basic amino acid residues and length of the cRP amino acid sequences from P. poplesia and A. pacifica
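The relationship between basic residue content and protein length can be illustrated with a small pure-Python Spearman calculation. This is a sketch for clarity only: the study used SPSS, the toy proteins below are invented, and the helper names are assumptions. Only the correlation coefficient is computed, not its P value.

```python
def basic_fraction(protein):
    """Fraction of Lys (K) and Arg (R) residues in a protein sequence."""
    protein = protein.upper()
    return sum(aa in "KR" for aa in protein) / len(protein)

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

toy_proteins = ["MKRKRA", "MKRAAAGGLL", "MKAAAGGLLVVDDEE"]  # invented sequences
lengths = [len(p) for p in toy_proteins]
basic = [basic_fraction(p) for p in toy_proteins]
print(round(spearman_rho(lengths, basic), 3))
```

A negative coefficient corresponds to the pattern described above, in which shorter proteins carry a higher percentage of basic residues.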

The isoelectric point (pI) values of the cRPs were mostly within the 10-12 range for both P. poplesia and A. pacifica (Fig. 6). Similar to human and Drosophila melanogaster (Marygold et al., 2007), there were very few acidic cRPs in the two copepods. Only four of the cRPs (RPS12, RPLP0, RPLP1, and RPLP2) in P. poplesia had a predicted pI of less than 7. RPLP1 and RPLP2 were more acidic (pI<5) than RPLP0 (pI≈6). Orthologous proteins between P. poplesia and A. pacifica had very similar pI values, with differences of less than 0.5 in most cases. The exception was RPS12, which was an acidic protein only in P. poplesia (pI=5.78); in A. pacifica, the pI was 8.07. A significant positive correlation was found between the percentage of basic amino acids (Arg and Lys) and the corresponding pI for each cRP (Spearman's rank correlation, P<0.01). The computed pI values will be a valuable reference for building buffer systems for purification of cRPs using isoelectric focusing.

Figure 6 Computed pI values for cRPs from P. poplesia and A. pacifica
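To show why proteins rich in Arg and Lys have high computed pI values, the sketch below estimates pI by bisection on the net charge from the Henderson-Hasselbalch equation. The pKa values are approximate figures chosen for illustration and are an assumption; the study does not state which tool or pKa set was used, so results from this sketch should not be expected to match the reported values.

```python
# Approximate pKa values for ionizable groups (illustrative assumption).
POSITIVE = {"nterm": 8.6, "K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE = {"cterm": 3.6, "D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}

def net_charge(protein, ph):
    """Net charge of a protein at a given pH (Henderson-Hasselbalch)."""
    counts = {aa: protein.count(aa) for aa in "KRHDECY"}
    counts["nterm"] = counts["cterm"] = 1
    pos = sum(counts[g] / (1 + 10 ** (ph - pka)) for g, pka in POSITIVE.items())
    neg = sum(counts[g] / (1 + 10 ** (pka - ph)) for g, pka in NEGATIVE.items())
    return pos - neg

def isoelectric_point(protein, tol=1e-4):
    """pH in [0, 14] at which the net charge crosses zero, found by bisection."""
    protein = protein.upper()
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if net_charge(protein, mid) > 0:
            lo = mid  # still positively charged, so the pI lies above mid
        else:
            hi = mid
    return (lo + hi) / 2

for toy in ("MKRKRAKGR", "MDEEDLAEDG", "MAGLSAGLK"):  # invented peptides
    print(toy, round(isoelectric_point(toy), 2))
```

Bisection works because the net charge decreases monotonically with pH. For sequences with very many Arg residues the charge can remain positive up to pH 14, in which case this sketch simply returns a value close to the upper bound.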
3.4 Overall synonymous codon usage

We used the complete set of cRP full-length cDNAs for P. poplesia and the almost complete set of cRP full-length cDNAs for A. pacifica to analyze codon usage in the common cRP-encoding genes of these two copepods.

The overall codon usage in the cRP cDNAs is shown in Table 2. AUG was the start codon for all the cRP genes, while UAA was the most frequently used stop codon, followed by UAG. The CDSs of the cRPs had higher GC content than the UTRs, indicating predominant usage of C or G at the end of the degenerate codons. As shown in Table 2, P. poplesia and A. pacifica had very similar preferences in using degenerate codons, except for the codons for Val and Gly. The overwhelming majority of the preferentially used degenerate codons ended with C or G, as expected. Among the Arg codons, AGG was about twice as frequent as CGC, while among the Pro codons, CCC was used about eight times more often than CCG. These data indicate that the usage of all the codons cannot be explained only by simple or nearest-neighbor dependent mutational bias (Bulmer, 1990).

Table 2 Codon usage of the cRP cDNAs from P. poplesia and A. pacifica
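The RSCU values behind a codon usage table of this kind can be computed as in the sketch below, which pools codon counts over a set of CDSs and scales each count by the number of synonymous codons in its family. It uses the standard genetic code and is an illustration under that assumption, not a reimplementation of CodonW; the function name and toy input are invented.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def rscu(cds_list):
    """RSCU per codon, pooled over all CDSs; single-codon families are skipped."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)
    values = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        if total == 0 or len(codons) == 1:
            continue
        for c in codons:
            # observed count divided by the count expected under equal usage
            values[c] = counts[c] * len(codons) / total
    return values

toy = ["CCC" * 8 + "CCG"]  # toy input: eight CCC codons and one CCG codon
values = rscu(toy)
print(round(values["CCC"], 3), round(values["CCG"], 3))
```

In this toy input CCC is used eight times as often as CCG, so their RSCU values differ by the same factor, which mirrors how the Pro codon comparison above can be read directly from RSCU ratios.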

Clear variation in codon usage across all the cRP cDNAs was detected by the Nc and GC3s values for P. poplesia and A. pacifica (Table 3). The analysis showed that the GC3 content of the CDSs (over 70%) was significantly higher than the overall GC content of the CDS (about 55%) and UTR regions (about 40%; Spearman's rank correlation, P<0.0001). A significant negative relationship existed between the length of the CDSs and their GC3 content (Spearman's rank correlation, P<0.0001).

Table 3 Overall GC content and Nc values of cRP cDNAs from P. poplesia and A. pacifica

Wright (1990) suggested that an Nc versus GC3s plot could be used effectively to study the codon usage differences among genes in the same genome. The Nc value of each cRP gene identified in this study was plotted against the corresponding GC3s, as shown in Fig. 7. If the synonymous codon bias of a gene is influenced only by G+C composition, the dot should lie on the expected curve, indicating random codon usage. However, if the dot lies substantially below the expected curve, the gene may also have been subjected to translational selection (Xu et al., 2013). All the cRP genes of P. poplesia and most of the cRP genes of A. pacifica clustered at the low end of the plot and lay well below the expected curve, indicating a strong codon bias resulting from selection for translational efficiency (Sur et al., 2008).

Figure 7 Nc versus GC3 plot for the cRP cDNA sequences from P. poplesia and A. pacifica. The dark solid curve shows the expected Nc values if bias was due to GC3s alone.
4 DISCUSSION

RPs play crucial roles in protein synthesis and have extra-ribosomal functions to maintain cellular homeostasis. We obtained 79 and 67 cRP full-length cDNAs of the copepods, P. poplesia and A. pacifica, respectively.

Amino acid composition analysis showed that the cRPs of the two species contained a high proportion of positively charged residues (mainly Arg and Lys). This result is consistent with previous reports on the cRPs of prokaryotes and mammals (Ishii et al., 2006; Burton et al., 2012; Lott et al., 2013). A remarkably high frequency of Arg and Lys usage seems to be an exclusive feature of cRPs (Burton et al., 2012; Lott et al., 2013). When cRPs interact with rRNA, charge and shape complementarity is more important than sequence-specific interactions (Brodersen et al., 2002). The positively charged residues are crucial in the ribosome assembly process because of their ability to interact with negatively charged rRNA (Lott et al., 2013). A negative correlation was found between positive charges and protein length in the cRPs from A. pacifica and P. poplesia, meaning that smaller proteins tended to have higher proportions of positive charges. Similar correlations have been found in small subunit (30S) RPs from 560 bacterial species (Lott et al., 2013), and this was attributed to high Lys content rather than Arg content. However, in our study, a significant correlation was found between Lys content (rather than Arg content) and protein length in large subunit (60S) RPs, but not in small subunit (40S) RPs. This result indicated that bias in amino acid usage existed in the copepod cRPs, suggesting that chemically equivalent amino acids may have differential functions in the ribosome assembly process. Further study of the chemical characteristics and structures of cRPs will help in understanding amino acid bias among cRPs and may shed light on the differences between prokaryotic and eukaryotic cRPs.

Codon usage analysis is very important for understanding fundamental molecular biology processes (Gupta et al., 2004). Usage bias of synonymous codons varies within genomes and among species (Peden, 1999). The mechanisms involved in this characteristic are still not well understood, but codon usage bias can reveal information about levels of gene expression and molecular evolution processes of species, and is useful in designing species-specific probes or primers (Sørensen and Mortensen, 2005; Heitzer et al., 2007; RoyChoudhury and Mukherjee, 2010). The codon usage bias of cRPs may represent the codon usage pattern of highly expressed genes (Rispe et al., 2007; Kober and Pogson, 2013), which have important effects on the ability of cells to adapt to different environments (Kudla et al., 2009; Botzman and Margalit, 2011). The pattern of codon usage in cRPs from copepods will add new information on the molecular genetic make-up of this important lineage. The codon adaptation index (CAI) could be used to indicate the expression levels of genes in copepods if the cRPs can be used as references (Ikemura, 1981; Sharp and Li, 1987). On the practical side, the information gained from codon usage would aid in the design of gene-specific primers for certain species (Verdoes and van Ooyen, 2000).

The codon usage was found to be very similar between P. poplesia and A. pacifica, two copepods from the same order (Calanoida). With the exception of Val and Gly, P. poplesia and A. pacifica used the same preferred codons for the common cRPs. The overwhelming majority of the preferred codons in the two species had either C or G at the third codon position, consistent with the remarkably high GC3 content in the CDS regions. Some variations were noticed in the RSCU values; for example, among the Ser codons for the cRPs of P. poplesia, UCC was about 10.7 times as frequent as UCG. This suggested that simple mutational bias could not explain the usage of all codons (Bulmer, 1990). The calculated Nc values ranged from about 25 to 56, suggesting that the two copepods exhibited substantial heterogeneity in codon usage. Most of the cRPs clustered at the low end of the Nc versus GC3s plot, which is in agreement with the results reported in Escherichia coli, Xanthomonas, and Azotobacter vinelandii (Wu et al., 2005; Sen et al., 2007; Sur et al., 2008). Thus, our results provide additional evidence that translational selection was a dominant factor over compositional constraints in shaping codon usage bias in metazoans (Cutter et al., 2003).

Our in silico analysis of a complete set of cRP full-length cDNAs provides solid evidence that the general characteristics of copepod cRPs are in agreement with those from other organisms. The sequences obtained in this study could provide basic information for further studies of copepod cRPs and ribosomes, and the interaction of cRPs with rRNA and other macromolecules. On the practical side, the codon usage bias of cRPs can be used to detect highly expressed genes and to design specific primers for copepods.

5 CONCLUSION

In this study, we identified and characterized the complete set of full-length cRP cDNAs in P. poplesia, and 67 cRP cDNAs from A. pacifica (Calanoida). These data provide a new resource to investigate nucleotide and amino acid features and codon usage bias in cRPs, and to examine phylogenetic relationships among copepods. We found that the cRPs of copepods had significantly high levels of positively charged residues, with Arg and Lys making up the largest proportions of these residues. This result indicates the strong impact of electrostatic interactions in the assembly mechanism of ribosomes. The cRP genes of copepods exhibit strong codon usage bias, which could result from factors such as base composition and translational selection. Our study provides a solid foundation for further studies of RPs, ribosomes, protein synthesis, phylogenetic analysis, and gene expression in copepods. In addition, the methods we used to construct cDNA libraries and to obtain the cRP cDNAs absent from the P. poplesia cDNA library provide a good tool for acquiring target genes for copepod transcriptomic studies.

References
Barreto F S, Burton R S, 2013. Evidence for compensatory evolution of ribosomal proteins in response to rapid divergence of mitochondrial rRNA. Mol. Biol. Evol., 30 (2) : 310 –314. Doi: 10.1093/molbev/mss228
Blanco-Bercial L, Bradford-Grieve J, Bucklin A, 2011. Molecular phylogeny of the Calanoida (Crustacea:Copepoda). Mol. Phylogenet. Evol., 59 (1) : 103 –113. Doi: 10.1016/j.ympev.2011.01.008
Botzman M, Margalit H, 2011. Variation in global codon usage bias among prokaryotic organisms is associated with their lifestyles. Genome Biol., 12 (10) : R109 . Doi: 10.1186/gb-2011-12-10-r109
Brodersen D E, Clemons Jr W M, Carter A P, Wimberly B T, Ramakrishnan V, 2002. Crystal structure of the 30S ribosomal subunit from Thermus thermophilus:structure of the proteins and their interactions with 16S RNA. J.Mol. Biol., 316 (3) : 725 –768. Doi: 10.1006/jmbi.2001.5359
Bulmer M, 1990. The effect of context on synonymous codon usage in genes with low codon usage bias. Nucl. Acids Res., 18 (10) : 2 869 –2 873. Doi: 10.1093/nar/18.10.2869
Burton B, Zimmermann M T, Jernigan R L, Wang Y M, 2012. A computational investigation on the connection between dynamics properties of ribosomal proteins and ribosome assembly. PLoS Comput. Biol., 8 (5) : e1002530 . Doi: 10.1371/journal.pcbi.1002530
Causton H C, Ren B, Koh S S, Harbison C T, Kanin E, Jennings E G, Lee T I, True H L, Lander E S, Young R A, 2001. Remodeling of yeast genome expression in response to environmental changes. Mol. Biol. Cell, 12 (2) : 323 –337. Doi: 10.1091/mbc.12.2.323
Cutter A D, Payseur B A, Salcedo T, Estes A M, Good J M, Wood E, Hartl T, Maughan H, Strempel J, Wang B M, Bryan A C, Dellos M, 2003. Molecular correlates of genes exhibiting RNAi phenotypes in Caenorhabditis elegans. Genome Res., 13 (12) : 2 651 –2 657. Doi: 10.1101/gr.1659203
Grocock R J, Sharp P M, 2002. Synonymous codon usage in Pseudomonas aeruginosa PA01. Gene, 289 (1-2) : 131 –139. Doi: 10.1016/S0378-1119(02)00503-6
Gupta S K, Bhattacharyya T K, Ghosh T C, 2004. Synonymous codon usage in Lactococcus lactis:mutational bias versus translational selection. J. Biomol. Struct. Dyn., 21 (4) : 527 –535. Doi: 10.1080/07391102.2004.10506946
Harris J K, Kelley S T, Spiegelman G B, Pace N R, 2003. The genetic core of the universal ancestor. Genome Res., 13 (3) : 407 –412. Doi: 10.1101/gr.652803
Heitzer M, Eckert A, Fuhrmann M, Griesbeck C, 2007. Influence of codon bias on the expression of foreign genes in microalgae. Adv. Exp. Med. Biol., 616 : 46 –53. Doi: 10.1007/978-0-387-75532-8
Humes A G, 1994. How many copepods?. Hydrobiologia, 292 (1) : 1 –7.
Ikemura T, 1981. Correlation between abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes:a proposal for a synonymous codon choice that is optimal for the E. coli translational system. J. Mol. Biol., 151 (3) : 389 –409. Doi: 10.1016/0022-2836(81)90003-6
Ishii K, Washio T, Uechi T, Yoshihama M, Kenmochi N, Tomita M, 2006. Characteristics and clustering of human ribosomal protein genes. BMC Genomics, 7 (1) : 37 . Doi: 10.1186/1471-2164-7-37
Kiørboe T, 2011. What makes pelagic copepods so successful?J. Plankton Res., 33 (5) : 677 –685. Doi: 10.1093/plankt/fbq159
Kober K M, Pogson G H, 2013. Genome-wide patterns of codon bias are shaped by natural selection in the purple sea urchin, Strongylocentrotus purpuratus. G3, 3 (7) : 1 069 –1 083. Doi: 10.1534/g3.113.005769
Kozak M, 1986. Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes. Cell, 44 (2) : 283 –292. Doi: 10.1016/0092-8674(86)90762-2
Kudla G, Murray A W, Tollervey D, Plotkin J B, 2009. Codingsequence determinants of gene expression in Escherichia coli. Science, 324 (5924) : 255 –258. Doi: 10.1126/science.1170160
Lecompte O, Ripp R, Thierry J C, Moras D, Poch O, 2002. Comparative analysis of ribosomal proteins in complete genomes:an example of reductive evolution at the domain scale. Nucleic Acids Res., 30 (24) : 5 382 –5 390. Doi: 10.1093/nar/gkf693
Lindström M S, 2009. Emerging functions of ribosomal proteins in gene-specific transcription and translation. Biochem. Bioph. Res. Co., 379 (2) : 167 –170. Doi: 10.1016/j.bbrc.2008.12.083
Liu D L, Brockman J M, Dass B, Hutchins L N, Singh P, McCarrey J R, MacDonald C C, Graber J H, 2007. Systematic variation in mRNA 3'-processing signals during mouse spermatogenesis. Nucleic Acids Res., 35 (1) : 234 –246.
Lott B B, Wang Y M, Nakazato T, 2013. A comparative study of ribosomal proteins: linkage between amino acid distribution and ribosomal assembly. BMC Biophys., 6 (1) : 13 . Doi: 10.1186/2046-1682-6-13
Marygold S J, Roote J, Reuter G, Lambertsson A, Ashburner M, Millburn G H, Harrison P M, Yu Z, Kenmochi N, Kaufman T C, Leevers S J, Cook K R, 2007. The ribosomal protein genes and Minute loci of Drosophila melanogaster. Genome Biol., 8 (10) : R216 . Doi: 10.1186/gb-2007-8-10-r216
Miller C B, Wheeler P, 2004. Biological Oceanography. Blackwell Publishing, Oxford, UK. p.111-128.
Peden J F, 1999. Analysis of Codon Usage. University of Nottingham, UK.
Powers T, Walter P, 1999. Regulation of ribosome biogenesis by the rapamycin-sensitive TOR-signaling pathway in Saccharomyces cerevisiae. Mol. Biol. Cell, 10 (4) : 987 –1 000. Doi: 10.1091/mbc.10.4.987
Rhee J S, Raisuddin S, Lee K W, Seo J S, Ki J S, Kim I C, Park H G, Lee J S, 2009. Heat shock protein (Hsp) gene responses of the intertidal copepod Tigriopus japonicus to environmental toxicants. Comp. Biochem. Phys. C., 149 (1) : 104 –112.
Rispe C, Legeai F, Gauthier J P, Tagu D, 2007. Strong heterogeneity in nucleotidic composition and codon bias in the pea aphid (Acyrthosiphon pisum) shown by EST-based coding genome reconstruction. J. Mol. Evol., 65 (4) : 413 –424. Doi: 10.1007/s00239-007-9023-y
RoyChoudhury S, Mukherjee D, 2010. A detailed comparative analysis on the overall codon usage pattern in herpesviruses. Virus Res., 148 (1-2) : 31 –43. Doi: 10.1016/j.virusres.2009.11.018
Sen G, Sur S, Bose D, Mondal U, Furnholm T, Bothra A, Tisa L, Sen A, 2007. Analysis of codon usage patterns and predicted highly expressed genes for six phytopathogenic Xanthomonas genomes shows a high degree of conservation. In Silico Biol., 7 (4-5) : 547 –558.
Sharp P M, Li W H, 1986. An evolutionary perspective on synonymous codon usage in unicellular organisms. J. Mol. Evol., 24 (1-2) : 28 –38. Doi: 10.1007/BF02099948
Sharp P M, Li W H, 1987. The codon adaptation index-a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res., 15 (3) : 1 281 –1 295. Doi: 10.1093/nar/15.3.1281
Sørensen H P, Mortensen K K, 2005. Advanced genetic strategies for recombinant protein expression in Escherichia coli. J. Biotechnol., 115 (2) : 113 –128. Doi: 10.1016/j.jbiotec.2004.08.004
Sur S, Bhattacharya M, Bothra A K, Tisa L S, Sen A, 2008. Bioinformatic analysis of codon usage patterns in a free living diazotroph, Azotobacter vinelandii. Biotechnology, 7 (2) : 242 –249. Doi: 10.3923/biotech.2008.242.249
Verdoes J C, van Ooyen A J J, 2000. Codon usage in Xanthophyllomyces dendrorhous (formerly Phaffia rhodozyma). Biotechnol. Lett., 22 (1) : 9 –13. Doi: 10.1023/A:1005695101331
Warner J R, McIntosh K B, 2009. How common are extraribosomal functions of ribosomal proteins? Mol. Cell, 34 (1) : 3 –11.
Warner J R, 1999. The economics of ribosome biosynthesis in yeast. Trends Biochem. Sci., 24 (11) : 437 –440. Doi: 10.1016/S0968-0004(99)01460-7
Wilson D N, Cate J H D, 2012. The structure and function of the eukaryotic ribosome. Cold Spring Harb. Perspect. Biol., 4 (5) : a011536 .
Wright F, 1990. The 'effective number of codons' used in a gene. Gene, 87 (1) : 23 –29. Doi: 10.1016/0378-1119(90)90491-9
Wu G, Culley D E, Zhang W W, 2005. Predicted highly expressed genes in the genomes of Streptomyces coelicolor and Streptomyces avermitilis and the implications for their metabolism. Microbiology, 151 (7) : 2 175 –2 187. Doi: 10.1099/mic.0.27833-0
Xu C, Dong J, Tong C F, Gong X D, Wen Q, Zhuge Q, 2013. Analysis of synonymous codon usage patterns in seven different citrus species. Evol. Bioinform., 9 : 215 –228.
Yang F F, Xu D H, Zhuang Y Y, Yi X Y, Huang Y S, Chen H J, Lin S J, Campbell D A, Sturm N R, Liu G X, Zhang H, 2015. Spliced leader RNA trans-splicing discovered in copepods. Sci. Rep.-UK, 5 : 17 411 . Doi: 10.1038/srep17411
Zhang H, Finiguerra M, Dam H G, Huang Y S, Xu D H, Liu G X, Lin S J, 2013. An improved method for achieving high-quality RNA for copepod transcriptomic studies. J. Exp. Mar. Biol. Ecol., 446 : 57 –66. Doi: 10.1016/j.jembe.2013.04.021
Zhang H, Hou Y B, Miranda L, Campbell D A, Sturm N R, Gaasterland T, Lin S J, 2007. Spliced leader RNA trans-splicing in dinoflagellates. Proc. Natl. Acad. Sci. USA, 104 (11) : 4 618 –4 623. Doi: 10.1073/pnas.0700258104