Chinese Journal of Oceanology and Limnology   2018, Vol. 36 issue(5): 1731-1745     PDF
Institute of Oceanology, Chinese Academy of Sciences

Article Information

YU Haiyang(俞海洋), DU Xinxin(杜昕昕), LI Xiaojing(李晓静), QU Jiangbo(曲江波), ZHU He(朱鹤), ZHANG Quanqi(张全启), WANG Xubo(王旭波)
Genome-wide identification and transcriptome-based expression analysis of sox gene family in the Japanese flounder Paralichthys olivaceus
Chinese Journal of Oceanology and Limnology, 36(5): 1731-1745

Article History

Received Jul. 13, 2017
accepted in principle Aug. 17, 2017
accepted for publication Sep. 11, 2017
Genome-wide identification and transcriptome-based expression analysis of sox gene family in the Japanese flounder Paralichthys olivaceus
YU Haiyang(俞海洋), DU Xinxin(杜昕昕), LI Xiaojing(李晓静), QU Jiangbo(曲江波), ZHU He(朱鹤), ZHANG Quanqi(张全启), WANG Xubo(王旭波)     
Key Laboratory of Marine Genetics and Breeding, Ministry of Education, College of Marine Life Sciences, Ocean University of China, Qingdao 266000, China
Abstract: Sox genes are transcription factors that ubiquitously exist in animal species, and share a conserved high mobility group (HMG) box. They play important biological roles in diverse developmental processes, such as sex determination and differentiation, chondrogenesis, neurogenesis, and early embryonic development. In this study, we identified 25 sox genes from genome and transcriptome of Japanese flounder Paralichthys olivaceus. These sox genes could be mainly classified into seven subfamilies (B1, B2, C, D, E, F, and K), and each subfamily exhibited a relatively conserved gene structure. Besides, subfamilies A and G were found exclusively in human and mouse, and sox32 in subfamily K only existed in teleosts. Compared with other mammals, some sox genes in teleosts had two duplicates. The loss, duplication, and divergence of sox genes during evolution provided an evidence for whole-genome duplication that occurred in the radiation of teleosts. The expression of Japanese flounder sox genes was also analyzed by FPKM value. Our results showed that certain sox genes exhibited obviously tissue-specific and spatio-temproal expression. Especially, gonal-basied expression analysis uncovered that sox7 and sox2 were ovary-biased, and sox8b was testis-biased. Moreover, sox10a was expressed specifically in ovary, and sox8a in testis. Therefore this study provide a solid foundation for future functional and evolutionary analysis of sox genes in Japanese flounder.
Keywords: sox     Japanese flounder     gene structure     gene expression    

Sox genes, characterized by the conserved high mobility group (HMG) box, encode a class of transcription factors with high sequence similarity to sex determining region of Y chromosome (SRY) gene in animals. Since 1990, when the first Sry gene was identified in human and mouse, more than 100 sox genes have been discovered in mammals, birds, reptiles, fishes, and insects (Sinclair et al., 1990; Gao et al., 2016). According to the conservatism of protein and nucleic acid sequences, sox genes are named as sox1 to sox32, and classified as 12 subfamilies (A, B1, B2, C, D, E, F, G, H, I, J, and K).

With the development of genome-wide sequencing, an increasing number of sox genes were identified from different species, for instance, five in nematode (Caenorhabditis elegans) (C. elegans Sequencing Consortium, 1998), seven in calcareous sponge (Sycon ciliatum) (Fortunato et al., 2012), eight in fruitfly (Drosophila melanogaster) (Crémazy et al., 2001) and African malaria mosquito (Anopheles gambiae) (Wilson and Dearden, 2008), nine in honey bee (Apis mellifera), red flour beetle (Tribolium castaneum) and jewel wasp (Nasonia vitripennis) (Wilson and Dearden, 2008), 14 in starlet sea anemone (Nematostella vectensis) (Magie et al., 2005), 19 in Japanese medaka (Oryzias latipes) (Cui et al., 2011), 20 in mouse (Mus musculus) and human (Homo sapiens) (Schepers et al., 2002), 23 in tongue sole (Cynoglossus semilaevis) (Gao et al., 2016), 24 in torafugu (Takifugu rubripes) (Koopman et al., 2004), and 27 in Nile tilapia (Oreochromis niloticus) (Wei et al., 2016) (Table 1). Genome-wide comparison has provided valuable information to understand the function and evolution of sox genes. With the advancing knowledge on sox gene family, researchers have discovered the essential function of sox genes as important transcription factors in the regulation of diverse growth and development processes. For instance, sox gene knockout and mutation have revealed that the function of sox genes in many aspects, including chondrogenesis (Ng et al., 1997), neurogenesis (Wegner, 2011), early embryonic development (Kikuchi et al., 2001), hematopoiesis (Chung et al., 2010), stemness (Tanimura et al., 2013), angiogenesis (Pennisi et al., 2000; Downes and Koopman, 2001; Matsui et al., 2006), cardiogenesis (Zhang et al., 2005), hair development (Irrthum et al., 2003), and sex determination and differentiation (Foster et al., 1994; Vidal et al., 2001). Specifically, sox9 has been confirmed to be an essential factor for the proper proliferation and survival of medaka germ cells (Nakamura et al., 2012). Sox7 and sox18 play redundant roles in vascular development and arteriovenous specification in zebrafish (Cermenati et al., 2008; Herpers et al., 2008). Sox17 has unique effects on primitive erythropoiesis in zebrafish (Chung et al., 2010).

Table 1 Members of sox gene in different species

Japanese flounder (Paralichthys olivaceus) is a valuable economic teleost in China, Japan, and Korea. Recent developments in genome and transcriptome sequencing techniques provide a better approach to identify Japanese flounder sox genes at a genome scale level. In the present study, we used the genome and transcriptome sequencing to identify sox genes in this species. Gene structure and expression profiles of sox genes were also analyzed. Our results provided a better way to further investigate the evolution and function of sox gene family in Japanese flounder.

2 MATERIAL AND METHOD 2.1 Ethics statement

The Japanese flounder individuals used in this study were obtained from Yellow Sea Aquatic Product Co. Ltd., Yantai, Shandong, China. All experimental procedures and investigations were supervised and approved by Ocean University of China, and were performed in accordance with the guidelines of China Government Principles for the Utilization and Care of Vertebrate Animals Used in Testing, Research, and Training (State science and technology commission of the People's Republic of China for No. 2, November 14, 1988).

2.2 Fish and sample collection

Japanese flounder individuals (three females and three males) of one year old were chosen from a larger cohort population. The fish were anesthetized with MS-222 (30 mg/mL) and killed by severing the spinal cord. Tissue samples, including heart, liver, spleen, kidney, brain, gill, muscle, intestines, stomach, testis, and ovary, were collected respectively frozen immediately in liquid nitrogen, and then stored in -80℃ for RNA extraction.

Embryos of different development stages were collected in the breeding season (late April). In order to facilitate sampling, we divided the embryo development process of Japanese flounder into six stages, namely, Stage 1 (from two cells to morula), Stage 2 (from early gastrula to late somites), Stage 3 (from hatching to 2 d after hatching), Stage 4 (before metamorphosis), Stage 5 (metamorphosis stage 1 to 2), and Stage 6 (metamorphosis stage 3 to 5). Samples of different stages were collected respectively and stored in -80℃ for RNA extraction.

2.3 RNA extraction and illumina sequencing

Total RNA was extracted from tissue and development stage samples using Trizol reagent (Invitrogen, Carlsbad CA, USA) according to the manufacturer's protocol. DNA contamination was removed by DNaseI (TaKaRa, Dalian, China). The high-quality RNA from each sample was used to construct the llumina sequencing libraries by Illumina TruSeq mRNA Stranded Sample Preparation Kit (Illumina, San Diego CA, USA) according to the manufacturer's protocol and a previous study (Zhang et al., 2016). All constructed cDNA library were quenched by Beijing Genomics Institute (BGI, Shenzhen, China). Base on the acquired sequence reads, the fragments per kilobase of exon model per million mapped reads (FPKM) was calculated to measure gene expression levels (Garber et al., 2011; Trapnell et al., 2012) according to the formula as follows (Weitschek et al., 2015):

where Nf is the count of mapped fragments, F is the total number of the mapped fragments, and L is the length (base pairs) of all exons of sox gene.

2.4 Identification of the Sox genes

In order to perform the analysis in a more comprehensive way, two procedures were used to identify the candidate sox genes in Japanese flounder genome and transcriptome. Sox protein sequences of human (Homo sapiens), mouse (Mus musculus), zebrafish (Danio rerio), fugu (Takifugu rubripes), and tilapia (Oreochromis niloticus) were retrieved from Ensemble ( and NCBI ( databases. All these Sox protein-coding sequences were queried against the Japanese flounder genome (Zhang, 2016, unpublished data) and transcriptome (Zhang, 2016, unpublished data) by local TBlastx search with the threshold level of E-value of 1e-5 (Altschul et al., 1997). Meanwhile, the conserved HMG box sequence of vertebrate Sox proteins (DHVKRPMNAFMVWS-RGERRKIAQQNPDMHNSEISKRLGKRWKLLS-ESEKRPFIEEAERLRAQHMKDYPDYKYRPRR-KKK) (Wang et al., 2002) was used as a query sequence in local TBlastn search against the Japanese flounder genome and transcriptome with the threshold level of E-value of 1e-5. The results of these two procedures were integrated together, and the Blast search ( and phylogenetic analysis were used to assess the accuracy of the candidate sox genes. Besides, the exon-intron structure of Japanese flounder sox genes were mapped by Blastn program.

2.5 Phylogenetic analysis of Sox gene

Before phylogenetic analysis, the Sox protein sequences of medaka (Oryzias latipes), spotted gar (Lepisosteus oculatus), and fruit fly (Drosophila melanogaster) were retrieved from Ensemble and NCBI. Sox proteins sequences of nine species (human, mouse, fruit fly, fugu, zebrafish, tilapia, medaka, spotted gar, and Japanese flounder) were used to construct the phylogenetic tree. The accession numbers of these protein were available in Table S1. Multiple Sequence alignment of all the Sox protein sequences were implemented by ClustalW before evolutionary tree analysis. The phylogenetic tree was constructed by the neighbor-joining method with Poisson model implemented in MEGA 7.0 program with bootstrap of 10 000 replications. Specifically, the evolutionary distances were computed using the Poisson correction method and all positions containing gaps and missing data were eliminated (Zuckerkandl and Pauling, 1965; Saitou and Nei, 1987; Kumar et al., 2016).

2.6 Analysis of Sox gene expression base on transcriptome

The transcriptomes of eleven tissues (heart, liver, spleen, kidney, brain, gill, muscle, intestines, stomach, testis, and ovary) and six embryonic development stages (Stage 1–6) were used to analyze sox gene expression in Japanese flounder. The sox gene expression levels in different tissues and stages were evaluated by normalized FPKM value. For sox gene expression in different tissues and embryonic development stages, FPKM≥1 was considered reasonable, and FPKM≥10 was considered as a threshold for high expression (Hart et al., 2013; Tsagaratou et al., 2014).

The biased/specific expression of sox genes in gonads (testis or ovary) was also analyzed based on RNA-Seq data. The testis/ovary-biased expressed candidate genes were identified when "FPKM≥1" and "|log2(FPKMovary/FPKMtestis)|≥2". The testis/ovary-specific expressed genes were identified when "FPKM≥3" in one gonads, but "FPKM≤1" in the other.

2.7 Sox gene expression quantified by qPCR

Extracted total RNA was transcribed by M-MLV Reverse Transcriptase (TaKaRa) enzyme according to the manufacturer's protocol. The primers used in this experiment were designed by Primer Primer 5.0 and listed in Table S2. Pre-experiment was conducted to test the specificity of primers. qPCR was performed in Light-Cycler 480 (Roche, Forrentrasse, Switzerland) with in 20 μL reaction volume, which contained cDNA templates 10 ng, primers (FW/RV) and 2×SYBR Green qPCR Master Mix (US Everbright Inc.). β-actin was selected as a reference gene (Zhang et al., 2013). The reaction procedure consisted of an initial polymerase activation of 5 min at 94℃, followed by 40 cycles at 94℃ (15 s) and 60℃ (45 s). The data were analyzed by the 2-ΔΔCt method. This experiment was performed according to a previous study (Gao et al., 2015).

3 RESULT 3.1 Sox gene subfamilies in Japanese flounder

Based on the relatively conserved HMG box domain of vertebrate Sox proteins (Bowles et al., 2000), the conserved HMG-box domain of vertebrate and Sox protein-coding sequences of five species were used to search against the Japanese flounder genome and transcriptome by Local Alignment Search Tool (BLAST+ 2.6.0). A total of 25 sox genes were isolated and identified from Japanese flounder genome which could be divided into seven subfamilies (Table 2), that is, subfamily B1 (including Posox1a, Posox1b, Posox2, Posox3, and Posox19), subfamily B2 (including Posox14 and Posox21), subfamily C (including Posox4a, Posox4b, Posox11a, and Posox11b), subfamily D (including Posox5, Posox6a, Posox6b, and Posox13), subfamily E (including Posox8a, Posox8b, Posox9a, Posox9b, Posox10a, and Posox10b), subfamily F (including Posox7, Posox17, and Posox18), subfamily K (including Posox32). Compare with human and mouse, which had only one sox gene copy, some teleost sox genes had two duplicates (except spotted gar (Hermansen et al., 2016)) (Table 3). These results suggested that these novel sox isoforms might derive from a teleost-specific evolutionary process.

Table 2 Members of sox gene family in Japanese flounder
Table 3 Sox gene number in Japanese flounder and the other species
3.2 Sox protein domain in Japanese flounder

The 25 Sox protein sequences identified from Japanese flounder were used to perform online protein sequence analysis by SMART ( The results showed that all the Sox proteins had a conserved HMG box of 79 amino acid residues (Gubbay et al., 1990). As shown in Fig. 1, both the motif sequence (positions 5-10) and the extended motif sequence (position 5-13) were highly conserved for all Sox sequences except PoSox32. Besides, some other fragments in the HMG box were also highly conserved. Interestingly, similar to Sox32 in medaka (Cui et al., 2011), and tongue sole (Gao et al., 2016), Japanese flounder Sox32 could be distinguished from other Sox proteins. For example, the amino acid at position 7 in PoSox32 motif was L, but it was M in the other Sox sequences. Moreover, residues at positions 11-13 in PoSox32 HMG box were IIW, which were also different from the highly conserved MVW of other Sox proteins in these positions. These results indicated that the function and evolution of Sox32 in teleosts might be more complex compared with other vertebrates.

Figure 1 Conserved HMG box domains of Japanese flounder Sox proteins All HMG box domains were predicted by SMART online program ( ClustalW and GENEDOC program were used to perform multiple sequence alignment of the amino acid sequence. B1, B2, C, D, E, F and K indicate the seven subfamilies of Japanese flounder Sox proteins. Residues in dark are conserved among all the sequences, and residues in gray are conserved in most sequences.