2.9 Comparative genomic and phylogenetic analysis
To identify gene families for phylogenetic tree construction, we compared the genome assembly of C. sonnerati with those of other fishes: Epinephelus lanceolatus, Plectropomus leopardus, Epinephelus akaara, Oreochromis niloticus, Lates calcarifer, Gymnodraco acuticeps, Pseudochaenichthys georgianus, Cyclopterus lumpus, Danio rerio, Salmo salar, Monopterus albus, Gadus morhua, Oncorhynchus mykiss, and Oryzias latipes. Latimeria chalumnae was used as the outgroup. The protein sequences of all species were extracted and aligned against each other using BLASTP (NCBI BLAST v2.6.0; Camacho et al., 2009) with a maximum e-value of 1e-5. OrthoFinder (Emms & Kelly, 2015) was then used to cluster the genes from these species into gene families.
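As an illustration of the two selection steps involved here, the following minimal Python sketch filters BLASTP hits by e-value and then picks out gene families with exactly one member per species. It is not the pipeline used in this study. It assumes that the BLASTP hits are available in the default 12-column tabular layout (e-value in column 11) and that the clustering result has been summarised as per-species gene counts for each family. The function names, gene identifiers, and toy data are hypothetical.

```python
E_VALUE_CUTOFF = 1e-5  # threshold stated in the text


def filter_blast_hits(lines, max_evalue=E_VALUE_CUTOFF):
    """Yield (query, subject, evalue) for hits at or below the cutoff.

    Assumes the default 12-column BLAST tabular layout, in which the
    11th column holds the e-value (an assumption about the output format).
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        evalue = float(fields[10])
        if evalue <= max_evalue:
            yield fields[0], fields[1], evalue


def single_copy_families(gene_counts, species):
    """Return the IDs of families with exactly one gene in every species."""
    return sorted(
        family
        for family, counts in gene_counts.items()
        if all(counts.get(sp, 0) == 1 for sp in species)
    )


# Hypothetical toy data, for illustration only.
toy_hits = [
    "geneA\tgeneB\t95.0\t200\t10\t0\t1\t200\t1\t200\t1e-50\t380",
    "geneA\tgeneC\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t25.0",
]
toy_species = ["C_sonnerati", "E_lanceolatus", "D_rerio"]
toy_counts = {
    "FAM1": {"C_sonnerati": 1, "E_lanceolatus": 1, "D_rerio": 1},
    "FAM2": {"C_sonnerati": 2, "E_lanceolatus": 1, "D_rerio": 1},
    "FAM3": {"C_sonnerati": 1, "E_lanceolatus": 1},  # absent in D_rerio
}

kept_hits = list(filter_blast_hits(toy_hits))
single_copy = single_copy_families(toy_counts, toy_species)
```

In this toy example only the first hit passes the e-value cutoff, and only FAM1 qualifies as a single-copy family, because FAM2 is duplicated in one species and FAM3 is missing from another.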
To resolve the phylogenetic relationships between C. sonnerati and the aforementioned fishes, protein sequences from 678 single-copy orthologous gene clusters were used for phylogenetic tree reconstruction. The protein sequences of each single-copy orthologous gene were aligned with MUSCLE (v3.8.31) (Edgar, 2004), and the corresponding coding DNA sequence (CDS) alignments were generated under the guidance of the protein alignments and then concatenated. RAxML (v8.2.11) (Stamatakis, 2014) was used to construct the phylogenetic tree with the maximum likelihood method. The phylogenetic relationships recovered among the other fishes were consistent with previous studies. The MCMCTree program of the PAML package (Yang, 2007) was used to estimate the divergence times among species.
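The protein-guided CDS alignment and concatenation described above can be sketched as follows. The text does not name the tool used for this step, so this is only an assumed, minimal illustration of the general idea: each aligned residue is replaced by its codon, each protein gap becomes a three-nucleotide gap, and the per-gene alignments are joined into one sequence per species. The function names and toy sequences are hypothetical.

```python
def thread_cds_onto_protein(aligned_protein, cds):
    """Build a codon alignment row from an aligned protein and its unaligned CDS.

    Each residue consumes one codon from the CDS; each '-' in the protein
    alignment becomes '---'. A trailing stop codon, if present, is dropped.
    """
    n_residues = sum(1 for aa in aligned_protein if aa != "-")
    if len(cds) == 3 * (n_residues + 1):
        cds = cds[:-3]  # assume the extra codon is a terminal stop
    if len(cds) != 3 * n_residues:
        raise ValueError("CDS length does not match the protein sequence")

    codons = []
    pos = 0
    for aa in aligned_protein:
        if aa == "-":
            codons.append("---")
        else:
            codons.append(cds[pos:pos + 3])
            pos += 3
    return "".join(codons)


def concatenate_alignments(alignments, species):
    """Join per-gene alignments (dicts of species -> row) into one row per species."""
    return {sp: "".join(aln[sp] for aln in alignments) for sp in species}


# Hypothetical toy data: two genes, two species.
gene1 = {
    "sp1": thread_cds_onto_protein("M-K", "ATGAAATAA"),  # stop codon trimmed
    "sp2": thread_cds_onto_protein("MPK", "ATGCCCAAA"),
}
gene2 = {"sp1": "GGG", "sp2": "GGA"}
supermatrix = concatenate_alignments([gene1, gene2], ["sp1", "sp2"])
```

Working in codon units keeps the reading frame intact across gaps, which is the main reason to derive the nucleotide alignment from the protein alignment rather than aligning the CDS directly.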