Genome evolution analysis
To investigate genome evolution, orthologous and paralogous genes were analyzed. CDSs of seven species, including D. rerio ,Xiphophorus maculatus , G. aculeatus , Datnioides undecimradiatus , Collichthys lucidus , L. crocea andMiichthys miiuy , were obtained from Ensembl or NCBI. The longest CDS of each gene, encoding at least 50 amino acids, was used. An all-to-all BLAST search with an E-value threshold of 1e-07 was performed, and genes were clustered into families using OrthoMCL (L. Li, Stoeckert, & Roos, 2003). Single-copy gene families were used to construct phylogenetic trees with a Bayesian model in PhyML (Guindon et al., 2010), and divergence times were estimated with PAML mcmctree (Yang, 1997). Calibration times were sourced from TimeTree (http://www. timetree. org/), including divergence times between D. rerio andL. calcarifer (80-252 Mya), D. rerio and X. maculatus (180-252 Mya), L. crocea and C. lucidus (4-24 Mya), L. crocea and M. miiuy (10-91 Mya), X. maculatus and G. aculeatus (100-130 Mya), and D. undecimradiatus and C. lucidus (63-86 Mya).
Gene family expansion or contraction was analyzed using CAFÉ (http://sourceforge. net/projects/cafehahnlab/) based on OrthoMCL and phylogenetic tree results. Positive selection analysis was performed using single-copy gene families as described by Zhou et al. (2019). Gene enrichment in GO and KEGG pathways was assessed with Fisher’s exact test, applying a false discovery rate correction (FDR < 0.05).
To identify positively selected genes involved in otolith development, we first selected the CDSs of seven key genes: otoconin-90 (oc90 ), otolith matrix protein 1 (omp1 ), otolin-1a (otol1a ), otolin-1b (otol1b ), otopetrin-1 (otop1 ), the secreted protein acidic and rich in cysteine (sparc ), and SPARC Like 1 (sparcl1 ) from seven species: D. rerio ,X. maculatus , G. aculeatus , C. lucidus , L. crocea , D. labrax , Takifugu rubripes and B. taipingensis . Then we conducted multiple-sequence alignments of these genes by MEGA11, and using PAML to calculate the non-synonymous substitution rate (Ka) and synonymous substitution rate (Ks) . We applied the branch-site model (cleandata=1) and selected the branch of large yellow croaker (L. crocea ) and Chinese bahaba as foreground branch for analysis. We selected the results which have a p-values positively selected genes. Finally, we constructed a maximum likelihood phylogenetic tree and enhanced its appearance using Evolview (http://www.evolgenius.info/evolview/#/treeview).