2.7 Gene family analyses

We identified the homologous gene families involved in flowering time, flower development, and flavonoid and carotenoid biosynthesis in S. tetraptera. Known genes from each family were downloaded and used as queries in BLASTP searches (Rédei, 2008) against the S. tetraptera genome. HMMER (Eddy, 2011) was then used to search the candidate sequences for the domains previously known from the corresponding gene families, and candidate genes lacking those domains were removed. All query sequences and previously known domains are summarized in Tables S23-24 and Tables S27-28. For each gene family, the protein sequences were aligned with MAFFT. Phylogenetic trees were constructed with IQ-TREE using default parameters (Nguyen et al., 2015) and visualized with EVOLVIEW (Z. He et al., 2016). We also predicted the transcription factors in the S. tetraptera genome using PlantRegMap (Tian, Yang, Meng, Jin, & Gao, 2020) and the PlantTFDB database (Jin et al., 2017). In addition, clusterProfiler v3.6.0 (R package) (G. Yu, Wang, Han, & He, 2012) was used to analyze the enrichment of gene families in this study.
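The two-step candidate selection described above (BLASTP hits retained only if they carry the expected domains) can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: it assumes BLAST tabular output (outfmt 6) and an HMMER per-domain table produced by hmmsearch (--domtblout), in which the target column holds the sequence ID and the query column holds the profile name. The function names and the E-value cutoffs (1e-5 and 1e-3) are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def blast_candidates(blast_tab, max_evalue=1e-5):
    """Return subject IDs from BLAST outfmt 6 text whose E-value passes the cutoff.

    The cutoff is an illustrative assumption, not a value from the study.
    """
    hits = set()
    for line in blast_tab.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        # outfmt 6 columns: qseqid, sseqid, ..., evalue (index 10), bitscore
        sseqid, evalue = fields[1], float(fields[10])
        if evalue <= max_evalue:
            hits.add(sseqid)
    return hits


def domains_per_sequence(domtbl, max_ievalue=1e-3):
    """Map each sequence ID to the set of profile names it matches.

    Assumes an hmmsearch --domtblout layout: column 0 is the sequence,
    column 3 is the profile, column 12 is the independent E-value.
    """
    found = defaultdict(set)
    for line in domtbl.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        seq_id, profile, ievalue = fields[0], fields[3], float(fields[12])
        if ievalue <= max_ievalue:
            found[seq_id].add(profile)
    return found


def filter_candidates(candidates, domains, required):
    """Keep candidates that carry every required domain; return a sorted list."""
    required = set(required)
    return sorted(c for c in candidates if required <= domains.get(c, set()))
```

In use, the output of `blast_candidates` and `domains_per_sequence` would be passed to `filter_candidates` together with the domain names expected for a given family; requiring all listed domains is a design choice of this sketch, and a family defined by any one of several domains would need the subset test relaxed accordingly.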