3.3 Genome annotation
A total of 79,136,004 bp of repetitive sequences were identified in the S. chinensis genome, yielding a repeat percentage of 29% (Table
S6). A total of 14,089 genes (15,987 transcripts) were predicted to
encode proteins. Of the annotated genes, 97.37% were located on
the 13 chromosome-level scaffolds (Figure 2B). The average CDS length,
exon number per gene, exon length and intron length were 1,536 bp, 7.3,
212 bp and 910 bp, respectively, similar to those in most of the
reported aphid species (Table S7, Figure S2). According to our results,
96.9%, 97.7%, 97.8% and 96.7% of BUSCO genes were
identified in the S. chinensis genome in comparison with the
Eukaryota, Arthropoda, Hemiptera and Insecta datasets, respectively,
demonstrating the completeness of the gene set (Figure 4B). The
percentage of RNA-Seq reads assigned to the gene set was up to 80% (Table
S3). Among the 14,089 predicted genes, 12,584 (89.32%) were
functionally annotated, including 9,272 (65.81%) genes annotated via the GO
database and 7,285 (51.71%) genes via the KEGG database (Table 2).
Non-coding RNAs (ncRNAs) were also identified in the S. chinensis
genome, including 130 tRNAs, 29 rRNAs, 29 miRNAs, and 72 snRNAs (Table
S8).
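The annotation percentages reported above can be checked with simple arithmetic. The following is a minimal sketch, assuming that each percentage uses the 14,089 predicted protein-coding genes as its denominator; the variable and function names are illustrative and do not come from the annotation pipeline.

```python
# Gene counts as reported in this section.
TOTAL_GENES = 14_089  # predicted protein-coding genes (assumed denominator)

annotated_counts = {
    "functionally annotated": 12_584,
    "GO": 9_272,
    "KEGG": 7_285,
}


def percent(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


# Recompute each percentage from the raw counts.
percentages = {
    name: percent(count, TOTAL_GENES)
    for name, count in annotated_counts.items()
}

for name, pct in percentages.items():
    print(f"{name}: {pct:.2f}%")
```

Under this assumption the recomputed values agree with the reported 89.32%, 65.81% and 51.71%.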
3.4 Phylogenetic analysis
Protein sequences of S. chinensis and eight other closely related
species were retrieved from public databases, along with B. tabaci as
an outgroup. A total of 3,479 single-copy orthologous groups extracted by
OrthoMCL were used to construct the phylogenetic tree. The
results showed that S. chinensis was a sister taxon to the woolly
apple aphid E. lanigerum. The two Eriosomatinae species diverged
from their common ancestor approximately 57 million years ago (MYA)
(Figure 5). Eriosomatinae and
Aphidinae (including Ap. glycines, R. maidis, Ac.
pisum, M. persicae and D. noxia) diverged from their
common ancestor about 63 MYA, similar to the estimate of a previous study (Mather
et al., 2020). Compared with the subfamily Chaitophorinae (including
S. flava) in the family Aphididae, the subfamily Eriosomatinae
has a closer relationship with the subfamily Aphidinae. Significant
expansion and contraction of gene families are usually related to the
adaptive divergence of species. To elucidate the key genomic changes
associated with adaptation, the expansion and contraction of gene families
were analyzed in all the nine aphids and B. tabaci. Eriosomatinae
showed 40 expanded and 986 contracted gene families compared with those
of the common ancestor of Aphidinae and Eriosomatinae (Figure S4A). KEGG
and GO enrichment analyses suggested that most of the expanded genes
were involved in the detoxification of natural xenobiotics from plants
(Figure S4B, S4C). The S.
chinensis genome displayed 235 expanded and 1,037 contracted gene
families compared with those of the common ancestor. KEGG pathway enrichment
analysis suggested that most of the expanded gene families were involved
in the IL-17 signaling pathway, arachidonic acid metabolism, the NF-kappa B
signaling pathway, ovarian steroidogenesis, the VEGF signaling pathway,
necroptosis, regulation of lipolysis in adipocytes, the TNF signaling
pathway, and the C-type lectin receptor signaling pathway (Figure S4E).
Similarly, GO annotation analysis revealed that most of the expanded gene
families were involved in prostaglandin-endoperoxide synthase activity,
arachidonate 15-lipoxygenase activity, nucleosomes, ovarian cumulus
expansion, intrinsic apoptotic signaling pathway in response to osmotic
stress, regulation of fever generation, regulation of platelet-derived
growth factor production, response to lead ion, and chromatin assembly
or disassembly (Figure S4D, Table S9). The expanded gene families of the
S. chinensis genome were enriched not only in detoxification but
also in the immune system.
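
The text does not state which statistical test underlies the KEGG and GO enrichment results. A common choice for this kind of over-representation analysis is the one-sided hypergeometric test, and the following minimal sketch illustrates it under that assumption. The function name `enrichment_p_value` and all counts except the 14,089-gene background are hypothetical and serve only as an illustration, not as a reconstruction of the analysis performed here.

```python
from math import comb


def enrichment_p_value(overlap: int, background: int,
                       in_term: int, in_set: int) -> float:
    """One-sided hypergeometric p-value P(X >= overlap).

    background: number of annotated genes in the genome
    in_term:    background genes assigned to the pathway or GO term
    in_set:     genes in the set of interest (e.g. expanded families)
    overlap:    genes of the set of interest assigned to the term
    """
    upper = min(in_term, in_set)
    total = comb(background, in_set)
    # Sum the exact probabilities of seeing `overlap` or more term genes.
    favourable = sum(
        comb(in_term, i) * comb(background - in_term, in_set - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


# Illustrative call: only the background size comes from the text.
background_genes = 14_089  # predicted protein-coding genes
term_genes = 120           # hypothetical genes assigned to one pathway
expanded_genes = 300       # hypothetical genes in expanded families
observed_overlap = 12      # hypothetical expanded genes in that pathway

p_value = enrichment_p_value(
    observed_overlap, background_genes, term_genes, expanded_genes
)
print(f"p = {p_value:.3g}")
```

In practice such p-values are computed for every term and then corrected for multiple testing before terms are reported as enriched.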