1 Introduction
The Pyraloidea, with more than
16,000 described species worldwide, is one of the largest groups in
Lepidoptera, and it is composed of two families: Pyralidae and
Crambidae, with Crambidae accounting for 60% of the species (Munroe &
Solis, 1999; Nuss et al., 2023). Regier et al. (2012) presented the most
detailed molecular estimate to date of relationships among the
subfamilies of Pyraloidea, based on five nuclear genes, in which
Crambidae was divided into three major lineages:
the “PS clade” (Pyraustinae, Spilomelinae, and Wurthiinae), the “OG
clade” (Evergestinae, Glaphyriinae, Noordinae and Odontiinae), and the
“CAMMSS clade” (Acentropinae, Crambinae, Musotiminae, Midilinae,
Scopariinae and Schoenobiinae), forming a system of PS clade + (OG clade
+ CAMMSS clade). However, judging from the tree topologies of Pyraloidea
recovered from mitogenomic data, the phylogenetic relationships
within the “non-PS clade” were not fully resolved in previous studies
(Yang et al., 2018b; Zhang et al., 2020; Qi et al., 2021; Liu et al.,
2021). More molecular data, such as mitogenomes, are needed to
clarify the phylogenetic relationships among the subfamilies of Crambidae.
Spilomelinae is the most species-rich subfamily in Crambidae, with 4,135
described species in 344 genera (Nuss et al., 2023). Currently, a total
of 13 tribes in Spilomelinae have been defined by Mally et al. (2019)
based on six molecular markers (COI, CAD, EF-1α, GAPDH, IDH and RpS5)
and 114 adult morphological characters, including: Hydririni, Udeini,
Lineodini, Wurthiini, Agroterini, Margaroniini, Spilomelini,
Herpetogrammatini, Hymeniini, Asciodini, Trichaeini, Steniini and
Nomophilini. Among them, Trichaeini has the lowest species
richness, with only four genera and 22 species (Nuss et al., 2023). This
tribe includes the genus Prophantis Warren, 1896, which consists
of eight species, all of which remain poorly studied beyond their
original descriptions (Warren, 1896).
Only Prophantis octoguttalis Felder & Rogenhofer, 1875 and P. adusta
Inoue, 1986 have been recorded from China. P. octoguttalis, the type
species of the genus, is widespread, occurring mainly in
southern China, Australia, India, and the Afrotropical region (Wang,
1980; Ratnasingham & Hebert, 2007). Its larvae feed on Coffea
arabica Linnaeus, 1757, and a single larva can damage several berries in
succession, which can seriously reduce coffee production
(Wang, 1980). The adults of P. adusta are very similar in appearance to
those of P. octoguttalis, which makes identification of these species
very challenging.
The mitochondrial genome (mtDNA) is a closed circular, double-stranded
DNA molecule that varies considerably in length among taxa. The mtDNA of
lepidopteran insects is generally 15–16 kb in size and contains 37
genes, comprising 13 protein-coding genes (PCGs), 22 transfer RNA genes
(tRNAs), and two ribosomal RNA genes (rRNAs), together with a control
region of variable length, also known as the A+T-rich region or D-loop
(Boore, 1999). Because of its conserved gene content, compact
arrangement, fast evolutionary rate, and maternal inheritance, it
carries genetic information that is well suited to phylogenetic
studies (Wesley et al., 1979; Cameron, 2014). The mtDNA has been widely
used in studies of molecular phylogeny, phylogeography and genetic
differentiation (Heise et al., 1995; Suzuki et al., 2013; Wang et al.,
2019).
To date, only 23 mitogenomes of Spilomelinae have been published in
GenBank, and no mitogenome of Trichaeini has been reported. In this
study, we sequenced the mitogenomes of P. octoguttalis and P. adusta,
both of Trichaeini, for the first time and performed a
preliminary bioinformatic analysis to characterize the
features of Trichaeini mitogenomes. In addition, to assess the
phylogenetic position of Trichaeini within Spilomelinae as indicated by
mitochondrial genomes, we reconstructed phylogenetic trees from
the mitogenome data of these two species together with the other
Crambidae mitogenomes available in GenBank, using maximum likelihood and
Bayesian inference methods. This study provides new perspectives and
genomic data for phylogenetic research on Trichaeini and
Spilomelinae.
2 Materials and methods
2.1 Specimen collection and DNA sequencing
The specimen of Prophantis octoguttalis investigated was
collected from Wuzhi Mountain in Hainan Province, China, in March 2021;
the specimen of P. adusta was collected from Fanjing Mountain in
Guizhou Province, China, in September 2020. Fresh specimens obtained by
light trapping were soaked in anhydrous alcohol and stored at -80 °C in
the Insect Collection of Southwest University, Chongqing, China. DNA was
extracted from the thoracic muscle of each specimen. The extracted DNA
was sent to BGI Genomics for next-generation sequencing.
2.2 Sequence assembly, annotation and analysis
The high-quality reads (clean data) of each sample, which were trimmed by
BGI Genomics, were saved in FASTQ format and imported into Geneious
Prime v2022.1.1. The available mitogenome most closely related to the
sample was downloaded from GenBank as a reference sequence, and the
sequence was extended iteratively using the “Map to reference” function
until the two ends overlapped with repeated bases, indicating that the
mitochondrial genome had been assembled into a closed circle.
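The circularization criterion described above (the extended sequence begins to repeat itself at its ends) can be illustrated with a minimal Python sketch. This is not the Geneious procedure itself. The function names `terminal_overlap` and `circularize` and the `min_len` threshold are hypothetical choices made for illustration, and the sketch assumes an exact, error-free overlap between the two ends.

```python
def terminal_overlap(contig: str, min_len: int = 20) -> int:
    """Return the length of the longest suffix of `contig` that is identical
    to its prefix (at least `min_len` bases), or 0 if there is none."""
    seq = contig.upper()
    # Try the longest candidate first; cap at half the contig length so that
    # the prefix and suffix are distinct regions of the sequence.
    for k in range(len(seq) // 2, min_len - 1, -1):
        if seq[:k] == seq[-k:]:
            return k
    return 0


def circularize(contig: str, min_len: int = 20) -> str:
    """Trim the redundant terminal copy so that each base of the circular
    molecule is represented exactly once."""
    k = terminal_overlap(contig, min_len)
    return contig[: len(contig) - k] if k else contig


if __name__ == "__main__":
    # Toy example: a 30-bp "genome" extended 8 bp past its own start.
    core = "ATGCGTACGTTAGCCTAGGATCCGATTACA"
    extended = core + core[:8]
    print(terminal_overlap(extended, min_len=5))
    print(circularize(extended, min_len=5) == core)
```

In practice the overlap threshold would need to be long enough to exclude chance matches, and tandem repeats in the A+T-rich region could produce spurious overlaps, so a result like this is best treated as a check rather than proof of circularity.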
MAFFT (Multiple Alignment using Fast Fourier Transform) was
used to align the reference sequence with the sample sequence, and
protein-coding genes (PCGs) were identified based on sequence similarity
to the reference genes. Using EditSeq v7.1.0, PCGs were translated
into amino acids to verify the start codon, stop codon, and amino acid
sequence of each gene. The
location and secondary structure of tRNA genes were predicted using the
MITOS Web Server (Donath et al., 2019), and the secondary structure
diagrams were drawn using Adobe Illustrator v26.0. rRNA genes are
relatively conserved, and their boundaries were determined from the
positions of the flanking genes (Boore, 2006). The A+T-rich region was generally located
behind the rrnL gene. Mitogenome maps were generated using
Proksee (https://proksee.ca/). Sequence length, base composition,
intergenic spacers, and gene overlaps were examined directly in Geneious
Prime v2022.1.1. Base skew was calculated using the formulas AT skew = (A
− T) / (A + T) and GC skew = (G − C) / (G + C) (Perna & Kocher, 1995).
Relative synonymous codon usage (RSCU) was analyzed using MEGA v10.2.5.
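As an illustration of the two statistics in this paragraph, the following Python sketch computes AT skew and GC skew exactly as defined above, and RSCU under its usual definition (the observed count of a codon divided by the mean count of all codons in its synonymous family). It is not the Geneious or MEGA implementation. The function names and the `families` argument are assumptions made for this example, and the caller must supply the synonymous codon families appropriate to the genetic code being used.

```python
from collections import Counter


def base_skews(seq: str) -> dict:
    """AT skew = (A - T) / (A + T); GC skew = (G - C) / (G + C)."""
    counts = Counter(seq.upper())
    a, t, g, c = (counts[b] for b in "ATGC")
    total = a + t + g + c
    return {
        "AT_skew": (a - t) / (a + t) if (a + t) else 0.0,
        "GC_skew": (g - c) / (g + c) if (g + c) else 0.0,
        "AT_content": (a + t) / total if total else 0.0,
    }


def rscu(cds: str, families: dict) -> dict:
    """Relative synonymous codon usage for an in-frame coding sequence.

    `families` maps a label (e.g. an amino acid) to its list of synonymous
    codons; it is supplied by the caller and is an assumption of this sketch.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore an incomplete trailing codon
    codons = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    result = {}
    for synonymous in families.values():
        family_total = sum(codons[codon] for codon in synonymous)
        for codon in synonymous:
            # observed count divided by the mean count within the family
            result[codon] = (
                codons[codon] * len(synonymous) / family_total
                if family_total else 0.0
            )
    return result


if __name__ == "__main__":
    print(base_skews("AAATGCC"))
    # Toy family containing only two codons, for illustration.
    print(rscu("TTATTATTGCTT", {"toy_family": ["TTA", "TTG"]}))
```

A negative AT skew means T exceeds A on the strand analyzed, and an RSCU above 1 means a codon is used more often than expected under equal usage of its synonyms.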
2.3 Phylogenetic analysis
A total of 55 mitogenome sequences (2 newly determined in this study, 53
available from GenBank) were used to construct the phylogenetic tree.
The ingroup included five species of Acentropinae, five species of
Crambinae, one species of Glaphyriinae, three species of Odontiinae,
eight species of Pyraustinae, one species of Schoenobiinae, one species
of Scopariinae and 25 species of Spilomelinae.
Four species of Pyralidae (Lista haraldusalis, Galleria mellonella,
Dioryctria yiai and Pyralis farinalis), Bombyx mori of Bombycidae and
Helicoverpa armigera of Noctuidae were selected as outgroups (Table 1).
We used two datasets: 1) PCG123: all three codon positions of 13
protein-coding genes; 2) PCG123RT: all three codon positions of 13
protein-coding genes, two rRNA genes and 22 tRNA genes. Maximum
likelihood (ML) and Bayesian inference (BI) were used to construct
phylogenetic trees.
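The two datasets differ only in which aligned genes are concatenated. A minimal Python sketch of that concatenation step is given below, assuming each gene has already been aligned separately. The function name `concatenate`, the input layout (gene name mapped to a dictionary of taxon and aligned sequence) and the toy gene and taxon names are hypothetical and are not taken from the study's actual pipeline.

```python
def concatenate(alignments: dict, gene_order: list):
    """Concatenate per-gene alignments into one supermatrix.

    alignments: {gene: {taxon: aligned_sequence}}
    gene_order: genes to include, in order (e.g. the 13 PCGs for PCG123,
                or PCGs + rRNAs + tRNAs for PCG123RT).
    Returns (supermatrix, partitions), where partitions is a list of
    (gene, start, end) tuples with 1-based inclusive coordinates.
    """
    taxa = sorted({t for gene in gene_order for t in alignments[gene]})
    pieces = {t: [] for t in taxa}
    partitions = []
    start = 1
    for gene in gene_order:
        aln = alignments[gene]
        lengths = {len(s) for s in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to one length")
        n = lengths.pop()
        for t in taxa:
            # A taxon lacking this gene is padded with gaps.
            pieces[t].append(aln.get(t, "-" * n))
        partitions.append((gene, start, start + n - 1))
        start += n
    return {t: "".join(p) for t, p in pieces.items()}, partitions


if __name__ == "__main__":
    toy = {
        "cox1": {"sp1": "ATGAAA", "sp2": "ATGAAG"},
        "rrnL": {"sp1": "TTAA"},
    }
    pcg_matrix, pcg_parts = concatenate(toy, ["cox1"])
    all_matrix, all_parts = concatenate(toy, ["cox1", "rrnL"])
    print(all_matrix, all_parts)
```

The partition coordinates returned here are the kind of per-gene boundaries a partitioned model search would take as input.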
ModelFinder (Kalyaanamoorthy et al., 2017) was used to select the best
partitioning scheme and nucleotide substitution models for ML and BI
under the Bayesian information criterion (BIC). ML
analysis was performed using IQ-TREE v1.6.8 (Minh et al., 2013; Nguyen
et al., 2015) with 1,000 standard bootstrap replicates;
bootstrap values (BS) > 70% were considered to
represent high confidence. BI analysis was performed using MrBayes
v3.2.6 with the following settings: two independent runs, each with
four Markov chain Monte Carlo (MCMC) chains (three heated
chains and one cold chain), run for 1 × 10⁷ generations and sampled
every 1,000 generations. The
initial 25% of the sampled trees were discarded as burn-in. Chain
convergence was assumed when the average standard deviation of split
frequencies fell below 0.01. Nodes of the BI tree with Bayesian
posterior probabilities greater than or equal to
0.95 were considered highly supported. The phylogenetic trees were
visualized using FigTree v1.4.4.
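The sampling arithmetic and the support thresholds stated above can be made explicit with a short Python sketch. The constants are taken from the text. The function names, and the `include_start` option for programs that also record the starting state as a sample, are assumptions made for illustration and do not reproduce any MrBayes or IQ-TREE interface.

```python
NGEN = 10_000_000      # generations per run (1 x 10^7)
SAMPLE_FREQ = 1_000    # one sample every 1,000 generations
BURNIN_FRAC = 0.25     # initial fraction of samples discarded


def retained_samples(ngen=NGEN, samplefreq=SAMPLE_FREQ,
                     burnin_frac=BURNIN_FRAC, include_start=False):
    """Return (sampled, burnin, retained) tree counts for one run."""
    sampled = ngen // samplefreq + (1 if include_start else 0)
    burnin = int(sampled * burnin_frac)
    return sampled, burnin, sampled - burnin


def bootstrap_is_high(bs_percent: float) -> bool:
    """ML criterion from the text: bootstrap support strictly above 70%."""
    return bs_percent > 70.0


def posterior_is_high(pp: float) -> bool:
    """BI criterion from the text: posterior probability of at least 0.95."""
    return pp >= 0.95


if __name__ == "__main__":
    print(retained_samples())
    print(bootstrap_is_high(70.0), posterior_is_high(0.95))
```

Note the asymmetry between the two criteria: a node with exactly 70% bootstrap support does not qualify, whereas a posterior probability of exactly 0.95 does.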