Materials and methods
Ethics statement
The sampling procedures used in this study were approved by the Ethics
Committee of Hainan Medical University (approval number:
HMUEC20180059).
Swab samples from rodents
We collected 588 throat and anal swab samples from 326 rodents at
multiple locations in 9 counties or cities of Hainan Province, a
tropical island province of China, from May 2017 to June 2021. Samples
were grouped into pools (15-20 samples per pool) according to sampling
location and species. Morphology was combined with mitochondrial
cytochrome b (mt-cyt b) sequencing to identify murine species and to
analyse the congruence between viruses and their hosts (Table S1). The
collected samples were immediately immersed in maintenance medium in
virus-sampling tubes (Yocon Biology, Beijing, China) to preserve sample
quality and were transported to the laboratory within 24 h under a
cold chain [6]. On arrival at the laboratory, each sample was divided
into three equal aliquots and stored at -80 °C until further use, as
described previously [6].
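The pooling step described above can be illustrated with a minimal
sketch. It assumes each sample is a record with hypothetical `id`,
`location`, and `species` fields, and it caps pools at 20 samples; the
field names and the chunking rule are illustrative assumptions, not the
procedure actually used in this study.

```python
from collections import defaultdict


def make_pools(samples, key_fields=("location", "species"), max_pool_size=20):
    """Group sample IDs by the given key fields, then split each group
    into pools of at most max_pool_size samples (hypothetical rule)."""
    groups = defaultdict(list)
    for sample in samples:
        key = tuple(sample[field] for field in key_fields)
        groups[key].append(sample["id"])

    pools = []
    for key in sorted(groups):
        ids = groups[key]
        # Chunk each group; the last chunk of a group may be smaller.
        for start in range(0, len(ids), max_pool_size):
            pools.append((key, ids[start:start + max_pool_size]))
    return pools


# Hypothetical example records, not data from the study.
example_samples = (
    [{"id": f"A{i}", "location": "site1", "species": "sp1"} for i in range(25)]
    + [{"id": f"B{i}", "location": "site2", "species": "sp1"} for i in range(3)]
)
example_pools = make_pools(example_samples)
for key, ids in example_pools:
    print(key, len(ids))
```

Small groups yield pools below the 15-20 range in this sketch; how such
remainders were handled in practice is not specified in the text.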
Viral nucleic acid library construction and NGS
The 588 samples were combined into 28 pools based on swab type and
sampling location. Swab samples were passed through 0.45 µm filters
(MilliporeSigma, Burlington, MA, USA) to remove eukaryotic and
bacteria-sized particles. The filtrate was ultracentrifuged at
100,000 ×g and 4 °C for 3 h. The pellets collected from the 28 pools
were resuspended in Hank’s balanced salt solution and digested with
DNase (Applied Biosystems, Thermo Fisher Scientific, Waltham, MA, USA)
to degrade and remove unprotected nucleic acids. Viral RNA was
extracted using a QIAamp Viral RNA Mini Kit (Qiagen, Hilden, Germany).
Viral cDNA was generated using primer K-8N and SuperScript III Reverse
Transcriptase (Invitrogen, Thermo Fisher Scientific), as previously
described [20]. Sequence-independent polymerase chain reaction (PCR)
amplification products were purified and size-selected using magnetic
beads. Viral nucleic acid libraries, constructed after pre-processing
steps such as sonication, were sequenced on an Illumina HiSeq 2500
sequencer (Illumina Inc., San Diego, CA, USA) with single-end reads of
150 bp. Sequence data were deposited in the National Center for
Biotechnology Information (NCBI) Sequence Read Archive under the
accession number PRJNA892773.
Metagenomic sequencing
Quality-controlled reads from each sample were assembled using Trinity
v2.5.1 [21]. The contigs were compared against the NCBI non-redundant
protein database using DIAMOND. All blastx hits were annotated with
taxonomy IDs using an in-house database that combined the NCBI
accession-to-taxid files
(ftp://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid/) with other
information obtained from the NCBI Entrez server. Non-redundant viral
contigs longer than 500 nucleotides (nt) were selected. Contigs related
to bacteriophages, plant viruses, and insect viruses were excluded. Read
mapping was performed using BWA-MEM [22]. Low-quality mappings were
removed using CoverM (v0.6.1, https://github.com/wwood/CoverM) with two
parameters: --min-read-percent-identity 0.90 and
--min-read-aligned-percent 0.75. The relative viral abundance was also
calculated with CoverM using the transcripts per million (TPM) method. A
heatmap visualising the mouse-related virome profile was plotted
using the pheatmap package in R.
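The mapping filter and the TPM normalisation described above can be
sketched as follows. This is a simplified illustration rather than
CoverM's implementation: identity is taken here as matching bases over
aligned bases, aligned fraction as aligned bases over read length, and
all function names, contig names, and numbers are hypothetical.

```python
def keep_alignment(read_length, aligned_bases, matching_bases,
                   min_identity=0.90, min_aligned=0.75):
    """Return True if a read alignment passes both thresholds
    (simplified definitions, assumed for illustration)."""
    if read_length <= 0 or aligned_bases <= 0:
        return False
    identity = matching_bases / aligned_bases
    aligned_fraction = aligned_bases / read_length
    return identity >= min_identity and aligned_fraction >= min_aligned


def tpm(read_counts, contig_lengths):
    """Compute transcripts per million: length-normalise the read counts,
    then scale so that the values of all contigs sum to one million."""
    rates = {name: read_counts[name] / contig_lengths[name]
             for name in read_counts}
    total_rate = sum(rates.values())
    if total_rate == 0:
        return {name: 0.0 for name in rates}
    return {name: rate / total_rate * 1_000_000
            for name, rate in rates.items()}


# Hypothetical counts and lengths (nt) for two viral contigs.
example_tpm = tpm({"contig_a": 100, "contig_b": 100},
                  {"contig_a": 1000, "contig_b": 3000})
print(example_tpm)
```

Because TPM divides by contig length before scaling, two contigs with
equal read counts receive different abundances when their lengths
differ, which makes values comparable across contigs within a pool.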
Viral genome sequencing
Based on the metagenomic results, sequence reads were classified into
viral families or genera using MEGAN. To obtain partial or complete
genomes, reads related to representative viral open reading frames
(ORFs) were selected for read-based PCR and sequencing. Reads with
unambiguous genomic locations were used to design specific nested PCR
primers and to recover partial genomes, as described previously [6].
cDNA was generated using random primers and SuperScript III Reverse
Transcriptase (Invitrogen). The remaining genomic sequences were
obtained by genome walking and 5′ and 3′ rapid amplification of cDNA
ends (Invitrogen; Takara Bio, Kyoto, Japan). The primers used to
amplify the sequences obtained in this study are listed in Table S2.
Genome annotation
The basic genome organisation, together with the locations of putative
ORFs and the amino acid sequences they encode, was deduced by
comparison with sequences of other virus families. Conserved protein
families and domains were predicted using Pfam
(http://www.ebi.ac.uk/services/proteins), BLASTP
(https://blast.ncbi.nlm.nih.gov), and InterProScan 5
(http://www.ebi.ac.uk/services/proteins). Routine sequence
alignment was performed using Clustal Omega
(http://www.ebi.ac.uk/Tools/). Phylogenetic trees were visualised and
annotated using iTOL (https://itol.embl.de/).
Viral prevalence
Nearly complete or partial genomic sequences of the viruses obtained by
NGS were used as templates to design specific semi-nested primers
targeting non-structural genes. These primers were used for PCR
screening of each filtrate sediment and each individual sample
(Table S2). PCR was performed using GoTaq Colorless Master Mix
(Promega). For the nested PCR, 2 µL of the first-round PCR product was
used as the template for the second round, which increases specificity
and sensitivity. The thermal cycling conditions for both PCRs were
94 °C for 5 min, followed by 35 cycles of 94 °C for 30 s, 57 °C for
35 s, and 72 °C for 30 s, and a final elongation step at 72 °C for
10 min. The PCR products were analysed using 1.5% agarose gel
electrophoresis and ultraviolet imaging.
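As a worked check of the cycling programme stated above, the following
sketch encodes the steps and sums their hold times. The `Step` class
and variable names are illustrative assumptions, and ramp times between
temperatures are ignored, so the actual run time on an instrument is
longer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # hold temperature in degrees Celsius
    seconds: int    # hold time in seconds


# Programme as stated in the text, applied to both PCR rounds.
initial_denaturation = Step(94, 5 * 60)
cycle = (Step(94, 30), Step(57, 35), Step(72, 30))
n_cycles = 35
final_elongation = Step(72, 10 * 60)

# Total hold time: initial step + repeated cycles + final step.
seconds_per_cycle = sum(step.seconds for step in cycle)
total_seconds = (initial_denaturation.seconds
                 + n_cycles * seconds_per_cycle
                 + final_elongation.seconds)

print(f"{total_seconds} s = {total_seconds / 60:.1f} min")
# prints "4225 s = 70.4 min"
```

Each round therefore takes roughly 70 min of hold time, or about
141 min for the two rounds of a nested PCR before ramping is counted.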
Phylogenetic and data analyses
Nucleotide sequences and deduced amino acid sequences were aligned
using the ClustalW package
(https://rdrr.io/bioc/muscle/man/muscle-package.html) with default
parameters in MEGA X. Relatively conserved viral sequences were
selected to construct phylogenetic trees using the maximum likelihood
method. The best-fit substitution model was selected using the model
selection function of MEGA X, and tree robustness was assessed with
1000 bootstrap replicates. The NCBI Basic Local Alignment Search Tool
(BLAST) was used to perform pairwise amino acid alignments between
the reference sequences and the new viruses.
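The bootstrap replicates mentioned above rest on resampling alignment
columns with replacement. The sketch below illustrates that resampling
step only, under the assumption that the alignment is a dictionary of
equal-length strings; it is not the MEGA X implementation, the sequence
names are hypothetical, and tree inference on each replicate is left
out.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: columns are drawn with replacement,
    and the same drawn columns are applied to every sequence."""
    n_cols = len(next(iter(alignment.values())))
    if any(len(seq) != n_cols for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    columns = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in columns)
            for name, seq in alignment.items()}


# Hypothetical toy alignment; a fixed seed makes the run repeatable.
toy_alignment = {"virus_1": "ACGTACGT",
                 "virus_2": "ACGTACGA",
                 "virus_3": "ACCTACGA"}
rng = random.Random(0)
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(1000)]
print(len(replicates), len(replicates[0]["virus_1"]))
```

In a full analysis a tree would be inferred from each replicate, and
the support for a branch would be the fraction of replicate trees that
contain it.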