vAMPirus: A versatile amplicon processing and analysis program for
studying viruses
Abstract
Amplicon sequencing is an effective and increasingly applied method for
studying viral communities in the environment. Here, we present
vAMPirus, a user-friendly, comprehensive, and versatile DNA and RNA
virus amplicon sequence analysis program, designed to support
investigators in exploring virus amplicon sequencing data and running
informed, reproducible analyses. vAMPirus accepts raw virus amplicon
libraries and, by default, performs nucleotide- and amino acid-based
analyses to produce results such as sequence abundance information,
taxonomic classifications, phylogenies, and community diversity metrics.
The vAMPirus analytical framework leverages 16 different open-source
tools and provides optional approaches that can increase the biological
signal-to-noise ratio and thereby reveal patterns that would otherwise
have been masked. We validate the vAMPirus analytical
framework and illustrate its implementation as a general virus amplicon
sequencing workflow by recapitulating findings from two previously
published double-stranded DNA virus datasets. As a case study, we also
apply the program to explore the diversity and distribution of a coral
reef-associated RNA virus. vAMPirus is streamlined within Nextflow,
offering straightforward scalability, standardization, and communication
of virus lineage-specific analyses. The vAMPirus framework is designed
to be adaptable; community-driven analytical standards will continue to
be incorporated as the field advances. vAMPirus supports researchers in
revealing patterns of virus diversity and population dynamics in nature,
while promoting study reproducibility and comparability.