Pruning the tree: comparing OTUs and ASVs in High-Throughput Sequencing
of 5S-IGS nuclear ribosomal DNA in phylogenetic studies
Abstract
Amplicon sequencing of the nuclear ribosomal 5S RNA gene arrays is
highly promising for genotaxonomy, for resolving species’ genetic
resources, and for tracing evolution. However, the huge amount of data retrieved with
this approach is difficult to manage and prone to redundancy, error, and
computational difficulties. Reducing the amount of data per sample
without losing the conveyed molecular-phylogenetic signal is therefore a
crucial step for downstream analyses. In this work, we compared
Operational Taxonomic Units (OTUs) and Amplicon Sequence Variants (ASVs)
from 5S intergenic spacer (5S-IGS) amplicons of seven beech species
(Fagus spp.) obtained with two widely used and competing bioinformatics
tools, MOTHUR and DADA2. We assessed qualitative and quantitative
differences among sample profiles obtained with the two methods and the
capacity of the derived phylogenetic inferences to capture pivotal
5S-IGS variant types. Over 70% of processed reads were shared between
OTUs and ASVs. Despite a strong reduction (>80%) of the
representative sequences, DADA2-ASVs identified all main 5S-IGS variants
known for Fagus, fully reflecting the overall genetic diversity patterns
within each sample. In contrast, large proportions of low-abundance
representative amplicons appeared in MOTHUR-OTU and -ASV profiles and
were inference-wise redundant. We conclude that differences in the
sequence variation detected by the two pipelines are minimal and provide
no exclusive phylogenetic information. DADA2 ASVs are easier to handle and may
thus efficiently replace OTUs in future 5S-IGS studies aimed at
deciphering complex bio-ecological phenomena such as hybridisation,
polyploidisation and drift, and at inferring evolutionary pathways of
species systems, especially when using increasingly large sample sets.
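
As an illustration of the two summary quantities reported above (the share of processed reads common to OTUs and ASVs, and the reduction in representative sequences), the following is a minimal sketch on toy data. It is not the pipeline used in this work: the read identifiers, the read-to-representative mappings and both function names are hypothetical, and "shared" is assumed here to mean that a read is assigned to a representative sequence by both methods.

```python
def shared_read_fraction(otu_assign, asv_assign):
    """Fraction of all processed reads that both methods assign to a representative.

    Assumption: each argument maps a read ID to the OTU/ASV it was assigned to.
    """
    all_reads = set(otu_assign) | set(asv_assign)
    shared = set(otu_assign) & set(asv_assign)
    return len(shared) / len(all_reads)


def representative_reduction(n_reference, n_reduced):
    """Relative reduction in the number of representative sequences."""
    return 1 - n_reduced / n_reference


# Toy data (hypothetical): five reads, three OTUs, two ASVs.
otu_assign = {"r1": "OTU1", "r2": "OTU1", "r3": "OTU2", "r4": "OTU3"}
asv_assign = {"r1": "ASV1", "r2": "ASV1", "r3": "ASV2", "r5": "ASV2"}

shared = shared_read_fraction(otu_assign, asv_assign)
reduction = representative_reduction(
    len(set(otu_assign.values())),  # number of distinct OTUs
    len(set(asv_assign.values())),  # number of distinct ASVs
)

print(f"shared reads: {shared:.0%}")
print(f"reduction in representatives: {reduction:.0%}")
```

In a real comparison the two mappings would be derived from the MOTHUR and DADA2 outputs for the same sample, and both quantities would be computed per sample.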