Sequencing results
A total of 277,021,343 merged paired-end sequencing reads passed the initial bioinformatic filtering (~88% of the total raw reads). From these filtered reads, we identified 62,764 unique ASVs. Only 12,792 of those ASVs (~20%) matched taxa in the database, but this set of ASVs with taxonomic matches accounted for 88.2% of the reads. Accordingly, non-metazoan ASVs (those without a taxonomic assignment) accounted for only ~12% of the reads.
Libraries for blanks and negatives yielded an order of magnitude fewer reads on average than the samples (SI, Fig. S3), and 30% of those reads and 85% of the associated ASVs did not match taxa in the metazoan database. In total, 3.5% of the metazoan ASVs detected across all samples were found only in the blanks and negative control samples. To eliminate the potential influence of such DNA contamination, some studies have removed all taxa detected in negative controls from their results. A more quantitative alternative is to subtract the read counts for contaminant taxa in blanks from the read counts across all samples. Despite its appeal, this subtraction approach does not account for the lack of template DNA in blanks and negative controls, which causes any DNA contamination to be heavily amplified (McKnight et al., 2019). Instead of subtracting read counts, we attempted to use the microDecon R package (McKnight et al., 2019) to subtract reads proportionately for taxa found in negative controls; however, this entirely eliminated species that were truly present in our reference DNA pool (a type-II error). To avoid introducing such false negatives, we reported these reads for our samples (SI, Fig. S4) rather than subtracting them.
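The two simpler strategies mentioned above (dropping every taxon seen in a negative control, and subtracting raw blank read counts) can be sketched as follows. This is a minimal illustration only: the taxon names and read counts are invented, the function names are hypothetical, and the code does not reproduce the proportional method implemented in microDecon.

```python
# Hypothetical read counts per taxon; all values are invented for illustration.
blank = {"taxon_A": 40, "taxon_B": 0, "taxon_C": 5}
sample = {"taxon_A": 30, "taxon_B": 900, "taxon_C": 600}


def drop_taxa_in_blank(sample_counts, blank_counts):
    """Remove every taxon that has any reads in the blank."""
    return {
        taxon: reads
        for taxon, reads in sample_counts.items()
        if blank_counts.get(taxon, 0) == 0
    }


def subtract_blank_reads(sample_counts, blank_counts):
    """Subtract raw blank read counts from sample counts, flooring at zero."""
    return {
        taxon: max(reads - blank_counts.get(taxon, 0), 0)
        for taxon, reads in sample_counts.items()
    }


dropped = drop_taxa_in_blank(sample, blank)
subtracted = subtract_blank_reads(sample, blank)

print(dropped)
print(subtracted)
```

In this toy case, both strategies leave taxon_A with no reads in the sample. Because contaminant DNA in a blank is amplified without competing template, its read count can exceed that of a taxon genuinely present in a sample, so either strategy could produce a false negative of the kind described above.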
Over 55% of the 12,792 ASVs that provided taxonomic information based on our metazoan reference database matched database sequences at the level of species or genus according to our assignment rule (see above). For ASVs that matched at least one taxon in our database, the number of significant BLAST hits ranged from 1 to 500 (mean = 123; median = 34). Of these ASVs, 23.1% were assigned to species (2,956 ASVs), 32.7% to genus (4,181 ASVs), 13.2% to family (1,685 ASVs), 7.7% to order (986 ASVs), 13.8% to class (1,769 ASVs), 0.6% to phylum (76 ASVs), and 8.9% to domain (1,139 ASVs).
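As a consistency check on the breakdown above, the per-rank percentages can be recomputed from the reported ASV counts. The sketch below uses only the counts given in the text; the variable names are illustrative.

```python
# ASV counts per assigned taxonomic rank, as reported in the text.
rank_counts = {
    "species": 2956,
    "genus": 4181,
    "family": 1685,
    "order": 986,
    "class": 1769,
    "phylum": 76,
    "domain": 1139,
}

# The counts should sum to the number of ASVs with a taxonomic match.
total = sum(rank_counts.values())

# Percentage of matched ASVs assigned at each rank, rounded to one decimal.
rank_pct = {rank: round(100 * n / total, 1) for rank, n in rank_counts.items()}

# Combined share of ASVs resolved to species or genus.
species_or_genus_pct = round(
    100 * (rank_counts["species"] + rank_counts["genus"]) / total, 1
)

print(total)
print(rank_pct)
print(species_or_genus_pct)
```

The counts sum to 12,792, and the species and genus shares together give 55.8%, in line with the "over 55%" stated above.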