Ecological interpretations from amplicon sequencing data

Persistent challenges in linking sequences to ecology

Because amplicon sequencing targets a section of a single gene, the taxonomic resolution and the ecological insights that can be extracted from it remain limited. Taxonomic classifications depend on the reference database selected, and many databases remain incomplete because reference sequences are biased towards certain types of organisms (51). ASVs within a given study can often be matched to a taxon at the phylum rank but cannot be classified at lower (finer) taxonomic ranks. Function is not conserved at the phylum level (or even at the genus level), and therefore processes cannot be predicted and assigned to taxa from amplicon sequencing in a way that is meaningful for ecological investigations (52, 53). For example, taxa should not be classified as r-strategists merely because they belong to a phylum that soil microbiologists generally assume to contain fast-growing organisms (e.g. Proteobacteria), nor should such assumptions be used to explain processes in soil samples (Jeewani et al. 2020). It is possible, however, to conduct amplicon sequencing of functional genes when a specific functional guild is of interest.
Most research on the diversity of functional gene amplicons has been carried out with prokaryotic genes pertinent to nitrogen cycling, particularly the dinitrogenase reductase (nifH) and ammonia monooxygenase (amoA) genes (Aigle et al. 2019; Angel et al. 2018; Pjevac et al. 2017), with fungal genes being targeted more recently, at either the DNA or the RNA level (Entwistle et al. 2018; Hannula and van Veen 2016). This research has allowed unprecedented insights into the identities of functionally delimited microbial guilds in various environments, including soils. Early research on the congruence between the phylogenies of symbiotic genes (located on a symbiotic island within the genome) and housekeeping genes in symbiotic diazotrophs indicated that various processes play a role in the evolution of these microbes, with horizontal transfer of genetic information being a prominent contributor of functional traits in prokaryotes (Menna and Hungria 2011). This finding also illustrates an important problem with inferring functional information from phylogenetic barcoding genes such as the 16S rRNA gene.

Suggestions for more robust statistical analyses in sequencing studies

Data generated from amplicon sequencing are inherently compositional: they provide relative abundances, which carry no information about the total microbial load of the original sample. Analyzing compositional data sets with standard statistical techniques (including Pearson correlations or t tests on proportions) has been shown to lead to very high (up to 100%) false discovery rates (56, 57). Because microbiome data comprise thousands of individual variables, a number unprecedented in soil science, such high false positive rates mean that almost any data set will show some significant correlations with microbiome data. The ease of obtaining significant results may therefore lead to an “abuse” of statistical significance (also referred to as “p hacking”). While exploratory analysis is useful, researchers should always remember that an effect or association does not exist just because it was statistically significant and, even more importantly, that inference should be scientific and not merely statistical. In recent years, the discussion around the abuse of p-values and their importance has intensified (58–60), and some alternatives have been proposed (60), including the use of more stringent p-value thresholds for claims of new discoveries (61, 62). Clearly, the issue is much more complicated than a simple critique of the p-value and involves scientific research at all levels, including the publish or perish culture ingrained in academia; we therefore refer the reader to the citations above to explore this topic further.
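The effect of compositionality on correlations can be illustrated with a small simulation. The following is a minimal sketch under assumed, purely hypothetical parameters (three taxa, log-normal absolute abundances, 200 samples); it is not taken from the cited studies. Three taxa are drawn independently, so their absolute abundances are uncorrelated by construction, and the data are then converted to proportions as in a sequencing run.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(42)
n_samples = 200  # hypothetical number of samples

# Hypothetical (mu, sigma) of log-normal absolute abundances for three taxa:
# one dominant, highly variable taxon and two rarer, more stable taxa.
taxa_params = [(5.0, 1.0), (3.0, 0.3), (3.0, 0.3)]

# Absolute abundances: every taxon is drawn independently of the others.
absolute = [
    [rng.lognormvariate(mu, sigma) for mu, sigma in taxa_params]
    for _ in range(n_samples)
]

# Closure: divide each sample by its total, as sequencing effectively does.
relative = [[value / sum(row) for value in row] for row in absolute]

# Correlation between the two rarer taxa, before and after closure.
r_abs = pearson([row[1] for row in absolute], [row[2] for row in absolute])
r_rel = pearson([row[1] for row in relative], [row[2] for row in relative])

print(f"Pearson r on absolute abundances: {r_abs:.2f}")
print(f"Pearson r on relative abundances: {r_rel:.2f}")
```

In this setting the two rarer taxa appear strongly positively correlated once expressed as proportions, solely because both shares shrink whenever the dominant taxon blooms. A test on the proportions would report an association that does not exist in the absolute abundances.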
Nevertheless, the risk of drawing false conclusions from spurious correlations remains, and it is compounded by the variability inherent in amplicon sequencing data. When adopting a “let’s sequence and see” approach, many correlations (including false positives) will be generated. Given that exploratory research often leads to follow-up research, increasing our confidence in the initial results reduces the chance that subsequent research is built on unsubstantiated findings. Adopting a more stringent p-value threshold will reduce the false positive rate, but at the cost of more type II errors (false negatives). To adopt a more stringent threshold while maintaining statistical power, it has been shown that a 70% increase in sample size is required. We understand that this is often unrealistic, but we also recognize that it could save future efforts that would otherwise be built on unsubstantiated research. Instead, current research often focuses on expanding the depth of analyses on the same few samples at the expense of replication.
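The order of magnitude of this increase can be checked with a short calculation. The sketch below rests on assumptions that are not stated in the text: a two-sided z-test with a fixed effect size, a change of the significance threshold from 0.05 to 0.005, and a target power of 0.8. Under the normal approximation the required sample size is proportional to (z_{1-α/2} + z_{power})², so the ratio of the two sample sizes does not depend on the effect size.

```python
from statistics import NormalDist


def relative_sample_size(alpha_old: float, alpha_new: float, power: float = 0.8) -> float:
    """Ratio n_new / n_old needed to keep the same power at a stricter alpha.

    Normal approximation for a two-sided test at a fixed effect size;
    the thresholds and the power are assumptions of this sketch.
    """
    z = NormalDist().inv_cdf  # quantile function of the standard normal
    z_power = z(power)
    n_old = (z(1 - alpha_old / 2) + z_power) ** 2
    n_new = (z(1 - alpha_new / 2) + z_power) ** 2
    return n_new / n_old


ratio = relative_sample_size(alpha_old=0.05, alpha_new=0.005, power=0.8)
print(f"Required increase in sample size: {100 * (ratio - 1):.0f}%")
```

With these assumed values the calculation gives an increase of roughly 70%, in line with the figure quoted above. Other thresholds or power targets would give different numbers.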
Soil intraplot variability and the number of replicates used to analyze similarities/dissimilarities of microbial communities directly affect the ability to detect differences. To showcase how increasing sample size can increase statistical power in soil microbiome analyses, we calculated how the statistical power of permutational multivariate analysis of variance (PERMANOVA) depends on effect size for different numbers of replicates. Although the data set chosen \cite{Zheng_2019} captures a wide range of possible microbial communities, it is far from representative of all possible soil environments. Therefore, we caution the reader to interpret the data shown only as an example. We used the R package micropower \cite{Kelly_2015}, which simulates distance matrices from a set of parameters in order to estimate the achievable PERMANOVA power or the sample size necessary for a planned microbiome analysis. We used data from both the 16S rRNA gene and the ITS1 region, filtered to include only bacteria and archaea (16S) or fungi (ITS). We calculated the Jaccard similarity index (Supplementary Fig. 1a,b) and used its average and standard deviation across all samples as parameters in the micropower package to simulate OTU/ASV tables with similar properties. We also calculated the average statistical power for a range of effect sizes (ω²) for the 16S data (Fig. 3b), defined as 'Low' (0.001-0.04), 'Medium' (0.04-0.08) and 'High' (0.08-0.12). Our analysis shows that, while the number of replicates does not affect the statistical power for strong differences between microbial communities, increasing the number of replicates from 4 to 5 almost doubled the statistical power for small effect sizes ('Low') and achieved a power above 0.8 for medium effect sizes. These effects were even stronger when we doubled the number of replicates (4 to 8). Similar effects were obtained for the fungal data set (Supplementary Fig. 1c).
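The general logic of such a simulation-based power estimate can be sketched in a few lines. The example below is a simplified stand-in, not the algorithm implemented in micropower and not the analysis reported here. It simulates presence/absence communities for two groups, computes Jaccard distances, runs a one-way PERMANOVA with a permutation test, and reports the fraction of simulations that reach significance. All parameters (number of taxa, occurrence probabilities, the `shifted_fraction` knob used as a crude effect size, number of simulations and permutations) are hypothetical choices made for illustration.

```python
import random
from itertools import combinations


def jaccard_distance(a: set, b: set) -> float:
    """1 - |A & B| / |A | B| for two sets of detected taxa."""
    union = len(a | b)
    return 0.0 if union == 0 else 1.0 - len(a & b) / union


def pseudo_f(dist, groups):
    """One-way PERMANOVA pseudo-F from a square distance matrix and labels."""
    n = len(groups)
    labels = sorted(set(groups))
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for label in labels:
        idx = [i for i in range(n) if groups[i] == label]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    if ss_within == 0:
        return float("inf")
    ss_among = ss_total - ss_within
    return (ss_among / (len(labels) - 1)) / (ss_within / (n - len(labels)))


def permanova_p(dist, groups, n_perm, rng):
    """Permutation p-value: share of shuffled labelings with F >= observed F."""
    f_obs = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= f_obs - 1e-9:  # tolerance for float ties
            hits += 1
    return (hits + 1) / (n_perm + 1)


def estimate_power(n_rep, shifted_fraction, n_taxa=60, n_sim=100,
                   n_perm=99, alpha=0.05, seed=1):
    """Fraction of simulated two-group data sets with PERMANOVA p <= alpha."""
    rng = random.Random(seed)
    n_shift = round(shifted_fraction * n_taxa)
    half = n_shift // 2
    rest = [0.5] * (n_taxa - n_shift)
    # Hypothetical occurrence probabilities: shifted taxa are common in one
    # group and rare in the other; all remaining taxa behave identically.
    probs_a = [0.8] * half + [0.2] * (n_shift - half) + rest
    probs_b = [0.2] * half + [0.8] * (n_shift - half) + rest
    groups = ["A"] * n_rep + ["B"] * n_rep
    significant = 0
    for _ in range(n_sim):
        samples = [{j for j, p in enumerate(probs) if rng.random() < p}
                   for probs in [probs_a] * n_rep + [probs_b] * n_rep]
        dist = [[jaccard_distance(x, y) for y in samples] for x in samples]
        if permanova_p(dist, groups, n_perm, rng) <= alpha:
            significant += 1
    return significant / n_sim


for n_rep in (4, 5, 8):
    power = estimate_power(n_rep, shifted_fraction=0.2)
    print(f"{n_rep} replicates per group: estimated power = {power:.2f}")
```

The estimates are Monte Carlo values and depend on the seed and on the assumed community model, so they should be read only as an illustration of the procedure. With very few replicates the number of distinct label permutations is itself small, which limits the smallest attainable p-value and therefore the power.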