The intraplot variability and the number of replicates used to analyze similarities and dissimilarities among microbial communities directly affect the ability to detect differences. To explore how increasing sample size increases statistical power in soil microbiome analyses, we calculated how the statistical power of permutational multivariate analysis of variance (PERMANOVA) depends on effect size for different numbers of replicates. Although the chosen data set \cite{Zheng_2019} captures a wide range of possible microbial communities, it may not be representative of all soil environments. We therefore encourage the reader to interpret the data shown only as an example. We used the R package micropower \cite{Kelly_2015}, which simulates distance matrices from a set of parameters in order to estimate the available PERMANOVA power or the sample size necessary for a planned microbiome analysis. We used data from both the 16S rRNA gene and the ITS1 region, filtered to include only bacteria and archaea (16S) and fungi (ITS). We calculated the Jaccard similarity index (Supplementary Fig. 1a,b) and used its average and standard deviation across all samples as parameters in the micropower package to simulate OTU/ASV tables with similar properties. We also calculated the average statistical power for a range of effect sizes ($\omega^2$) for the 16S data, defined as 'Low' (0.001-0.04), 'Medium' (0.04-0.08) and 'High' (0.08-0.12). Our analysis indicates that for strong differences between microbial communities ('High'), the number of replicates does not affect the statistical power. By increasing the number of replicates from 4 to 5, we were able to almost double the statistical power for small effect sizes ('Low') and achieve a power above 0.8 for medium effect sizes (Figure 4a). These effects were more pronounced when the number of replicates was doubled (4 to 8; Figure 4b). Similar effects were obtained for the fungal data set (Supplementary Fig. 1c).
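To make the logic of a simulation-based PERMANOVA power estimate concrete, the following Python sketch may be helpful. It is not the micropower algorithm and does not reproduce our analysis: the community generator (`simulate_samples`, with the hypothetical parameters `n_taxa`, `p_present` and `shift`) is a toy assumption introduced purely for illustration, and all function names are our own. The sketch only shows the general recipe: simulate presence/absence communities, compute Jaccard distances, obtain the PERMANOVA pseudo-F and $\omega^2$ from the distance matrix, and count how often a label-permutation test rejects the null hypothesis.

```python
import random
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two sets of observed taxa."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def distance_matrix(samples):
    """Symmetric matrix of pairwise Jaccard distances."""
    n = len(samples)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = jaccard_distance(samples[i], samples[j])
    return d


def permanova_stats(d, groups):
    """Return (pseudo-F, omega^2) for a one-way design with a groups, N samples."""
    n = len(groups)
    labels = sorted(set(groups))
    a = len(labels)
    # Sums of squares computed from squared pairwise distances.
    ss_total = sum(d[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in labels:
        idx = [i for i in range(n) if groups[i] == g]
        ss_within += sum(d[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    ms_within = ss_within / (n - a)
    f = (ss_among / (a - 1)) / ms_within if ms_within > 0 else float("inf")
    denom = ss_total + ms_within
    omega2 = (ss_among - (a - 1) * ms_within) / denom if denom > 0 else 0.0
    return f, omega2


def permanova_pvalue(d, groups, n_perm, rng):
    """Permutation p-value: share of label shuffles with F >= observed F."""
    f_obs, _ = permanova_stats(d, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        f_perm, _ = permanova_stats(d, shuffled)
        hits += f_perm >= f_obs - 1e-12  # tolerance for floating-point ties
    return (hits + 1) / (n_perm + 1)


def simulate_samples(n_per_group, n_taxa, p_present, shift, rng):
    """Toy generator (assumption): group 1 has a raised presence
    probability (p_present + shift) for the first half of the taxa."""
    samples, groups = [], []
    for g in (0, 1):
        for _ in range(n_per_group):
            taxa = set()
            for t in range(n_taxa):
                shifted = g == 1 and t < n_taxa // 2
                if rng.random() < p_present + (shift if shifted else 0.0):
                    taxa.add(t)
            samples.append(taxa)
            groups.append(g)
    return samples, groups


def estimate_power(n_per_group, shift, n_sim=50, n_perm=99, alpha=0.05, seed=1):
    """Fraction of simulated data sets in which PERMANOVA gives p < alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        samples, groups = simulate_samples(n_per_group, 40, 0.3, shift, rng)
        p = permanova_pvalue(distance_matrix(samples), groups, n_perm, rng)
        rejections += p < alpha
    return rejections / n_sim


if __name__ == "__main__":
    # Compare estimated power for 4, 5 and 8 replicates per group.
    for reps in (4, 5, 8):
        print(reps, estimate_power(reps, shift=0.15))
```

In this toy setting, `shift` plays the role of the effect size and `n_per_group` the number of replicates; the simulation settings (`n_sim`, `n_perm`) are kept small for speed and would need to be increased for stable estimates.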
% Hannes: Link back to spatial and temporal sections
Suggestions for more robust statistical analysis of time series have been discussed by Coenen et al. (2020) and others (24 other REFs).