Statistical analyses
We fit statistical models to assess how 1) richness, 2) dispersion, and 3) turnover change in response to the disturbance relative to undisturbed control samples immediately after disturbance (i.e., before-after analyses), and through time. Unless otherwise specified, we performed all analyses at the ASV level. Before-after analyses compared data from prior to the disturbance to samples taken < 4 days post disturbance, and models included before-after and environment (i.e, aquatic, mammal, soil) as categorical covariates that were allowed to interact. A list of the studies excluded from before-after analyses due to lack of samples is included in Table S1. We assessed diversity change through time with models fit to data from the first 50 days post-disturbance only (i.e., pre-disturbance samples were not included, with the exception of the turnover analysis that quantified the compositional change between pre-disturbance and post-disturbance assemblages). We fit time series models with time (in days) as a continuous covariate; time was centered by subtracting the mean duration from all observations prior to modelling and allowed to interact with the environment. We fit all models with the same, hierarchical grouping (or random-effects) structure. To account for methodological variation between studies, we included varying intercepts in all models; and, because many studies included more than one disturbance type (e.g., (Džunkováet al. 2016), we included varying slopes and intercepts for time series within studies (i.e., one time series per disturbance type).
We assumed a negative-binomial error distribution and a log-link function for models fit to species richness calculations (i.e., the before-after and time series models). In addition to the parameters and the grouping structure described above, the shape parameter of the negative-binomial distribution (that estimates aggregation) was also allowed to vary among studies. Models fit to the z-transformed compositional change metrics (i.e., Bray-Curtis) assumed Gaussian error distributions and identity-link functions. In addition to the parameters and grouping structure described above, we also modelled residual variation as a function of environment to account for heteroscedasticity, with additional random variation among studies also modelled.
For Bayesian inference and estimates of uncertainty, we fit models using the Hamiltonian Monte Carlo (HMC) sampler Stan (Carpenteret al. 2017), which were coded using the ‘brms’ package (Bürkner 2017). We used weakly regularising priors, and visual inspection of the HMC chains showed excellent convergence. All code for statistical analyses is available at https://github.com/sablowes/microbiome-disturbance.
RESULTS
Our dataset included 2588 samples in 86 time series from 29 studies (Table S1) belonging to soil micro- and mesocosms (n=49), seawater mesocosms (n=16), and mammalian microbiomes (n=21) that were sampled multiple times within 50 days after disturbance (Figure 2a). Across all samples, we detected 56,480 ASVs. Of the 824 families in our dataset, 85.6% were detected in seawater, mammalian, and soil microbiomes (Figure S3). Coverage was highest in mammalian microbiomes (0.98±0.02; mean ± sd), lowest and most variable in soil microbiomes (0.93±0.06), and was significantly different between environments (ANOVA, F=475.1, p<0.001, Figure 2b).