Statistical analysis
We addressed the study questions with several analyses, each based on a different level of the dataset depending on data availability. The biodiversity dataset contains all macrophyte recordings of the 28 selected lakes (274 mapped transects in 100 field campaigns, where a field campaign is the mapping of one lake in one year). As complete information is not available for all mapped lakes and years, we compiled two subsets of the biodiversity dataset. The environmental & biodiversity dataset comprises all macrophyte recordings for which all abiotic data (see Table 1) were available; it includes 12 lakes, 27 field campaigns and 147 transects. For the biodiversity time series dataset, we selected all lakes for which repeated mappings for at least 3 years were available. This condition was fulfilled for 17 lakes mapped in 73 field campaigns along 194 transects. The analyses for each research question are described below.
For the first question, concerning the general depth distribution pattern, we used the richness components, including the different DDG measures, and determined pattern types. As general DDG curves, we plotted the mean and standard deviation of alpha, beta and gamma richness for each depth (Question 1.1). To test for significant differences in richness across depths, we performed simultaneous tests for linear models with multiple comparisons of means using Tukey contrasts, which are robust under non-normality, heteroscedasticity and variable sample size (Herberich et al. 2010). Furthermore, we plotted the DDG peaks (DDG measures) for alpha, beta and gamma richness and determined the corresponding regression line by fitting a linear model. We classified the DDG curves of all three richness measures into four pattern types depending on the depth of the richness curve maximum (Dmax): decreasing (Dmax > -1 m), shallow hump-shaped (Dmax between -1 and -2 m), deep hump-shaped (Dmax between -2 and -4 m) and increasing (Dmax < -4 m) (Fig. 1d). To determine the correlations between the different diversity components (Question 1.2), we performed Pearson correlation tests between the depth-dependent richness components. Furthermore, we tested for correlations between DDG measures across the different richness components. We used a Chi-square test to assess associations between pattern types and biodiversity components.
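The pattern-type classification described above can be illustrated with a minimal Python sketch. It is not the code used in the study (the analyses were run in R), and the function name `classify_ddg_pattern` is hypothetical. The text does not state how values lying exactly on a class boundary (-1, -2 or -4 m) are assigned, so the handling of these boundaries below is an assumption.

```python
def classify_ddg_pattern(d_max: float) -> str:
    """Classify a DDG curve by the depth of its richness maximum.

    d_max is given in metres, with negative values below the water surface.
    Assumption: boundary values (-1, -2, -4 m) fall into the class whose
    "between" range they close; the source text does not specify this.
    """
    if d_max > -1.0:
        return "decreasing"
    if d_max >= -2.0:
        return "shallow hump-shaped"
    if d_max >= -4.0:
        return "deep hump-shaped"
    return "increasing"


# Example with hypothetical peak depths for four curves
for depth in (-0.5, -1.5, -3.0, -5.5):
    print(depth, classify_ddg_pattern(depth))
```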
For the second question, concerning the drivers of the diversity depth gradient, we analysed the influence of abiotic data on the DDG using the environmental & biodiversity dataset. We log-transformed the abiotic and biotic data. To show that the diversity metrics of the environmental & biodiversity dataset are representative of those of the biodiversity dataset, we applied the PERMANOVA test adonis2 from the R package 'vegan', which compares centroids and variance (Oksanen et al. 2019). A non-significant result (p > 0.05) was taken to indicate that the centroids and variance of the two groups do not differ (Supporting information). To identify the factors driving the richness peaks, we used Generalized Additive Mixed-Effect Models (GAMMs), computed with the R package 'gamm4' (Wood 2011). D(α,β,γ,max) and R(α,β,γ,max) were used as response variables and lake as a random effect. To reduce the high correlations between abiotic factors (Pearson correlation test), we performed a Principal Component Analysis (PCA) and named each main axis (>80% variance) after the corresponding abiotic factor whenever the axis encompassed more than 40% of the variation of a variable. We used the loadings of the main PCA axes (>80% variance) as explanatory variables for the GAMMs. We constructed a full model with all PCA axes and then excluded the least significant terms stepwise until we obtained a minimal model (Wood 2008).
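The selection of main PCA axes can be sketched as follows. This is a Python illustration, not the R code of the study. It assumes that ">80% variance" means the smallest set of leading axes whose cumulative share of explained variance exceeds 80%, which is one possible reading of the text. The function name and the example values are hypothetical.

```python
def n_main_axes(explained_ratio, threshold=0.80):
    """Return how many leading PCA axes are needed to exceed `threshold`.

    explained_ratio: share of total variance per axis, ordered from the
    first axis onwards (values sum to 1).
    Assumption: the 80% criterion is applied to the cumulative variance.
    """
    cumulative = 0.0
    for n_axes, ratio in enumerate(explained_ratio, start=1):
        cumulative += ratio
        if cumulative > threshold:
            return n_axes
    # Threshold never exceeded: keep all axes
    return len(explained_ratio)


# Hypothetical variance shares of four axes
shares = [0.5, 0.25, 0.125, 0.125]
print(n_main_axes(shares))
```

With these hypothetical shares, the first two axes reach only 75% of the variance, so a third axis would be retained.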
To answer the third question, about the temporal change of the depth diversity gradient, we used the biodiversity time series dataset. First, we calculated the Invariability Coefficient (IC) as the inverse of the Coefficient of Variation (CV):
\begin{equation} \text{IC}=\frac{1}{\text{CV}}=\frac{1}{\text{sd}/\text{mean}}=\frac{\text{mean}}{\text{sd}}\nonumber \end{equation}
The IC is a statistical tool to evaluate the degree of invariability that also allows comparison of datasets with different means (Question 3.1). To check for temporal trends, we built simple linear regression models with depth-independent gamma richness and the DDG measures, D(α,β,γ,max) and R(α,β,γ,max), as response variables and time as the explanatory variable for (a) the complete dataset and (b) each individual lake. We identified all models that showed significant linear trends (p < 0.1) and characterized the direction of their slopes (Question 3.2).
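The two temporal measures can be illustrated with a short Python sketch. It is not the code used in the study. The function names are hypothetical, the sample standard deviation (n - 1) is assumed because the text does not specify the estimator, and the example values are invented. The sketch returns the slope and its t statistic only; the p-value for the p < 0.1 criterion would come from a t distribution with n - 2 degrees of freedom in a statistics package.

```python
import math
import statistics


def invariability_coefficient(values):
    """IC = mean / sd, the inverse of the coefficient of variation.

    Assumption: sd is the sample standard deviation (n - 1).
    """
    sd = statistics.stdev(values)
    if sd == 0:
        # Assumption: a constant series is treated as perfectly invariable
        return math.inf
    return statistics.mean(values) / sd


def linear_trend(years, values):
    """Ordinary least squares slope of values over years and its t statistic.

    Requires at least three observations (n - 2 degrees of freedom).
    """
    n = len(years)
    mean_x = statistics.mean(years)
    mean_y = statistics.mean(values)
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, values))
    se = math.sqrt(rss / (n - 2) / sxx)
    if se == 0:
        # Perfect fit: the t statistic is unbounded unless the slope is zero
        t_stat = 0.0 if slope == 0 else math.copysign(math.inf, slope)
    else:
        t_stat = slope / se
    return slope, t_stat


# Hypothetical gamma richness of one lake over four field campaigns
years = [2000, 2001, 2002, 2003]
richness = [10, 12, 11, 14]
slope, t_stat = linear_trend(years, richness)
direction = "increasing" if slope > 0 else "decreasing" if slope < 0 else "flat"
print(invariability_coefficient(richness), slope, t_stat, direction)
```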