2.3.3 Drivers of seed dispersal distances
Finally, 67 studies calculated seed dispersal distances. We used 45 of these to assess mean seed dispersal distances and 56 to assess maximum seed dispersal distances. These data came from 61 and 71 individual species, respectively (see supplementary material 2). Six studies reported median rather than mean dispersal distances; as these were a minority, we excluded them from the mean dispersal analysis. A further two studies were removed from both the mean and maximum dispersal analyses because they focused on species during their migrations (Mallard duck, Red-billed teal and Egyptian goose) and the estimated distances prevented the models from converging. As in the previous analyses, studies focussing on fish or reptiles were also omitted because the number of studies was too small.
To assess the effect of morphological traits and environmental variables on seed dispersal distances, we fitted GLMs with a log-link function, using body mass, protected area status, volant/non-volant status, biome and the study site HFI score as predictor variables. We used volant/non-volant status in the models rather than taxa because the two variables were collinear and because it provides greater functional information about the species than taxa alone. We did not include tracking method because each method has inherent limitations: for example, longer dispersal distances are often identified from GPS tags because of their remote download capabilities, while resource tracking is often limited to certain species, in particular small-sized mammals. Body mass and HFI were scaled and centred around their means to ensure that they were comparable.
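The scaling and centring step can be illustrated with a minimal sketch. It is written in Python for illustration only (the analysis itself was run in R), and the function name `standardise` and all input values are hypothetical, not data from the reviewed studies. It assumes the usual z-score convention of subtracting the mean and dividing by the sample standard deviation, which is also the default behaviour of R's `scale()`.

```python
import statistics


def standardise(values):
    """Centre values on their mean and scale by the sample SD (n - 1)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return [(v - mean) / sd for v in values]


# Hypothetical predictor values, for illustration only.
body_mass_g = [12.0, 350.0, 4200.0, 85.0, 60000.0]
hfi_score = [3.0, 11.5, 24.0, 7.2, 18.9]

# After standardising, both predictors have mean 0 and SD 1, so their
# coefficients are expressed per standard deviation and can be compared.
body_mass_z = standardise(body_mass_g)
hfi_z = standardise(hfi_score)
print(body_mass_z)
print(hfi_z)
```

Because body mass and HFI are measured on very different scales, standardising them places their model coefficients on a common footing.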
Statistical assumptions for each GLM were checked by visually inspecting residual diagnostic plots of the residuals against the model-fitted values. For each analysis, candidate link functions were compared to determine the best-fitting residual distribution model, based on AIC comparison and visual analysis of quantile-quantile plots of the residuals produced with the plot.DHARMa() function in the DHARMa package (v0.4.6; Hartig 2022). For the seed dispersal GLMs, a gamma distribution with a log link function gave the best model fit. The dredge function in the MuMIn package (v1.47.1; Barton 2022) was used to identify the optimum set of variables for each model. Neither biome nor HFI was significant in the mean or maximum dispersal models, so both were excluded from subsequent models.
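To make the gamma GLM with a log link concrete, the following is a minimal sketch of how such a model can be fitted by iteratively reweighted least squares (IRLS). It is written in Python for illustration and is not the code used in this study, which was run in R. The function names `solve` and `fit_gamma_log` and the example data are hypothetical. For the gamma family with a log link the IRLS working weights are constant, so each iteration reduces to an ordinary least-squares fit to the working response.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_gamma_log(X, y, n_iter=50, tol=1e-10):
    """Fit a gamma GLM with log link by IRLS.

    X is a list of rows whose first column is the intercept (all 1.0);
    y holds strictly positive responses such as dispersal distances.
    """
    p = len(X[0])
    beta = [math.log(sum(y) / len(y))] + [0.0] * (p - 1)
    # X'X does not change between iterations because the weights are constant.
    xtx = [[sum(r[j] * r[k] for r in X) for k in range(p)] for j in range(p)]
    for _ in range(n_iter):
        eta = [sum(b * v for b, v in zip(beta, row)) for row in X]
        mu = [math.exp(e) for e in eta]
        # Working response for the log link: z = eta + (y - mu) / mu.
        z = [e + (yi - mi) / mi for e, yi, mi in zip(eta, y, mu)]
        xtz = [sum(r[j] * zi for r, zi in zip(X, z)) for j in range(p)]
        new_beta = solve(xtx, xtz)
        change = max(abs(a - b) for a, b in zip(new_beta, beta))
        beta = new_beta
        if change < tol:
            break
    return beta


# Hypothetical data: intercept, standardised body mass, volant (1) or not (0).
X = [[1.0, -1.0, 0.0], [1.0, -0.5, 1.0], [1.0, 0.0, 0.0],
     [1.0, 0.4, 1.0], [1.0, 0.9, 0.0], [1.0, 1.2, 1.0]]
distance_m = [40.0, 150.0, 90.0, 420.0, 260.0, 900.0]
print(fit_gamma_log(X, distance_m))
```

The coefficients are on the log scale, so exponentiating a coefficient gives the multiplicative change in expected dispersal distance per unit change in that predictor.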
All analyses were performed using R Statistical Software (v4.2.2; R Core Team 2021).