Data analysis
Generalized linear models (GLMs) were used to assess how environmental variables predict the current distribution of pine trees in western Fennoscandia. The spatial extent of the models covers Norway, Sweden and northern Finland, while the temporal extent is the year 2023 for the current predictions and 2073 for the forecast. Three independent GLMs with a binomial error distribution and a logit link function were used to model the current pine distribution (presence vs. absence) as a response to minimum temperature and precipitation, both included as fixed factors and sourced from the CMIP5 data. Each GLM was fitted to one of three distinct datasets (PLOT, NFI, ART). An interaction between temperature and precipitation was also tested, but it was omitted from this study because it did not improve the fit of the model.

Model fit was evaluated by comparing Akaike information criterion (AIC) values. Model diagnostics consisted of assessing the spread of the residuals and plotting a receiver operating characteristic (ROC) curve, both widely used tools in species distribution modelling (SDM). The area under the ROC curve (AUC) was also extracted; it indicates how well the model discriminates between presences and absences. Expressed as a percentage, the AUC ranges from 0 to 100, where 50 corresponds to discrimination no better than random and 100 to perfect discrimination. A p-value < 0.05 was used as the threshold for a significant relationship between response and predictors. The AUC and the coefficient of determination were used to rank the models fitted to data from the distinct collection methods.

The forecast of the pine distribution was made by projecting the three fitted GLMs onto a map of Fennoscandia using the future environmental data sourced from CMIP5. All statistical analyses were conducted in “R 4.0.2”. Maps and figures were produced in “R” with the packages “ggplot2” version 3.3.5 and “ggmap” version 3.0.0.903.
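The two evaluation measures described above, AIC and AUC, can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the code used in the study, which was run in R. The function names `auc` and `binomial_aic` and the toy presence/absence data are hypothetical. The parameter count of three assumes an intercept plus the two fixed effects (minimum temperature and precipitation) with no interaction term.

```python
import math


def auc(y_true, y_score):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    Equals the probability that a randomly chosen presence receives a higher
    predicted probability than a randomly chosen absence (ties count 0.5).
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def binomial_aic(y_true, y_prob, n_params):
    """AIC = 2k - 2 log L for a Bernoulli (binomial, n = 1) likelihood."""
    log_lik = sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(y_true, y_prob)
    )
    return 2 * n_params - 2 * log_lik


# Hypothetical toy data: observed presence (1) / absence (0) and the
# probabilities a fitted model might assign to each observation.
observed = [1, 1, 1, 0, 1, 0, 0, 0]
fitted = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

auc_value = auc(observed, fitted)
# Assumed parameter count: intercept + temperature + precipitation.
aic_value = binomial_aic(observed, fitted, n_params=3)

print(f"AUC = {auc_value:.4f} ({100 * auc_value:.2f} on a 0-100 scale)")
print(f"AIC = {aic_value:.2f}")
```

In this toy case 15 of the 16 presence-absence pairs are ordered correctly, so the AUC is 0.9375, or 93.75 on the 0 to 100 scale. When comparing candidate models fitted to the same data, the one with the lower AIC is preferred.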