4. Discussion
Over the last 20 years, CONAFOR has invested significant time and
resources to produce forest inventory data that accurately represents
all forest ecosystems in Mexico. To further expand the utility of this
data, we developed an analytical framework to model, predict, and map
forest structural attributes (tree density and height) across the
country. Using open-access remotely sensed data (e.g., mean land
surface temperature, LAI, NPP, FPAR) (Gorelick et al., 2017), the
ensemble machine learning method in the LANDMAP package v0.0.14 for R
v4.1.0 (Hengl et al., 2021; RStudio Team, 2021), and the INFyS data
(CONAFOR, 2017), we modeled and predicted tree height and tree density
across Mexico. Results suggest that the ensemble ML algorithm performed
better when predicting tree height than tree density (Table 1). In
addition to providing numerical estimates, these maps are user-friendly
tools that help visualize forest structure across Mexico.
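The stacking idea behind an ensemble ("Super Learner"-style) model can be illustrated with a minimal sketch. This is not the LANDMAP implementation used in the study: it is written in pure Python, uses two deliberately simple base learners, and runs on synthetic data. All function names, the single LAI covariate, and the simulated height values are hypothetical and serve only to show how cross-validated predictions from base learners are combined by a meta-learner.

```python
import random

def fit_mean(xs, ys):
    """Base learner 1: always predicts the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_linear(xs, ys):
    """Base learner 2: ordinary least squares on a single covariate."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)

def out_of_fold(fit, xs, ys, k=5):
    """Cross-validated predictions: each point is predicted by a model
    that did not see it during training."""
    n = len(xs)
    preds = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in range(n):
            if i % k == fold:
                preds[i] = model(xs[i])
    return preds

def stack_weight(p1, p2, ys):
    """Meta-learner: weight w in [0, 1] that minimises the squared error
    of w * p1 + (1 - w) * p2 against the observations."""
    num = sum((a - b) * (y - b) for a, b, y in zip(p1, p2, ys))
    den = sum((a - b) ** 2 for a, b in zip(p1, p2))
    w = num / den if den else 0.5
    return min(1.0, max(0.0, w))

# Synthetic example (hypothetical values, not INFyS data).
random.seed(0)
lai = [random.uniform(0, 6) for _ in range(200)]
height = [3.0 + 1.2 * v + random.gauss(0, 1) for v in lai]

w = stack_weight(out_of_fold(fit_linear, lai, height),
                 out_of_fold(fit_mean, lai, height),
                 height)

# Final ensemble: base learners refitted on all data, blended with w.
m_lin, m_mean = fit_linear(lai, height), fit_mean(lai, height)

def ensemble(x):
    return w * m_lin(x) + (1 - w) * m_mean(x)
```

In this toy setting the linear learner should receive most of the weight, because the simulated height depends linearly on the covariate. A real application would use several covariates and more flexible base learners.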
Mapping forest attributes along with associated uncertainties at a
national scale requires substantial computational resources. We
simplified our approach by modeling at a 1000-m resolution and reducing
the number of model predictors, which lowered computing costs while
still yielding valuable nationwide maps for biodiversity studies and
ecological applications. Nevertheless, previous studies have shown that
high-resolution satellite data (e.g., 30 m) can increase predictive
ability (Hengl et al., 2021). For the project's next stage, it will be
important to acquire sufficient computational resources to perform
predictions with high-resolution covariates. Both tree height and
density had strong univariate correlations with remotely sensed
predictors such as canopy cover, FPAR, and LAI. Previous studies have
shown that using more than one vegetation trait as a model predictor can
reduce prediction uncertainty when mapping forest attributes (Saarela et
al., 2020). These results indicate the direction of the relationships
between the modeled attributes and their environment and strengthen the
case for monitoring forest change through time.
The range of mean predicted values for tree height was consistent with
forest inventory data (~5-10 m). These results suggest that predictions
using the Super Learner model reflected the input data
adequately. On average, cloud mountain forest is the forest type with
the tallest trees in Mexico (Table 1). This forest type occurs in
humid, temperate areas; it has the largest aboveground biomass density
and the greatest timber volume of all Mexican forest types, but it
accounts for only ~1% of the national forest area (Villaseñor & Gual,
2014). According to CONAFOR (2017), more than half of its vegetation is
in early stages of succession, with high densities of young and smaller
trees. Maps of tree height can therefore indicate areas that deserve
more attention, such as cloud mountain forests where forest goods are
widely exploited. Estimates of tree height are also critical for
evaluating forest structure (e.g., successional stages) and projecting
the growth trajectories of Mexican forests under different management
scenarios.
Mean predicted tree density values were smaller than the field-sampled
inventory data (Table 1). Globally, 42.8% of the planet’s trees exist
in tropical and subtropical regions (Crowther et al., 2015). In general,
warm temperatures and moisture availability provide optimal conditions
for tree growth (Leathwick & Austin, 2001). Consistent with this,
tropical forests, which develop in warm and moist environments, have the
highest tree density of all Mexican forest types (maximum values of
~1370 trees/ha). The highest forest densities occur in the Calakmul
rainforest, located within the Yucatán Peninsula in southeastern Mexico
(Fig 5a). The Calakmul rainforest is part of an important ecological
gradient, the Mesoamerican Biological Corridor. Conserving this
ecologically important region has been a challenge due to continuous
forest disturbances. Tree density has been used as an indicator of
forest degradation in tropical ecosystems (Román-Dañobeytia et al.,
2014); we therefore encourage the long-term monitoring of tropical
forest structure.
For both target variables, uncertainty in our predictions was below 50%
in most forests. Our uncertainty maps also show areas where the model
performs poorly, especially northern areas dominated by arid and
semi-arid ecosystems (>80% uncertainty). These ecosystems have fewer
sampling plots, which leaves less training data for modeling a
considerably large area of Mexico. The diversity of Mexican forests and
limited land access pose a logistical challenge for the forest
inventory, which leads to under-representation of specific forest areas.
One potential use of our uncertainty maps is for the INFyS to identify
areas that require more sampling plots (e.g., arid and semi-arid
ecosystems) and to select new sampling locations in areas with poor
modeling accuracy (i.e., high uncertainty).
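The use of uncertainty maps to prioritize new sampling locations can be sketched as a simple thresholding step. The sketch below assumes that uncertainty is expressed as the prediction standard deviation relative to the predicted mean, in percent; this section does not state the definition, so it is an assumption made for illustration. The cell identifiers and values are hypothetical, and only the 80% threshold comes from the text.

```python
def relative_uncertainty(pred_mean, pred_sd):
    """Assumed definition: prediction SD as a percentage of the predicted mean."""
    if pred_mean <= 0:
        return float("inf")
    return 100.0 * pred_sd / pred_mean

def flag_cells(cells, threshold=80.0):
    """Return ids of map cells whose relative uncertainty exceeds the threshold."""
    return [cell_id for cell_id, mean, sd in cells
            if relative_uncertainty(mean, sd) > threshold]

# Hypothetical map cells: (id, predicted tree height in m, prediction SD in m).
cells = [
    ("humid_A", 12.0, 3.0),
    ("dry_B", 6.0, 2.7),
    ("arid_C", 2.0, 1.8),
    ("arid_D", 1.5, 1.5),
]

# Cells above the 80% threshold are candidates for additional sampling plots.
candidates = flag_cells(cells)
```

With these illustrative values, only the two arid cells exceed the threshold. In practice the same rule would be applied to every pixel of the uncertainty raster.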
Data from this study was managed under the FAIR principles (Findability,
Accessibility, Interoperability, and Reusability) for scientific data
management by setting up an open-access online data repository available
at the Environmental Data Initiative (EDI):
https://doi.org/10.6073/pasta/4620375aea631ab6a09cb573c7bf8aff.
Well-documented methods, FAIR research protocols, and good documentation
of forest inventory data for all users can help advance the science and
policy relevant to forestry research and management.
We encourage continuous improvement of the study design presented here
in order to improve the accuracy of predictions. For instance, we
suggest acquiring remote sensing data at a higher resolution, increasing
computational capacity, assessing new spatial prediction models, and
locating new sampling sites in ecosystems with poor map quality
indicators (e.g., R², RMSE) or uncertainties >80%. Finally, this
methodological framework can be applied to map other forest attributes,
such as aboveground biomass (AGB), soil and vegetation carbon storage,
and their associated functional traits, which would further the
understanding of Mexican forest ecosystems. To achieve this, it is
important to continue active forest inventory campaigns that facilitate
the estimation of forest structure patterns through time.
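The map quality indicators mentioned above (R² and RMSE) can be computed from paired observed and predicted values. The sketch below assumes R² is the coefficient of determination (1 minus the ratio of residual to total sum of squares), which may differ from the exact formulation used in this study. The observed and predicted values are hypothetical.

```python
import math

def rmse(obs, pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def r_squared(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

# Hypothetical tree heights (m) at four validation plots.
observed = [5.0, 7.0, 9.0, 11.0]
predicted = [6.0, 7.0, 8.0, 11.0]

plot_rmse = rmse(observed, predicted)
plot_r2 = r_squared(observed, predicted)
```

Computing both indicators per forest ecosystem, rather than nationally, would show where additional sampling sites are most needed.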
Here we developed a methodological framework for the spatial prediction
of forest attributes, which supports the understanding of forest
structure and expands institutional and technical capabilities for data
analysis within the National Forestry Commission of Mexico. Out of ten
forest ecosystems, our analyses show that the best predictive
performance when mapping tree height was in tropical dry forest and
broadleaf forest (the model explained ~50% of the variance). The best
predictive performance when mapping tree density was in tropical forest
(the model explained ~30% of the variance). For both target variables,
uncertainties in our predictions were below 50% in most forests.
Our results suggest that an ensemble learning framework can be
successfully used for the spatial prediction of forest attributes and
can likely be improved by having a larger number of field observations
and sufficient model predictors that reflect the environment of each
forest ecosystem. To ensure best practices for forest management in
Mexico, it is important that governmental and academic institutions work
together to develop such approaches, a strategy that helps improve the
quality and transparency of forestry datasets.