Keqiang Wang

and 8 more

Integrating process-oriented (PO) and machine learning (ML) models is effective for obtaining dynamic spatial information on soil organic carbon (SOC) stocks. However, PO-ML integration, particularly at large scales, has received insufficient attention. This gap limits both our understanding of SOC dynamics and our ability to predict them. To explore the adaptability and effectiveness of PO-ML integration at a large scale, we constructed a national-scale PO-ML hybrid model. Because cropland has already been studied extensively, the PO model is not applicable to wetlands, and estimates for other natural land cover types remain uncertain, we restricted the analysis to natural land cover on non-waterlogged mineral soils (excluding wetlands and croplands). We used the PO model to extend the nationwide time series of SOC density data for this land cover over the period 2000–2014, which enlarged the ML training data, and then predicted the spatial distribution of the average SOC content during this period. The results indicated that the ML model's accuracy (R² = 0.57) aligned with the average level reported by other digital soil mapping (DSM) studies, whereas the accuracy of the PO-ML hybrid model was approximately 17% above the upper limit reported in other DSM studies. Furthermore, the study demonstrates the important role of dynamic environmental covariates in predicting SOC density: they significantly enhanced the model's ability to capture the spatiotemporal dynamics of SOC. Moreover, without the Rothamsted carbon model simulation data, the ML model exhibited higher uncertainty in the sample-scarce western regions and around latitudes 30°N–40°N, whereas the PO-ML model effectively reduced this uncertainty. These findings indicate that the hybrid modeling strategy offers significant advantages in SOC simulation and provides important insights into the nationwide spatiotemporal distribution of SOC under natural land cover on non-waterlogged mineral soils.

Liangyi Li

and 12 more

Soil contamination by heavy metals has become a significant threat to the ecological security of global agriculture, particularly in arid regions, where accurate monitoring of low-concentration heavy metals remains a technical challenge. This study proposes a proximal sensing method based on the fusion of visible-near infrared (Vis-NIR) spectroscopy and portable X-ray fluorescence (pXRF) sensors, aiming to address the limitations of single sensors in predicting low-concentration heavy metals in arid farmland. Using 116 farmland soil samples from Qapqal Xibe Autonomous County in Xinjiang, the study systematically evaluates the modeling effects of 225 spectral preprocessing combinations on predicting four heavy metals: arsenic (As), lead (Pb), copper (Cu) and cadmium (Cd). Vis-NIR spectroscopy outperformed pXRF in predicting low-concentration heavy metals, and after data fusion, Vis-NIR with 1.75-order differential preprocessing achieved the best prediction performance. In contrast, pXRF spectra were not suited to fractional order differentiation (FOD) preprocessing. Applying a different preprocessing combination to each sensor significantly improved model accuracy, particularly for As, with an R² of 0.72, LCCC of 0.76, and RPIQ of 3.27. Furthermore, the analysis of critical characteristic bands revealed that the characteristic bands of As, Pb and Cu are mainly concentrated in the low-energy region (5–16 keV) of pXRF, providing an essential spectral basis for heavy metal feature extraction. The study thus proposes sensor-specific preprocessing strategies and highlights the critical role of pXRF low-energy spectra in heavy metal prediction. The research provides a scientific basis for heavy metal monitoring and ecological risk assessment of farmland in arid areas, with practical value for improving environmental quality and the safety of agricultural products.
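For readers unfamiliar with the three accuracy measures quoted above, the following is a minimal sketch of how R², Lin's concordance correlation coefficient (LCCC) and the ratio of performance to interquartile range (RPIQ) are commonly computed. It is an illustration only, not the authors' code: the function names are hypothetical, R² is taken here as the coefficient of determination (the abstract does not say whether that or the squared correlation was used), and the quartiles use linear interpolation, which is one of several conventions.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _quantile(values, q):
    """Quantile by linear interpolation between order statistics (an assumed convention)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = _mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def lccc(observed, predicted):
    """Lin's concordance correlation coefficient (population variances)."""
    mean_obs, mean_pred = _mean(observed), _mean(predicted)
    var_obs = _mean([(o - mean_obs) ** 2 for o in observed])
    var_pred = _mean([(p - mean_pred) ** 2 for p in predicted])
    cov = _mean([(o - mean_obs) * (p - mean_pred)
                 for o, p in zip(observed, predicted)])
    return 2.0 * cov / (var_obs + var_pred + (mean_obs - mean_pred) ** 2)


def rpiq(observed, predicted):
    """Interquartile range of the observations divided by the RMSE."""
    rmse = math.sqrt(_mean([(o - p) ** 2 for o, p in zip(observed, predicted)]))
    iqr = _quantile(observed, 0.75) - _quantile(observed, 0.25)
    return iqr / rmse


if __name__ == "__main__":
    # Made-up numbers for illustration; not data from the study.
    obs = [1.0, 2.0, 3.0, 4.0, 5.0]
    pred = [2.0, 3.0, 4.0, 5.0, 6.0]
    print(r2_score(obs, pred), lccc(obs, pred), rpiq(obs, pred))
```

In this toy case the predictions follow the observations perfectly in rank but carry a constant bias of one unit. LCCC penalizes that bias where a plain correlation coefficient would not, which is one reason it is often reported alongside R² in soil spectroscopy work.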