Hongyi Li

and 4 more

The aim of this paper was to compare the performance of three strategies for predicting soil organic matter in the target area: global partial least squares regression (PLSR) using CSSL with and without spiking samples, memory-based learning (MBL) using CSSL with and without spiking samples, and general PLSR using only spiking samples. When using spiked subsets, we also investigated the prediction performance of the extra-weighted subsets. A series of spiking subsets was randomly drawn from the total spiking samples, which were selected from the target sites by conditioned Latin hypercube sampling (cLHS). We calculated the mean squared Euclidean distance (msd) of the different spiking subsets based only on the distribution density functions of their vis–NIR spectra and statistically inferred the optimal sampling set size to be 30. Our study showed that when the number of spiking samples was lower than 30, the prediction accuracy of global PLSR using CSSL spiked with and without extra-weighted samples was greater than that of general PLSR using only the corresponding number of spiking samples (RMSE 5.57–5.98 vs. RMSE 6.76). Global PLSR using CSSL spiked with the statistically optimal local samples can achieve higher prediction performance (with a mean RMSE of 5.75). MBL spiked with five extra-weighted optimal spiking samples achieved the best accuracy, with an RMSE of 3.98, an R² of 0.70, a bias of 0.04 and an LCCC of 0.81. The msd is a simple and effective method to determine an adequate spiking size using only vis–NIR data.
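
The accuracy statistics quoted in this abstract (RMSE, R², bias and LCCC) can be computed from paired observed and predicted values. The following is a minimal sketch using only the Python standard library, not the authors' code. The function name `regression_metrics` is hypothetical. Two conventions are assumptions, since the abstract does not state them: bias is taken as mean(predicted) minus mean(observed), and R² as the coefficient of determination rather than the squared Pearson correlation.

```python
import math


def regression_metrics(observed, predicted):
    """Return RMSE, bias, R2 and Lin's concordance correlation coefficient.

    Assumed conventions (not stated in the abstract):
    bias = mean(predicted) - mean(observed); R2 = 1 - SSE / SST.
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n

    # Sum of squared prediction errors.
    sse = sum((p - o) ** 2 for o, p in zip(observed, predicted))

    # Population (divide-by-n) variances and covariance, as in Lin's definition.
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / n

    return {
        "rmse": math.sqrt(sse / n),
        "bias": mean_p - mean_o,
        "r2": 1.0 - sse / (n * var_o),
        # LCCC penalises both scatter and systematic shift from the 1:1 line.
        "lccc": 2.0 * cov / (var_o + var_p + (mean_o - mean_p) ** 2),
    }
```

Unlike R², LCCC drops when predictions are well correlated with observations but offset from the 1:1 line, which is presumably why the abstract reports both.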

Meihua Yang

and 2 more

In situ visible and near-infrared diffuse reflectance spectroscopy (VNIR) is a rapid sensing approach that can provide analytically dense soil data reflecting multiple physical and chemical properties of soil. A total of 246 in situ soil samples were collected and scanned in 2016–2018. The dataset from 2016–2017 was used as the calibration dataset to develop the dry ground model and to derive the in situ correction matrix from the dry and in situ spectra. The dataset from 2018 was used as the validation dataset, using the in situ spectra. Four in situ correction methods, external parameter orthogonalization (EPO), direct standardization (DS), piecewise direct standardization (PDS), and generalized least squares weighting (GLSW), were used to remove the in situ effect on the spectra. In addition, two models, partial least squares regression (PLSR) and support vector machine (SVM), were used to assess the effectiveness of the prediction. The results showed that the four in situ corrections could remove the error introduced by in situ measurement to some extent. The four in situ corrections, when combined with SVM, reduced the errors caused by in situ measurements more effectively than the same corrections combined with PLSR. EPO correction outperformed the other three methods, and EPO-SVM obtained the best prediction, with the lowest RMSE (1.91 g kg⁻¹) and the highest Lin’s concordance correlation coefficient (LCCC) (0.84). We conclude that the EPO-SVM method using in situ spectra can estimate soil organic carbon in the Poyang Lake area in a rapid and minimally invasive manner.
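
The idea behind EPO can be illustrated with a small sketch: take the difference between paired in situ and dry ground spectra, find the dominant direction of that difference, and project it out of every spectrum before modelling. The code below is a simplified illustration under stated assumptions, not the authors' implementation. It removes a single component, whereas the abstract does not say how many were used. It relies on power iteration so that it runs on the standard library alone; in practice a singular value decomposition would normally be used. All function names are hypothetical.

```python
import math


def difference_rows(in_situ, dry):
    """Difference spectra D = in situ minus dry ground, for paired samples."""
    return [[a - b for a, b in zip(row_i, row_d)]
            for row_i, row_d in zip(in_situ, dry)]


def leading_direction(diff_rows, iterations=200):
    """Unit vector along the dominant direction of D (power iteration on D^T D)."""
    # Start from the largest difference spectrum so the start vector is non-zero.
    start = max(diff_rows, key=lambda row: sum(x * x for x in row))
    norm = math.sqrt(sum(x * x for x in start))
    if norm == 0.0:
        return [0.0] * len(start)  # no in situ effect to remove
    v = [x / norm for x in start]
    for _ in range(iterations):
        scores = [sum(d * c for d, c in zip(row, v)) for row in diff_rows]  # D v
        w = [sum(s * row[j] for s, row in zip(scores, diff_rows))           # D^T (D v)
             for j in range(len(v))]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def epo_project(spectra, direction):
    """Apply P = I - v v^T to each spectrum, i.e. x - (x . v) v."""
    projected = []
    for row in spectra:
        score = sum(x * c for x, c in zip(row, direction))
        projected.append([x - score * c for x, c in zip(row, direction)])
    return projected
```

The same projection would be applied to the dry ground calibration spectra and to new in situ spectra, so that the regression model (PLSR or SVM in the abstract) is trained and used in the subspace orthogonal to the estimated in situ effect.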

Hongyi Li

and 7 more

The upper Tarim River basin supports about 50 million people with meltwater from glaciers and snow, which are highly vulnerable and sensitive to climate change. Therefore, assessing the relative effects of climate change on runoff in this region is essential not only for understanding the mechanism of hydrological response over the mountainous areas in Southern Xinjiang but also for local water resources management. This study quantitatively investigated climate change in the mountainous area of the upper Tarim River basin, using the up-to-date ‘ground-truth’ APHRODITE precipitation and temperature data (1961–2010, 0.25°); analyzed the potential connections between runoff data observed at Alar station and the key climatological variables; and discussed regression models for simulating runoff from precipitation and temperature data. The main findings of this study are: (1) annual precipitation and temperature both generally increase, at rates of 0.85 mm/year and 0.25 °C/10a, respectively, while the runoff data measured at Alar station show a fluctuating decreasing trend; (2) there are significant spatial differences in the temporal trends of precipitation, for example, the larger increasing rates of precipitation occur in the Karakoram Mountains, while the larger decreasing rates occur in the northwest of Kashgar County; (3) the decreasing trends of temperature mainly occur in Kashgar County and its surrounding areas in summer; (4) seasonal correlations in trends of precipitation and temperature are more significant than those at monthly and annual scales; and (5) the regression model based on Radial Basis Function (RBF) simulates runoff in the upper Tarim River basin better than the model based on the least-squares method, with the predicted values from RBF models significantly better (correlation coefficient, CC, ~0.85) than those from least-squares models (CC ~0.75).
These findings will provide valuable information for environmental scientists and planners on climate change issues in the upper Tarim River basin of Southern Xinjiang, China, under a semiarid-arid cl