Heavy-metal contamination of soil threatens the ecological security of agriculture worldwide, and accurate monitoring of low-concentration heavy metals remains technically difficult, particularly in arid regions. This study proposes a proximal sensing method that fuses visible-near infrared (Vis-NIR) spectroscopy with portable X-ray fluorescence (pXRF) spectrometry to overcome the limitations of single-sensor prediction of low-concentration heavy metals in arid farmland. Using 116 farmland soil samples from Qapqal Xibe Autonomous County, Xinjiang, the study systematically evaluates how 225 combinations of spectral preprocessing affect models predicting four heavy metals: arsenic (As), lead (Pb), copper (Cu) and cadmium (Cd). Vis-NIR spectroscopy outperformed pXRF in predicting low-concentration heavy metals, and after data fusion, Vis-NIR spectra preprocessed with 1.75-order differentiation gave the best prediction performance. In contrast, fractional-order differentiation (FOD) preprocessing proved unsuitable for pXRF spectra. Using differentiated spectral preprocessing combinations significantly improved model accuracy, particularly for As, which reached a coefficient of determination (R²) of 0.72, a Lin's concordance correlation coefficient (LCCC) of 0.76, and a ratio of performance to interquartile range (RPIQ) of 3.27. Analysis of the critical characteristic bands further showed that those of As, Pb and Cu are concentrated mainly in the low-energy region (5-16 keV) of the pXRF spectrum, providing an essential spectral basis for heavy metal feature extraction. The study thus introduces differentiated preprocessing strategies and highlights the critical role of the low-energy region of pXRF spectra in heavy metal prediction. The results provide a scientific basis for heavy metal monitoring and ecological risk assessment of arid-region farmland, and can support improved environmental quality and safer agricultural products.
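For readers unfamiliar with the three accuracy metrics reported for As, the following minimal sketch shows how they are commonly computed from observed and predicted concentrations. It assumes the standard definitions (R² as one minus the ratio of residual to total sum of squares, Lin's concordance correlation coefficient with population variances, and RPIQ as the interquartile range of the observations divided by the RMSE); the abstract does not state which formulas or software the authors used, and the function names and sample values are purely illustrative.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def lccc(y_true, y_pred):
    """Lin's concordance correlation coefficient (population moments)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    cov = np.mean((y_true - mean_t) * (y_pred - mean_p))
    return 2.0 * cov / (y_true.var() + y_pred.var() + (mean_t - mean_p) ** 2)


def rpiq(y_true, y_pred):
    """Ratio of performance to interquartile range: IQR(observed) / RMSE."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    q1, q3 = np.percentile(y_true, [25, 75])
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return (q3 - q1) / rmse


# Hypothetical observed and predicted values, NOT data from the study.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]

metrics = {
    "R2": r2_score(observed, predicted),
    "LCCC": lccc(observed, predicted),
    "RPIQ": rpiq(observed, predicted),
}
print(metrics)
```

Unlike R², LCCC penalizes systematic bias as well as scatter, and RPIQ scales the error by the interquartile range rather than the standard deviation, which makes it less sensitive to the skewed distributions typical of soil heavy metal concentrations.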