Guoqiang Tang

AI-based model emulators have emerged as a pragmatic strategy for calibrating Earth System models or their components (e.g., land, atmosphere, ocean), circumventing the computational expense of process-heavy models, which was previously an insurmountable hurdle. Such emulators, however, require large, spatially diverse training datasets. In the land/hydrology context, this requirement contrasts with traditional parameter estimation approaches, which optimize model performance for individual basins and then regionalize parameters through similarity-based transfer schemes. Direct calibration of land/hydrology process models typically performs worse when applied jointly to large collections of basins than when basins are calibrated individually. Building on insights from large-sample deep learning hydrologic modeling, this study introduces a Large-Sample Emulator (LSE) approach that unifies and streamlines process model parameter calibration and regionalization. Tested across 627 basins in the continental United States using the Community Terrestrial Systems Model (CTSM), the LSE approach improves runoff predictions in all basins and outperforms the Single-Site Emulator (SSE) in both single-objective and multi-objective calibration tasks. Moreover, LSE-based regionalization to unseen basins, evaluated through spatial cross-validation, achieves better results than the default parameters in most cases. The LSE framework offers a promising strategy for effective large-domain calibration and regionalization of process-based models.