Yalan Song

and 18 more

The National Water Model (NWM) is a key tool for flood forecasting, planning, and water management. Key challenges facing the NWM include calibration and parameter regionalization when confronted with big data. We present two novel versions of high-resolution (~37 km2) differentiable models (a type of physics-informed machine learning): one with implicit, unit-hydrograph-style routing and another with explicit Muskingum-Cunge routing through the river network. The former predicts streamflow at basin outlets, whereas the latter presents a discretized product that seamlessly covers rivers in the conterminous United States (CONUS). Both versions use neural networks to provide multiscale parameterization and process-based equations to provide a structural backbone; the two components were trained together (“end-to-end”) on 2,807 basins across CONUS and evaluated on 4,997 basins. Both versions show great potential to elevate future NWMs for extensively calibrated as well as ungauged sites: the median daily Nash-Sutcliffe efficiency (NSE) over all 4,997 basins improved to around 0.68 from 0.49 for NWM 3.0. Because they resolve heterogeneity, both greatly improved simulations in the western CONUS and in the Prairie Pothole Region, a long-standing modeling challenge. The Muskingum-Cunge version further improved performance for basins >10,000 km2. Overall, our results show how neural-network-based parameterization can improve NWM performance for operational flood prediction while maintaining interpretability and multivariate outputs. We provide a CONUS-scale hydrologic dataset for further evaluation and use. The modeling system supports the Basic Model Interface (BMI), which allows seamless integration with the next-generation NWM.
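The end-to-end pairing of a neural network (providing parameters) with a process-based backbone can be illustrated with a minimal sketch. Everything below is a hypothetical stand-in for the far richer NWM-version models: a tiny MLP maps basin attributes to a single storage-recession parameter, which a linear-reservoir step then consumes.

```python
import numpy as np

rng = np.random.default_rng(0)

def nn_parameterize(attrs, W1, b1, W2, b2):
    """Tiny MLP: basin attributes -> a storage-recession parameter k in (0, 1)."""
    h = np.tanh(attrs @ W1 + b1)
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))  # sigmoid keeps k in (0, 1)

def linear_reservoir(precip, k, s0=0.0):
    """Process-based backbone: S_t = (1 - k) * S_{t-1} + P_t, discharge Q_t = k * S_t."""
    s, q = s0, []
    for p in precip:
        s = s + p - k * s
        q.append(k * s)
    return np.array(q)

attrs = rng.random(5)                       # e.g., area, slope, soil depth, ... (made up)
W1, b1 = rng.normal(size=(5, 8)), np.zeros(8)
W2, b2 = rng.normal(size=8), 0.0
k = nn_parameterize(attrs, W1, b1, W2, b2)  # differentiable end-to-end in a real framework
q = linear_reservoir(rng.random(100), k)
```

In the actual models, both pieces live in one automatic-differentiation graph, so the loss on observed streamflow backpropagates through the physics into the network weights.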

Jiangtao Liu

and 4 more

Landslide risk is traditionally predicted by process-based models with detailed assessments or by point-scale, attribute-based machine learning (ML) models that take first- or second-order features, e.g., slope, as inputs. One could hypothesize that terrain patterns contain useful information that could be extracted, via computer-vision ML models, to elevate prediction performance beyond that achievable with low-order features. We put this hypothesis to the test in the state of Oregon, where a large landslide dataset is available. Image-processing convolutional neural networks (CNN2D) using 2D terrain data obtained either higher precision or higher recall than attribute-based random forest (RF1D) models, but could not improve both simultaneously. While CNN2D can be set up to identify more real events, it then introduces more false positives, highlighting the challenge of generalizing landslide-prone terrain patterns and the potential omission of critical factors. However, ensembling CNN2D and RF1D produced better overall precision and recall, and this cross-model-type ensemble outperformed other ensembling strategies, leveraging the information content of fine-scale topography while suppressing its noise. These models further showed robust results in cross-regional validation. Our perturbation tests showed that 10 m resolution (the finest available) produced the best model among the range of resolutions tested. Rainfall, land cover, soil moisture, and elevation were the most important predictors. Based on these results, we generated landslide susceptibility maps, providing insights into the spatial patterns of landslide risk.
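The cross-model-type ensembling idea can be sketched as simple probability averaging followed by thresholding. The probabilities, threshold, and labels below are illustrative inventions, not the paper's configuration:

```python
import numpy as np

def precision_recall(y_true, y_pred):
    """Precision = TP/(TP+FP); recall = TP/(TP+FN), for binary 0/1 labels."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    return tp / (tp + fp), tp / (tp + fn)

def ensemble(p_cnn, p_rf, threshold=0.5):
    """Cross-model-type ensemble: average the two models' probabilities, then threshold."""
    return ((p_cnn + p_rf) / 2 >= threshold).astype(int)

y_true = np.array([1, 1, 0, 0])           # made-up landslide labels
p_cnn = np.array([0.9, 0.4, 0.2, 0.6])    # made-up CNN2D probabilities
p_rf = np.array([0.8, 0.7, 0.1, 0.3])     # made-up RF1D probabilities
y_ens = ensemble(p_cnn, p_rf)
```

Averaging lets each model veto the other's low-confidence false positives, which is one plausible reading of why the cross-model ensemble improved both metrics at once.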

Tadd Bindas

and 7 more

Recently, rainfall-runoff simulations in small headwater basins have been improved by methodological advances such as deep neural networks (NNs) and hybrid physics-NN models, particularly a genre called differentiable modeling that intermingles NNs with physics to learn relationships between variables. However, hydrologic routing, necessary for simulating floods in stem rivers downstream of large, heterogeneous basins, had not yet benefited from these advances, and it was unclear whether the routing process could be improved via coupled NNs. We present a novel differentiable routing model that mimics the classical Muskingum-Cunge routing model over a river network but embeds an NN to infer parameterizations for Manning’s roughness (n) and channel geometries from raw reach-scale attributes such as catchment area and sinuosity. The NN was trained solely on downstream hydrographs. Synthetic experiments show that while the channel geometry parameter was unidentifiable, n can be identified with moderate precision. With real-world data, the trained differentiable routing model produced more accurate long-term routing results for both the training gage and untrained inner gages of larger subbasins (>2,000 km2) than either a machine learning model assuming homogeneity or simply the sum of runoff from subbasins. The n parameterization trained on short periods gave high performance in other periods, despite significant errors in runoff inputs. The learned n pattern was consistent with literature expectations, demonstrating the framework’s potential for knowledge discovery, though the absolute values can vary with the training period. The trained n parameterization can be coupled with traditional models to improve national-scale flood simulations.
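The classical Muskingum step at the core of Muskingum-Cunge routing can be sketched for a single reach. The differentiable model applies this over a whole network, with K and X derived from the NN-inferred n and channel geometry; the K, X, and dt values below are illustrative only:

```python
import numpy as np

def muskingum_route(inflow, K, X, dt):
    """Route an inflow hydrograph through one reach.

    K: storage time constant (same units as dt); X: weighting factor in [0, 0.5].
    The three coefficients sum to 1, so a steady inflow passes through unchanged.
    """
    d = K - K * X + 0.5 * dt
    c0 = (-K * X + 0.5 * dt) / d
    c1 = (K * X + 0.5 * dt) / d
    c2 = (K - K * X - 0.5 * dt) / d
    out = [inflow[0]]                 # assume an initial steady state
    for t in range(1, len(inflow)):
        out.append(c0 * inflow[t] + c1 * inflow[t - 1] + c2 * out[-1])
    return np.array(out)

steady = muskingum_route(np.full(10, 5.0), K=2.0, X=0.2, dt=1.0)
pulse = muskingum_route(np.array([0.0, 10.0, 0.0, 0.0, 0.0, 0.0]), K=2.0, X=0.2, dt=1.0)
```

Because every operation above is differentiable, gradients from a loss on the downstream hydrograph can flow back through the recurrence into whatever network produced K and X.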

Doaa Aboelyazeed

and 5 more

Net photosynthesis (AN) is a major component of the global carbon cycle, with significant feedback to decadal-scale climate change. Although plant acclimation to environmental changes can modify AN, traditional vegetation models in Earth System Models (ESMs) often rely on plant functional type (PFT)-specific parameter calibrations or simplified acclimation assumptions, both of which lack generalizability across time, space, and PFTs. In this study, we propose a differentiable photosynthesis model to learn the environmental dependencies of Vc,max25, as this genre of hybrid physics-informed machine learning can seamlessly train neural networks and process-based equations together. Compared to PFT-specific parameterization of Vc,max25, learning the environmental dependencies of key photosynthetic parameters improves model spatiotemporal generalizability. Applying environmental acclimation to Vc,max25 led to substantial variation in global mean AN, calling for attention to acclimation in ESMs. The model effectively captured multivariate observations (Vc,max25, stomatal conductance gs, and AN) simultaneously and, in fact, the multivariate constraints further improved model generalization across space and PFTs. It also learned sensible acclimation relationships of Vc,max25 to different environmental conditions. The model explained more than 54%, 57%, and 62% of the variance of AN, gs, and Vc,max25, respectively, presenting a first global-scale spatial test benchmark for AN and gs. These results highlight the potential of differentiable modeling to enhance process-based modules in ESMs and to effectively leverage information from large, multivariate datasets.
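For context on why Vc,max25 is the key learnable parameter: photosynthesis models typically carry the 25 °C reference value and scale it to leaf temperature with an Arrhenius response. The sketch below uses a generic literature form with a typical activation energy (Ha = 65,330 J mol⁻¹); it is not necessarily this study's exact implementation.

```python
import numpy as np

R = 8.314        # universal gas constant, J mol^-1 K^-1
T_REF = 298.15   # 25 degC reference temperature, K

def vcmax_at_temperature(vcmax25, t_leaf_c, ha=65330.0):
    """Arrhenius scaling of Vc,max from its 25 degC reference to leaf temperature.

    vcmax25: reference value (umol m^-2 s^-1); t_leaf_c: leaf temperature in degC;
    ha: activation energy (J mol^-1), a typical literature value here.
    """
    tk = t_leaf_c + 273.15
    return vcmax25 * np.exp(ha * (tk - T_REF) / (T_REF * R * tk))
```

In the differentiable model, a neural network predicts Vc,max25 from environmental conditions (the acclimation), and scalings like this one carry it into the process-based photosynthesis equations.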

savinay nagendra

and 9 more

In this article, we consider the scenario where remotely sensed images are collected sequentially in temporal batches, where each batch focuses on images from a particular ecoregion, but different batches can focus on different ecoregions with distinct landscape characteristics. For such a scenario, we study the following questions: (1) How well do deep learning (DL) models trained in homogeneous regions perform when they are transferred to different ecoregions? (2) Does increasing the spatial coverage of the data improve model performance in a given ecoregion, even when the extra data do not come from that ecoregion? (3) Can a landslide pixel labelling model be incrementally updated with new data, but without access to the old data and without losing performance on the old data (so that researchers can share models obtained from proprietary datasets)? We address these questions with a framework called Task-Specific Model Updates (TSMU). The goal of this framework is to continually update a (landslide) semantic segmentation model with data from new ecoregions without having to revisit data from old ecoregions and without losing performance on them. We conduct extensive experiments on four ecoregions in the United States to address the above questions and establish that data from other ecoregions can help improve a model’s performance on the original ecoregion. In other words, if one has an ecoregion of interest, one could still collect data both inside and outside that region to improve model performance on it. Furthermore, if one has many ecoregions of interest, data from all of them are needed.

Yuan Yang

and 9 more

Accurate global river discharge estimation is crucial for advancing our scientific understanding of the global water cycle and supporting various downstream applications. In recent years, data-driven machine learning models, particularly the long short-term memory (LSTM) model, have shown significant promise in estimating discharge. Despite this, the applicability of LSTM models for global river discharge estimation remains largely unexplored. In this study, we diverge from conventional basin-lumped LSTM modeling in limited basins. For the first time, we apply an LSTM on a global 0.25° grid, coupling it with a river routing model to estimate river discharge for every river reach worldwide. We rigorously evaluate performance at 5,332 gauges globally for the period 2000-2020, held out from the training basins and period. The grid-scale LSTM model effectively captures rainfall-runoff behavior, reproducing global river discharge with high accuracy and achieving a median Kling-Gupta efficiency (KGE) of 0.563. It outperforms an extensively bias-corrected and calibrated benchmark simulation based on the Variable Infiltration Capacity (VIC) model, which achieved a median KGE of 0.466. Using the global grid-scale LSTM model, we develop an improved global reach-level daily discharge dataset spanning 1980 to 2020, named GRADES-hydroDL. This dataset is anticipated to be useful for a myriad of applications, including providing prior information for the Surface Water and Ocean Topography (SWOT) satellite mission. The dataset is openly available via Globus.
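For reference, the Kling-Gupta efficiency used above decomposes skill into correlation, variability, and bias components. A minimal implementation of the 2009 formulation:

```python
import numpy as np

def kge(obs, sim):
    """Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2).

    r: linear correlation; alpha: ratio of standard deviations (variability);
    beta: ratio of means (bias). KGE = 1 is a perfect match.
    """
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(obs, sim)[0, 1]
    alpha = sim.std() / obs.std()
    beta = sim.mean() / obs.mean()
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # made-up discharge series
```

A median KGE of 0.563 thus reflects the combined correlation, variability, and bias error across the 5,332 evaluation gauges.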

Wen-Ping Tsai

and 4 more

Some machine learning (ML) methods, such as classification trees, are useful tools for generating hypotheses about how hydrologic systems function. However, data limitations dictate that ML alone often cannot differentiate between causal and associative relationships. For example, previous ML analysis suggested that soil thickness is the key physiographic factor determining storage-streamflow correlations in the eastern US. This conclusion is not robust, especially if the data are perturbed, and there were alternative, competing explanations, including soil texture and terrain slope. However, typical causal analysis based on process-based models (PBMs) is inefficient and susceptible to human bias. Here we demonstrate a more efficient and objective analysis procedure in which ML is first applied to generate data-consistent hypotheses, and a PBM is then invoked to verify them. We employed a surface-subsurface processes model and conducted perturbation experiments to implement the competing hypotheses and assess the impacts of the changes. The experimental results strongly support the soil thickness hypothesis over the terrain slope and soil texture ones, which are co-varying, coincidental factors. Thicker soil permits larger saturation excess and a longer system memory that carries wet-season water storage forward to influence dry-season baseflows. We further suggest this analysis could be formalized into a novel, data-centric Bayesian framework. This study demonstrates that PBMs present indispensable value for problems that ML cannot solve alone, and it is meant to encourage more synergies between ML and PBMs in the future.

Tadd Bindas

and 7 more

Recently, runoff simulations in small headwater basins have been improved by methodological advances such as deep learning (DL). Hydrologic routing modules are typically needed to simulate flows in stem rivers downstream of large, heterogeneous basins, but suitable parameterizations for them have previously been difficult to obtain, and it is unclear whether downstream daily discharge contains enough information to constrain spatially distributed parameterization. Building on recent advances in differentiable modeling, here we propose a differentiable, learnable, physics-based routing model. It mimics the classical Muskingum-Cunge routing model but embeds a neural network (NN) to provide parameterizations for Manning’s roughness coefficient (n) and channel geometries. The embedded NN, which uses (imperfect) DL-simulated runoffs as forcing data and reach-scale attributes as inputs, was trained solely on downstream hydrographs. Our synthetic experiments show that while channel geometries cannot be identified, we can learn a parameterization scheme for n that captures the overall spatial pattern. Training on short real-world records showed that we could obtain highly accurate routing results for both the training gage and inner, untrained gages. For larger basins, our results are better than those of a DL model assuming homogeneity or the sum of runoff from subbasins. The parameterization learned from a short training period gave high performance in other periods, despite significant bias in the runoff. This is the first time an interpretable, physics-based model has been learned on the river network to infer spatially distributed parameters. The trained n parameterization can be coupled to traditional runoff models and ported to traditional programming environments.

Kai Ma

and 7 more

There is a drastic geographic imbalance in available global streamflow gauge and catchment property data, with additional large variations in data characteristics, so that models calibrated in one region normally cannot be migrated to another. Currently, non-transferable machine learning models are habitually trained over small local datasets in these regions. Here we show that transfer learning (TL), in the sense of weight initialization and weight freezing, allows long short-term memory (LSTM) streamflow models trained over the conterminous United States (CONUS, the source dataset) to be transferred to catchments on other continents (the target regions) without the need for extensive catchment attributes. We demonstrate this possibility for regions where data are dense (664 basins in the UK), moderately dense (49 basins in central Chile), and scarce, with only globally available attributes (5 basins in China). In both China and Chile, the TL models significantly elevated model performance compared to locally trained models. The benefits of TL increased with the amount of available data in the source dataset, but even 50-100 basins from the CONUS dataset provided significant value for TL. The benefits of TL were greater than those of pre-training the LSTM using the outputs of an uncalibrated hydrologic model. These results suggest that hydrologic data around the world have commonalities that could be leveraged by deep learning, and that significant synergies can be achieved with a simple modification of the currently predominant workflows, greatly expanding the reach of existing big data. Finally, this work diversifies existing global streamflow benchmarks.
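The two TL mechanisms named above (weight initialization and weight freezing) can be sketched on a toy "body + head" model. The linear stand-in, parameter names, and gradients below are hypothetical, not the paper's LSTM architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def sgd_step(params, grads, lr, frozen=()):
    """One gradient step that updates every parameter except those listed in `frozen`."""
    return {k: (v if k in frozen else v - lr * grads[k]) for k, v in params.items()}

# 1) Weight initialization: start target-region training from the source-trained weights
#    instead of from scratch.
source = {"body": rng.normal(size=(4, 4)), "head": rng.normal(size=4)}
target = {k: v.copy() for k, v in source.items()}

# 2) Weight freezing: fine-tune only the output head on the (small) target dataset,
#    keeping the source-learned body fixed.
grads = {"body": np.ones((4, 4)), "head": np.ones(4)}  # placeholder gradients
target = sgd_step(target, grads, lr=0.1, frozen=("body",))
```

Freezing the body preserves the representations learned from the data-rich CONUS source, which is why TL helps most where target-region data are scarce.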

Kuai Fang

and 3 more

Recently, recurrent deep networks have shown promise for harnessing newly available satellite-sensed data for long-term soil moisture projections. However, to be useful in forecasting, deep networks must also provide uncertainty estimates. Here we evaluated Monte Carlo dropout with an input-dependent data noise term (MCD+N), an efficient uncertainty estimation framework originally developed in computer vision, for hydrologic time series predictions. MCD+N simultaneously estimates a heteroscedastic, input-dependent data noise term (a trained error model attributable to observational noise) and a network weight uncertainty term (attributable to insufficiently constrained model parameters). Although MCD+N has appealing features, many heuristic approximations were employed during its derivation, and rigorous evaluations and evidence of its asserted capability to detect dissimilarity were lacking. To address this, we provided an in-depth evaluation of the scheme’s potential and limitations. We showed that for reproducing soil moisture dynamics recorded by the Soil Moisture Active Passive (SMAP) mission, MCD+N indeed gave a good estimate of predictive error, provided that we tuned a hyperparameter and used a representative training dataset. The input-dependent term responded strongly to observational noise, while the model term clearly acted as a detector for physiographic dissimilarity from the training data, behaving as intended. However, when the training and test data were characteristically different, the input-dependent term could be misled, undermining its reliability. Additionally, due to the data-driven nature of the model, the two uncertainty terms are correlated. This approach has promise, but care is needed to interpret the results.
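The MCD+N decomposition can be sketched as follows: dropout stays active at test time, the spread across stochastic forward passes gives the model (weight) uncertainty, and a second output head gives the input-dependent data noise. The linear "network" and both heads below are stand-ins, not the paper's recurrent architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_dropout_predict(x, W, n_passes=50, p_drop=0.1):
    """Return mean prediction, model (weight) variance, and data (noise) variance."""
    preds, noises = [], []
    for _ in range(n_passes):
        mask = rng.random(W.shape) > p_drop       # dropout remains ON at test time
        Wd = W * mask / (1.0 - p_drop)            # inverted-dropout rescaling
        preds.append(x @ Wd[:, 0])                # prediction head
        noises.append(np.exp(x @ Wd[:, 1]))       # input-dependent noise head (>0)
    preds = np.array(preds)
    # model term: spread across stochastic passes; data term: mean learned noise
    return preds.mean(0), preds.var(0), np.mean(noises, 0)

x = rng.random((20, 3))                           # made-up inputs
W = rng.normal(size=(3, 2))                       # made-up weights (2 output heads)
mu, var_model, var_data = mc_dropout_predict(x, W)
```

The total predictive variance is the sum of the two terms, which is what the abstract evaluates against SMAP prediction errors.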

Farshid Rahmani

and 5 more

Stream water temperature (T) is a variable of critical importance and decision-making relevance to aquatic ecosystems, energy production, and human interactions with river systems. Here, we propose a basin-centric stream water temperature model based on the long short-term memory (LSTM) network, trained over hundreds of basins across the conterminous United States, providing a first continental-scale benchmark on this problem. The model was fed atmospheric forcing data, static catchment attributes, and, optionally, observed or simulated discharge data. It achieved high performance, delivering low median root-mean-squared errors (RMSE) for groups with extensive, intermediate, and scarce temperature measurements. The median Nash-Sutcliffe model efficiency coefficients were above 0.97 for all groups and above 0.91 after air temperature was subtracted, showing that the model captures most of the temporal dynamics. Reservoirs have a substantial impact on the pattern of water temperature and negatively influence model performance: the median RMSE was 0.69 for sites without major dams and 0.99 for sites with major dams in groups with data availability greater than 90%. Additional experiments showed that observed or simulated streamflow data are useful as an input for basins without major dams but may increase prediction bias otherwise. Our results suggest that a strong mapping exists between basin-averaged forcing variables and attributes and water temperature, but local measurements can strongly improve the model. This work provides a first benchmark and significant insights for future efforts. Challenges remain for basins with large dams, which can be targeted in the future when more information on withdrawal timing and water ponding time becomes accessible.
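The Nash-Sutcliffe efficiency cited above has a compact definition worth stating: it compares model error against a baseline that always predicts the observed mean.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE(sim) / SSE(mean-of-obs baseline).

    NSE = 1 is a perfect fit; NSE = 0 is no better than predicting the
    observed mean; NSE < 0 is worse than that baseline.
    """
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

obs = np.array([1.0, 2.0, 3.0, 4.0])   # made-up temperature series
```

Because seasonal air temperature already explains much of stream temperature variance, the abstract also reports NSE after subtracting air temperature, a stricter test of the model's added skill.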

Farshid Rahmani

and 4 more

Stream water temperature is considered a “master variable” in environmental processes and human activities. Existing process-based models have difficulty defining true parameter values, and simplifications such as assuming constant parameters can compromise the accuracy of results. Machine learning models are a highly successful tool for simulating stream temperature, but it is challenging to learn about processes and dynamics from their success. Here we integrate process-based modeling (the SNTEMP model) and machine learning by building on a recently developed framework for parameter learning. Within this framework, we used a deep neural network to map raw information (such as catchment attributes and meteorological forcings) to parameters, then inspected the results and fed them into the SNTEMP equations, which we implemented on a deep learning platform. We trained the deep neural network across many basins in the conterminous United States to maximize the capture of physical relationships and avoid overfitting. The framework can provide dynamic parameters based on the response of basins to meteorological conditions, and its training goal is to minimize the differences between stream temperature observations and SNTEMP outputs on the new platform. Parameter learning allows us to learn model parameters at large scales, providing benefits in efficiency, performance, and generalizability; the global constraint it applies has also been shown to yield more physically sensible parameters. This model improves our understanding of how to parameterize the physical processes related to water temperature.