Tao Zhang

and 7 more

Parameterizations in Earth System Models (ESMs) are subject to biases and uncertainties arising from subjective empirical assumptions and incomplete understanding of the underlying physical processes. Recently, the growing representational capability of machine learning (ML) in solving complex problems has sparked strong interest in climate science applications. Specifically, ML-based parameterizations have been developed to represent convection, radiation, and microphysics processes in ESMs by learning from observations or high-resolution simulations, with the potential to improve accuracy and reduce uncertainty. Previous work has produced ML surrogate models for these processes, but these surrogates must be coupled with the dynamical core of an ESM to assess their effectiveness and performance in a coupled system. In this study, we present a novel Fortran-Python interface designed to seamlessly integrate ML parameterizations into ESMs. The interface is highly versatile, supporting popular ML frameworks such as PyTorch, TensorFlow, and Scikit-learn. We demonstrate the interface’s modularity and reusability through two cases: an ML trigger function for convection parameterization and an ML wildfire model. We also conduct a comprehensive evaluation of the memory usage and computational overhead that result from integrating Python code into Fortran-based ESMs. By leveraging this flexible interface, ML parameterizations can be effectively developed, tested, and integrated into ESMs.

Jianda Chen

and 4 more

In recent years, machine learning (ML) models have been used to improve the physical parameterizations of general circulation models (GCMs). A significant challenge in integrating ML models into GCMs is the online instability that arises when the two are coupled for long-term simulation. In this study, we present a new strategy that demonstrates robust online stability when the entire physical parameterization package of a GCM is replaced by a deep ML algorithm. The method trains the ML model with a multistep scheme that uses experience replay: the training inputs include both the memory of physical tendencies from the training dataset and the ML algorithm’s own output at the previous time step. The physics memory improves the accuracy of the ML model, while the experience replay constrains the amplification of cumulative errors in the online coupling. The method is used to train the whole physical parameterization package for the Community Atmosphere Model version 5 (CAM5) with data from its Multi-scale Modeling Framework (MMF) high-resolution simulations. Three 6-year online simulations of CAM5 with the ML physics package, run at operational spatial resolution with real-world geography, are presented. The simulated spatial distributions of precipitation, surface temperature, and zonally averaged atmospheric fields are overall more accurate than those of the standard CAM5 and the benchmark model, even without additional physical constraints or tuning. This work is the first to demonstrate a solution to the online instability problem in climate modeling with ML physics by using experience replay.
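The idea of training on a mix of physics memory and the model's own previous output can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration and is not taken from the study: the toy "physics" is a scalar decay law, the "ML model" is a three-weight linear predictor rather than a deep network, and the names `make_trajectory`, `predict`, `mse`, `train`, and `replay_prob` are hypothetical.

```python
import random


def make_trajectory(n_steps=100, dt=0.05, k=0.5, x0=1.0):
    """Toy 'physics': tendency p = -k * x, state advanced with forward Euler."""
    xs, ps = [x0], [-k * x0]
    for _ in range(n_steps):
        x_next = xs[-1] + dt * ps[-1]
        xs.append(x_next)
        ps.append(-k * x_next)
    return xs, ps


def predict(w, x, p_prev):
    """Linear stand-in for the ML physics: current state plus tendency memory."""
    return w[0] * x + w[1] * p_prev + w[2]


def mse(w, xs, ps):
    """Mean squared tendency error when the memory input comes from the dataset."""
    errs = [(predict(w, xs[t], ps[t - 1]) - ps[t]) ** 2 for t in range(1, len(xs))]
    return sum(errs) / len(errs)


def train(xs, ps, epochs=50, lr=0.1, replay_prob=0.5, seed=0):
    """SGD in which the memory input is sometimes the model's own last output."""
    rng = random.Random(seed)
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        own_prev = ps[0]  # start each pass from the dataset tendency
        for t in range(1, len(xs)):
            # Experience replay (assumed form): with probability replay_prob,
            # feed back the model's previous prediction instead of the
            # tendency stored in the training dataset.
            p_mem = own_prev if rng.random() < replay_prob else ps[t - 1]
            pred = predict(w, xs[t], p_mem)
            err = pred - ps[t]
            w[0] -= lr * err * xs[t]
            w[1] -= lr * err * p_mem
            w[2] -= lr * err
            own_prev = pred  # remembered for the next time step
    return w


xs, ps = make_trajectory()
weights = train(xs, ps)
print("MSE before training:", mse([0.0, 0.0, 0.0], xs, ps))
print("MSE after training: ", mse(weights, xs, ps))
```

Setting `replay_prob=0.0` reduces this to ordinary training on dataset memory only, which makes it easy to compare the two regimes on the same toy problem.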

Tao Zhang

and 8 more