Zeqing Huang and 6 more

Machine learning methods provide a promising approach for exploiting relationships between raw forecasts and observations for forecast calibration. This paper highlights the role of data transformation in rainfall forecast calibration with machine learning. We develop a novel architecture that accounts for the positive skewness and zero bound of rainfall by incorporating a normalizing transformation (log-sinh) into distributional regression neural networks (NNs). A unified loss function based on the negative log-likelihood is formulated for parameter optimization. To test the importance of data transformation, we conduct five calibration experiments: one that uses no transformation at all (the baseline) and four that use the log-sinh transformation in different ways. All experiments are based on 10-day rainfall forecasts from the European Centre for Medium-Range Weather Forecasts (ECMWF) from 2011 to 2022. Overall, the calibration methods effectively correct spatiotemporally varying biases in raw forecasts and improve reliability, yielding mean skill improvements of approximately 2% to 11% and, in the best case, reducing forecast biases to less than 2%. Without transformation, the baseline method suffers from forecast biases ranging from −30% to 50%, owing to its limited ability to characterize the uncertainty of rainfall forecasts. Of the four experiments that use the log-sinh transformation, the best performance is achieved by transforming raw forecasts for the input layer and using fixed transformation parameters to generate calibrated forecasts in the output layer. We show that this method marginally outperforms an existing, advanced Bayesian Ensemble Model Output Statistics method in reducing forecast biases.
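To make the role of the transformation concrete, the following is a minimal sketch of the log-sinh transformation and of a negative log-likelihood evaluated in transformed space. It is not the paper's implementation. The abstract gives no formulas, so the sketch assumes the commonly used form z = (1/b) ln(sinh(a + b y)), a Gaussian predictive distribution for z with mean `mu` and standard deviation `sigma` (standing in for the NN outputs), and left-censoring at zero rainfall. All function names and the parameter values a = 0.1 and b = 0.05 are hypothetical.

```python
import math

def log_sinh(y, a, b):
    """Assumed log-sinh form: z = (1/b) * ln(sinh(a + b*y)), with a > 0, b > 0, y >= 0."""
    return math.log(math.sinh(a + b * y)) / b

def inv_log_sinh(z, a, b):
    """Inverse transform: y = (asinh(exp(b*z)) - a) / b."""
    return (math.asinh(math.exp(b * z)) - a) / b

def log_jacobian(y, a, b):
    """ln(dz/dy) = ln(coth(a + b*y)) = -ln(tanh(a + b*y))."""
    return -math.log(math.tanh(a + b * y))

def normal_logpdf(x):
    """Log density of the standard normal distribution."""
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)

def normal_cdf(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def nll(y_obs, mu, sigma, a, b):
    """Mean negative log-likelihood of observed rainfall y_obs.

    Assumption (not stated in the abstract): z is Gaussian with per-sample
    mean mu and standard deviation sigma, and zero rainfall is treated as
    left-censored at the transformed zero threshold.
    """
    z0 = log_sinh(0.0, a, b)  # transformed value of zero rainfall
    total = 0.0
    for y, m, s in zip(y_obs, mu, sigma):
        if y > 0.0:
            # Wet case: Gaussian density of z plus the Jacobian of the transform
            z = log_sinh(y, a, b)
            total -= normal_logpdf((z - m) / s) - math.log(s) + log_jacobian(y, a, b)
        else:
            # Dry case: probability mass at or below the censoring threshold
            total -= math.log(max(normal_cdf((z0 - m) / s), 1e-300))
    return total / len(y_obs)

# Hypothetical usage: two observations (one dry, one wet) with placeholder
# mu and sigma values in place of real network outputs.
a, b = 0.1, 0.05
loss = nll([0.0, 12.0], [-40.0, 10.0], [5.0, 5.0], a, b)
```

In this sketch the Jacobian term is what ties the likelihood back to rainfall in its original units, and the censored term is one way to respect the zero bound. Whether `a` and `b` are held fixed or learned alongside the network weights is the kind of design choice the five experiments compare.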