Uncertainty Quantification in Machine Learning and Nonlinear Least Squares Regression Models
  • Ni Zhan,
  • John Kitchin
Ni Zhan
Carnegie Mellon University

Corresponding Author: nzhan@andrew.cmu.edu

John Kitchin
Carnegie Mellon University

Abstract

Machine learning (ML) models are valuable research tools for making accurate predictions. However, ML models often extrapolate unreliably outside their training data. We propose an uncertainty quantification method for ML models, and more generally for other nonlinear models, whose parameters are trained by least squares regression. The uncertainty measure is based on the multiparameter delta method from statistics, which gives the standard error of the prediction. It requires the gradient of the model prediction and the Hessian of the loss function, both taken with respect to the model parameters; both can be readily obtained from most ML software frameworks by automatic differentiation. We show that the uncertainty measure is larger in regions of input space that are not covered by the training data. The method can therefore be used to identify extrapolation and to aid in selecting training data or assessing model reliability.
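
The delta-method idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy model y = a·exp(b·x), the synthetic data, the loss convention L = ½·Σ(residual²), and the helper names `model`, `jacobian`, `fit`, and `prediction_se`. Under that loss convention the parameter covariance is approximately σ²·H⁻¹, so the squared standard error of a prediction is about σ²·g(x)ᵀH⁻¹g(x), where g is the gradient of the prediction with respect to the parameters. The sketch swaps in analytic gradients and the Gauss-Newton approximation H ≈ JᵀJ in place of automatic differentiation, so that it depends only on NumPy.

```python
import numpy as np

def model(theta, x):
    """Hypothetical toy model (not from the paper): y = a * exp(b * x)."""
    a, b = theta
    return a * np.exp(b * x)

def jacobian(theta, x):
    """Gradient of the prediction w.r.t. (a, b); one row per input point."""
    a, b = theta
    e = np.exp(b * x)
    return np.stack([e, a * x * e], axis=-1)

def fit(x, y, n_iter=20):
    """Least squares fit: log-linear initial guess, then Gauss-Newton steps."""
    b0, log_a0 = np.polyfit(x, np.log(y), 1)  # requires y > 0
    theta = np.array([np.exp(log_a0), b0])
    for _ in range(n_iter):
        r = y - model(theta, x)
        J = jacobian(theta, x)
        step, *_ = np.linalg.lstsq(J, r, rcond=None)
        theta = theta + step
    return theta

def prediction_se(theta, x_train, y_train, x_new):
    """Delta-method standard error of the prediction at each x_new.

    Assumes loss = 0.5 * sum(residual**2), so Cov(theta) ~ sigma^2 * H^-1,
    with H approximated by J^T J (Gauss-Newton), not an exact Hessian.
    """
    n, p = len(x_train), len(theta)
    r = y_train - model(theta, x_train)
    sigma2 = r @ r / (n - p)              # residual variance estimate
    J = jacobian(theta, x_train)
    cov = sigma2 * np.linalg.inv(J.T @ J)
    G = jacobian(theta, x_new)            # prediction gradients at new inputs
    return np.sqrt(np.einsum("ij,jk,ik->i", G, cov, G))

# Synthetic training data on [0, 2] (illustrative values only).
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 2.0, 20)
y_train = model((1.0, 0.5), x_train) + rng.normal(0.0, 0.05, x_train.size)

theta_hat = fit(x_train, y_train)
se_in = prediction_se(theta_hat, x_train, y_train, np.array([1.0]))[0]
se_out = prediction_se(theta_hat, x_train, y_train, np.array([6.0]))[0]
print("fitted parameters:", theta_hat)
print("standard error inside training range (x=1):", se_in)
print("standard error when extrapolating (x=6):", se_out)
```

In this toy setting the standard error at x = 6, well outside the training interval, comes out larger than at x = 1, which mirrors the behavior the abstract describes. In an ML framework, the gradient and the Hessian would typically come from automatic differentiation rather than from hand-written derivatives.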

22 Jul 2021: Submitted to AIChE Journal
26 Jul 2021: Submission Checks Completed
26 Jul 2021: Assigned to Editor
05 Aug 2021: Reviewer(s) Assigned
09 Sep 2021: Editorial Decision: Revise Minor
28 Sep 2021: 1st Revision Received
02 Oct 2021: Submission Checks Completed
02 Oct 2021: Assigned to Editor
05 Oct 2021: Reviewer(s) Assigned
24 Oct 2021: Editorial Decision: Accept
Jun 2022: Published in AIChE Journal, volume 68, issue 6. DOI: 10.1002/aic.17516