Machine learning in coupled wildfire-water supply risk assessment: Data science toolkit
  • Dennis Hallema,
  • Ge Sun,
  • Peter Caldwell,
  • François-Nicolas Robinne,
  • Kevin Bladon,
  • Rua Mordecai,
  • Sheila Saia,
  • Steven McNulty
Dennis Hallema
USDA Forest Service Southern Research Station

Corresponding Author: dwhallem@ncsu.edu

Ge Sun
USDA Forest Service Southern Research Station
Peter Caldwell
USDA Forest Service Southern Research Station
François-Nicolas Robinne
University of Alberta
Kevin Bladon
Oregon State University
Rua Mordecai
U.S. Fish & Wildlife Service
Sheila Saia
USDA Forest Service Southern Research Station
Steven McNulty
USDA Forest Service Southern Research Station

Abstract

The frontier of wildfire-related risk assessment is moving into data science territory, and with good reason. Computational statistics, built on a foundation of high-resolution remote sensing data, ground data, and theory, forms the basis of powerful risk assessment tools. The need for data-based risk assessment has increased in recent years, in view of longer wildfire seasons in the U.S., associated with more frequent droughts, more human ignitions, and accumulating fuel loads. We present an application of machine learning (ML), which makes it possible to analyze complex data without a priori definition of interactions. This is a major advantage because these interactions are not known beforehand. Specifically, we build a stochastic gradient boosting machine (GBM) toolkit to assess the change in river flow after wildfire in the contiguous United States (CONUS) over a 5-year period. The GBM accounts for nonlinear relationships and interactions between wildland fire characteristics, watershed geometry, climate variability, topography, and land cover. Building the GBM is a sequential process in which a loss function is minimized at each iteration, along a gradient defined by pseudo-residuals. This process allows the program to progressively learn more about how the variables in the large dataset interact to produce the response (i.e., river flow). Our results show that wildfires increase annual river flow in the CONUS when more than 20% of a gaged basin is burned. Data science tools like the GBM presented here are essential in generating practical knowledge on how wildfire impacts on ecohydrology can ultimately affect hydrological services, socio-hydrosystems, and water security in fire-affected regions.
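
The sequential fitting described in the abstract can be illustrated with a minimal sketch. It is not the authors' toolkit: it assumes a squared-error loss (for which the pseudo-residuals are simply the observed response minus the current prediction), one-split regression stumps as base learners, and a synthetic one-variable dataset. The function names (`fit_stump`, `fit_gbm`, `predict_gbm`) and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def fit_stump(xs, residuals):
    """Least-squares regression stump (a single split) fitted to the residuals."""
    cuts = sorted(set(xs))
    if len(cuts) < 2:  # no split possible: fall back to a constant prediction
        mean = sum(residuals) / len(residuals)
        return cuts[0], mean, mean
    best = None
    for t in cuts[:-1]:  # split rule: x <= t goes left, x > t goes right
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_gbm(xs, ys, n_trees=50, learning_rate=0.1, subsample=0.5, seed=0):
    """Stochastic gradient boosting for squared-error loss (illustrative only)."""
    rng = random.Random(seed)
    baseline = sum(ys) / len(ys)  # initial model: the mean response
    preds = [baseline] * len(ys)
    stumps = []
    n_sub = max(2, int(subsample * len(xs)))
    for _ in range(n_trees):
        # Pseudo-residuals: negative gradient of (y - F)^2 / 2 with respect to F.
        pseudo = [y - p for y, p in zip(ys, preds)]
        # "Stochastic": each stump sees a random subsample drawn without replacement.
        idx = rng.sample(range(len(xs)), n_sub)
        stump = fit_stump([xs[i] for i in idx], [pseudo[i] for i in idx])
        stumps.append(stump)
        # Take a small step along the fitted approximation of the gradient.
        preds = [p + learning_rate * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return baseline, learning_rate, stumps


def predict_gbm(model, x):
    baseline, learning_rate, stumps = model
    return baseline + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(predict, xs, ys):
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Synthetic toy data (an assumption, not the study's data): a noisy step response.
rng = random.Random(42)
xs = [rng.random() for _ in range(200)]
ys = [(1.0 if x > 0.5 else 0.0) + rng.gauss(0.0, 0.1) for x in xs]

model = fit_gbm(xs, ys)
baseline_mse = mse(lambda x: model[0], xs, ys)
boosted_mse = mse(lambda x: predict_gbm(model, x), xs, ys)
print(f"baseline MSE: {baseline_mse:.4f}, boosted MSE: {boosted_mse:.4f}")
```

Each pass fits a weak learner to what the current ensemble still gets wrong, which is how the model can pick up nonlinear structure without it being specified in advance. A practical analysis would more likely use multi-split trees over many predictors from an established library rather than hand-written stumps.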