
A Comparison of Regression Methods for Inferring Near-Surface NO2 with Satellite Data
  • Eliot J Kim,
  • Tracey Holloway,
  • Ajinkya Kokandakar,
  • Monica Harkey,
  • Stephanie Elkins,
  • Daniel Goldberg,
  • Colleen Heck
Eliot J Kim
University of Wisconsin-Madison

Corresponding Author: ejkim23@wisc.edu

Tracey Holloway
University of Wisconsin-Madison
Ajinkya Kokandakar
University of Wisconsin-Madison
Monica Harkey
University of Wisconsin-Madison
Stephanie Elkins
Massachusetts Institute of Technology
Daniel Goldberg
George Washington University
Colleen Heck
University of Wisconsin-Madison

Abstract

Nitrogen dioxide (NO2) is emitted during high-temperature combustion from anthropogenic and natural sources. Human exposure to high NO2 concentrations causes cardiovascular and respiratory illnesses. The EPA operates ground monitors across the U.S. that take hourly measurements of NO2 concentrations, providing precise measurements for assessing human pollution exposure but with sparse spatial coverage. Satellite-based instruments capture NO2 amounts through the atmospheric column with global coverage at regular spatial resolution, but do not directly measure surface NO2. This study compares regression methods that use satellite NO2 data from the TROPOspheric Monitoring Instrument (TROPOMI) to estimate annual surface NO2 concentrations in varying geographic and land use settings across the continental U.S. We then apply the best-performing regression models to estimate surface NO2 at 0.01° by 0.01° resolution, and we term this estimate quasi-NO2 (qNO2). qNO2 agrees best with measurements at suburban sites (cross-validation (CV) R² = 0.72) and away from major roads (CV R² = 0.75). Among U.S. regions, qNO2 agrees best with measurements in the Midwest (CV R² = 0.89) and agrees least in the Southwest (CV R² = 0.65). To account for the non-Gaussian distribution of TROPOMI NO2, we apply data transforms, with the Anscombe transform yielding the highest agreement across the continental U.S. (CV R² = 0.78). The interpretability, minimal computational cost, and health relevance of qNO2 facilitate use of satellite data in a wide range of air quality applications.
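
For readers unfamiliar with the two technical ingredients named in the abstract, the following is a minimal sketch of an Anscombe transform combined with a k-fold cross-validated R². It is not the authors' code. It assumes that the transform is applied to the satellite column predictor, that the regression is a single-variable least-squares line, and that the data are synthetic values in arbitrary units; the helper names (`anscombe`, `fit_line`, `cv_r2`) are hypothetical.

```python
import math
import random


def anscombe(x):
    """Anscombe variance-stabilizing transform: 2 * sqrt(x + 3/8)."""
    return 2.0 * math.sqrt(x + 3.0 / 8.0)


def inverse_anscombe(y):
    """Algebraic inverse of the Anscombe transform."""
    return (y / 2.0) ** 2 - 3.0 / 8.0


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def cv_r2(xs, ys, k=5, seed=0):
    """K-fold cross-validated R^2 from pooled out-of-fold predictions."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    preds = [0.0] * len(xs)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            preds[i] = a + b * xs[i]
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Synthetic, skewed "column" values and a noisy "surface" response
# (illustrative only; not TROPOMI or EPA monitor data).
rng = random.Random(42)
column = [rng.lognormvariate(1.0, 0.6) for _ in range(200)]
surface = [3.0 * anscombe(c) + rng.gauss(0.0, 1.0) for c in column]

r2_anscombe = cv_r2([anscombe(c) for c in column], surface)
print(f"CV R^2 with Anscombe-transformed predictor: {r2_anscombe:.2f}")
```

The Anscombe transform was designed for Poisson-like counts, so applying it to column densities in practice would depend on the units and scaling chosen, which this sketch does not address.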