AUTHOREA
Colin James Lee

Public Documents 1
Clustering analysis of very large measurement and model datasets on high performance...
Colin James Lee
Paul A. Makar

and 2 more

April 03, 2025
Hierarchical agglomerative clustering is a useful analysis technique that offers a level of stability, interpretability and flexibility not available in similar techniques such as K-means, density-based clustering or positive matrix factorization. Previous studies applying hierarchical clustering to atmospheric model output have been limited to small domains (roughly 100 × 100 grid cells) by the computational expense and memory requirements of the algorithm. Here we present hierarchical clustering analysis of two atmospheric datasets that are much larger than was previously possible. In the first case study, we cluster an entire year’s worth of hourly model-simulated concentration and deposition data. The model domain covers the Canadian provinces of Alberta and Saskatchewan and comprises 538 × 540 grid cells, giving 290,520 hourly concentration time series. The resulting maps identify regions of the modelling domain within which forecast time series are similar according to the chosen metric, demonstrating the methodology’s ability to define “airsheds” objectively and quantitatively across a larger domain than has been possible before. The identified airsheds differ depending on the species.
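
To make the approach concrete, the following is a minimal sketch of hierarchical agglomerative clustering applied to gridded time series, run on a tiny synthetic field. It is not the authors' implementation: the use of SciPy, the correlation distance, average linkage, and the helper names `cluster_grid_timeseries` and `condensed_distance_bytes` are all assumptions made for illustration. The second helper shows why memory becomes the limiting factor, since a full pairwise distance matrix for 290,520 series holds roughly 4.2 × 10^10 entries.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def cluster_grid_timeseries(data, n_clusters, metric="correlation", method="average"):
    """Cluster the time series of a (ny, nx, nt) field; return an (ny, nx) label map.

    The metric and linkage method are illustrative assumptions, not the
    choices made in the abstract above.
    """
    ny, nx, nt = data.shape
    series = data.reshape(ny * nx, nt)        # one row per grid cell
    dists = pdist(series, metric=metric)      # condensed pairwise distances
    tree = linkage(dists, method=method)      # agglomerative merge tree
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return labels.reshape(ny, nx)


def condensed_distance_bytes(n_series, itemsize=8):
    """Bytes needed to store all n(n-1)/2 pairwise distances as float64."""
    return n_series * (n_series - 1) // 2 * itemsize


# Synthetic 6 x 8 domain: the left half follows a sine signal, the right half a cosine.
rng = np.random.default_rng(0)
ny, nx, nt = 6, 8, 200
t = np.linspace(0.0, 8.0 * np.pi, nt)
field = np.empty((ny, nx, nt))
field[:, :4, :] = np.sin(t) + 0.1 * rng.standard_normal((ny, 4, nt))
field[:, 4:, :] = np.cos(t) + 0.1 * rng.standard_normal((ny, 4, nt))

label_map = cluster_grid_timeseries(field, n_clusters=2)
print(label_map)

# Storage for the pairwise distances of a 538 x 540 domain, in gigabytes.
print(condensed_distance_bytes(538 * 540) / 1e9)
```

In this toy case the label map separates the two halves of the domain, which is the small-scale analogue of an airshed map. A brute-force distance matrix of this kind would not fit in ordinary workstation memory at the full domain size, which is consistent with the limitation on earlier studies noted above.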
