3. Data description

3.1 Data overview

The CHOSEN database contains up to 13 hydrometeorological variables: streamflow, precipitation, air temperature, solar radiation, relative humidity, wind direction, wind speed, SWE, snow depth, vapor pressure, soil moisture, soil temperature, and isotope values. Availability varies from site to site (Figure 4). The HJ Andrews and Bonanza LTERs have measurements of all 13 variables, and most of the other watersheds have data for around ten variables. Discharge record lengths range from three years at Calhoun to 78 years at the San Diego River (California Current Ecosystem LTER), with a median of 19 years.
Figure 4. Span of time series availability and duration across watersheds
Among the 13 hydrometeorological variables included in the dataset, discharge, precipitation, snow depth, soil moisture, and isotope data are particularly important for hydrologic process studies. Discharge and precipitation time series are available in all CHOSEN watersheds (Figure 5), and seven catchments have soil moisture and snow measurements with records exceeding five years. Although publicly available isotope data are limited, we identified six watersheds with isotope time series longer than one year (Figure 4).
Figure 5. Distributions of record spans for different variables

3.1.1 Precipitation

In the CHOSEN dataset, 27 watersheds have more than five years of precipitation data, and 20 watersheds have more than ten years (Figure 5). Twenty-five watersheds have less than 10% missing precipitation values, increasing to 29 watersheds after applying gap-filling methods (Figure S1). The sparsest raw precipitation data are from the Bonanza site, where 24% of the missing values were filled by regression. More information on precipitation gap-filling is available in the supplementary material.
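The missing-value percentages quoted above can be illustrated with a minimal sketch. It assumes a daily series stored as a dictionary keyed by date, in which a missing day is either absent, `None`, or NaN; the function name `missing_fraction` and the ten-day record are hypothetical and are not taken from the CHOSEN processing code.

```python
from datetime import date, timedelta
from typing import Dict, Optional


def missing_fraction(series: Dict[date, Optional[float]],
                     start: date, end: date) -> float:
    """Return the fraction of daily time steps in [start, end] without a valid value."""
    n_days = (end - start).days + 1
    if n_days <= 0:
        raise ValueError("end must not precede start")
    n_missing = 0
    for offset in range(n_days):
        value = series.get(start + timedelta(days=offset))
        # A day is missing if it is absent, None, or NaN (NaN != NaN).
        if value is None or value != value:
            n_missing += 1
    return n_missing / n_days


# Hypothetical ten-day precipitation record (mm/day) with two gaps.
start = date(2012, 1, 1)
precip = {start + timedelta(days=i): 0.0 for i in range(10)}
precip[date(2012, 1, 4)] = None   # explicit missing value
del precip[date(2012, 1, 7)]      # day absent from the record

frac = missing_fraction(precip, start, date(2012, 1, 10))
print(f"{frac:.0%} of daily values are missing")
```

Applied before and after gap-filling, a measure of this kind would show whether a record crosses a threshold such as the 10% used in this section.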

3.1.2 Soil moisture

Soil moisture is essential for investigating hydrologic connectivity and runoff processes, especially where vertical flow dominates (Bracken et al., 2013). Soil moisture measurements are available in 18 watersheds (Figure 5), usually at multiple stations and depths. Seventeen of these catchments have less than 10% missing soil moisture data after gap-filling (Figure S2). On average, the longest soil moisture records are from the HJ Andrews watershed, where multiple stations distributed across several sub-watersheds monitor soil moisture at different depths. As at HJ Andrews, other sites commonly measure soil moisture at multiple stations, which facilitates gap-filling through spatial regression.
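As a rough illustration of gap-filling through spatial regression, the sketch below fits an ordinary least-squares line between a target station and a single neighboring station over the days when both report, and then uses that line to estimate the target where it is missing. This is a simplified stand-in under stated assumptions, not the procedure used to build CHOSEN; the function names and the soil moisture values are hypothetical.

```python
from typing import List, Optional, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("neighbor values are constant; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fill_from_neighbor(target: List[Optional[float]],
                       neighbor: List[Optional[float]]) -> List[Optional[float]]:
    """Fill gaps in `target` using a regression on a co-located `neighbor` series."""
    # Use only time steps where both stations have observations.
    pairs = [(n, t) for t, n in zip(target, neighbor)
             if t is not None and n is not None]
    if len(pairs) < 2:
        raise ValueError("need at least two overlapping observations")
    slope, intercept = fit_line([p[0] for p in pairs], [p[1] for p in pairs])
    filled = []
    for t, n in zip(target, neighbor):
        if t is None and n is not None:
            filled.append(slope * n + intercept)  # regression estimate
        else:
            filled.append(t)  # keep observations; gaps stay if neighbor is missing too
    return filled


# Hypothetical volumetric soil moisture (m3/m3) at two nearby stations.
neighbor = [0.10, 0.20, 0.30, 0.40, 0.50]
target = [0.15, 0.25, None, 0.45, None]
filled = fill_from_neighbor(target, neighbor)
print([round(v, 3) for v in filled])
```

In practice, the choice of neighbor station and a check on the strength of the fitted relationship would matter; neither is handled here.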

3.1.3 Snow depth / SWE

At high latitudes and altitudes, snowmelt can play an important role in streamflow generation and nutrient export, and snow accumulation and melt may be particularly sensitive to climate change. Eight of the CHOSEN watersheds have snow depth data with less than 10% missing values after gap-filling (Figure S3). Sagehen watershed has the longest snow depth record (61 years), with 39 years of SWE data (Table S2).

3.1.4 Isotope data

Isotope tracers (e.g., 18O and deuterium) are important for estimating catchment transit time distributions, which, along with hydrologic response timescales, can be used to characterize the temporal dynamics of the water cycle. Although isotope measurements are less abundant than hydrometeorological data, six of the CHOSEN watersheds have publicly available isotope time series. Among those watersheds, Shale Hills has the longest isotope time series, consisting of 1103 days of isotope measurements between 2008-03-28 and 2011-12-31. Most of the sites have sub-weekly δ18O and δ2H measurements in precipitation and streamflow (Table S4).

3.2 Example data from Dry Creek watershed

This section presents example data from the Dry Creek Experimental Watershed (DCEW), located in the semi-arid southwestern region of Idaho, USA, 16 km northeast of the city of Boise. Raw data were downloaded from the Boise State University research page (https://www.boisestate.edu/drycreek/dry-creek-data/). Daily measurements of discharge, precipitation, soil moisture, snow depth, and six other hydrometeorological variables were collected starting in 1999 at multiple streamflow gauges, weather stations, and soil moisture sensors distributed in this area (Figure 6).
Figure 6. Dry Creek Experimental Watershed
(Source: https://www.boisestate.edu/drycreek/dry-creek-data/)
Over half of the variables at Dry Creek have less than 10% missing values at daily time steps. After applying gap-filling methods, all hydrometeorological variables except snow depth have less than 10% missing values (Figure 7). The sparsity of snow depth data is due to the ephemeral nature of the region’s snowpack.
Figure 7. Data filling methods applied to Dry Creek data
The intensively monitored data included in CHOSEN allow for detailed analyses of hydrometeorological variables at both seasonal and interannual timescales. Here, we briefly describe some of the gap-filled data from the 2011-2012 hydrological year at Dry Creek (Figure 8). The highest discharge values were recorded at Lower Gauge (LG), which is located at the downstream end of the watershed. The lowest discharge values are from two tributaries: Treeline (TL) and Bogus South Gauge (BSG). Streamflow responses track precipitation patterns, especially in January and April. Springtime snowmelt is reflected in both a decrease in snow depth and persistent high flows during March and April. The soil moisture time series vary greatly from station to station but generally reflect seasonal patterns of precipitation, snowmelt, and evaporative demand, with shorter-term fluctuations at shallower sensors showing the influence of individual precipitation events. This example highlights how CHOSEN data can be instrumental in understanding the responses of soil moisture and discharge to hydrometeorological drivers.
Figure 8. Cleaned Dry Creek daily data from 2011-10-01 to 2012-09-30
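Selecting a single hydrological year from a longer daily record, such as the 2011-10-01 to 2012-09-30 window shown for Dry Creek, can be sketched as follows. The sketch assumes that the hydrological year starts on 1 October, as in that window; the function names and the synthetic series are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, Tuple


def hydrological_year_bounds(start_year: int) -> Tuple[date, date]:
    """First and last day of the hydrological year beginning 1 October of start_year."""
    return date(start_year, 10, 1), date(start_year + 1, 9, 30)


def select_hydrological_year(series: Dict[date, float],
                             start_year: int) -> Dict[date, float]:
    """Return the daily values that fall inside the given hydrological year."""
    first, last = hydrological_year_bounds(start_year)
    return {d: series[d] for d in sorted(series) if first <= d <= last}


# Synthetic two-year daily record (values are placeholders, not observations).
day0 = date(2011, 1, 1)
n_days = (date(2012, 12, 31) - day0).days + 1
record = {day0 + timedelta(days=i): float(i) for i in range(n_days)}

wy = select_hydrological_year(record, 2011)
print(min(wy), max(wy), len(wy))
```

Because this window includes 29 February 2012, the selected year contains 366 daily time steps rather than 365.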