Results and Discussion

3.1 Laboratory evaluation of gully suspended sediment monitoring methods

The laboratory evaluation of the various monitoring methods (flow-proportional discrete manual sampling, simulated RS sampler, PASS sampler, autosampler, and turbidity logger) demonstrated the capabilities and limitations of the methods to provide representative measurements of suspended sediment concentration and particle size distribution (Table 2, Figure 3, SI-4), as discussed in the relevant sections below. The scientific literature considers discrete manual sampling that is isokinetic and depth- and width-integrated to be the most representative field sample collection method (Horowitz et al., 2008; Perks, 2014; Ward et al., 1990). For this reason, we argue that sampler performance under laboratory conditions should be assessed by comparison to the discrete manual sampling results. The flow-proportional discrete manual samples collected during the laboratory evaluation are comparable to what would be collected using isokinetic manual sampling techniques in the field (Ward et al., 1990).
3.1.1 Autosampler
The time-weighted average suspended sediment concentration of the samples collected using the autosampler underestimated the manual discrete sample time-weighted average suspended sediment concentration by 38% and was also lower than that of the other tested methods (Table 2). The coarser sediment fraction (100-2000 µm) was also underrepresented in the samples collected by the autosampler (Figure 3, Table 2). This is due to increased head pressure and slower sampling velocity as a result of the elevation difference between the autosampler and its sample intake. Thus, heavier particles (i.e., sand) were underrepresented in the samples collected with the autosampler (Bent et al., 2003; Clark et al., 2009; Fowler et al., 2009). These samples also had different suspended sediment concentrations and particle size distributions to comparable samples collected by the other methods (Figure 3). The finer fraction of sediment (the 10th percentile (d10) of the particle size distribution) in the samples collected using the autosampler appears similar to the discrete sample sediment d10; however, the two datasets were significantly different (Table 2). Additionally, the median sediment particle size (d50) and 90th percentile (d90) of samples collected using the autosampler were generally close to half or less of those of the sediments collected by the other methods (Table 2). These data indicate that unless an autosampler can be configured so that the level of its intake is close to that of the sampling unit, larger suspended sediment particles (>100 µm) will likely be underrepresented in the collected samples, and the suspended sediment concentration will therefore be underestimated. This limitation suggests that suspended sediment data collected using an autosampler from a gully with high channel banks should be corrected using comparable data from a more representative method (e.g., manual sampling).
3.1.2 Rising stage sampler
The time-weighted average suspended sediment concentration derived from RS sampler data was biased to a higher sediment concentration (32%) compared to the time-weighted average suspended sediment concentration of the manually collected samples (Table 2). This bias was expected as samples were not collected after the simulated peak stage (i.e., 75 mins). The particle size distribution was not significantly different to the discrete manual sample data, as previously discussed in the methods, and it was also similar to the PASS sample data (Table 2, Figure 3).
The RS sampler provides representative individual sample data; however, the often rapid sampling rate, caused by the fast rising stage typical of gullies, and the lack of falling stage data will likely result in an overestimation of suspended sediment concentration and a potentially unrepresentative PSD for a flow event (García‐Comendador et al., 2017; Shellberg et al., 2013). We note that this laboratory simulation represents only one type of hydrograph that may occur in gully systems, so the suitability of the RS sampling approach should be considered on a case-by-case basis using available data on the relationship between suspended sediment concentration and flow at a particular field site.
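The effect of missing falling stage data on a time-weighted average can be illustrated with a minimal Python sketch. The concentrations and sampling times below are hypothetical values chosen for illustration only (they are not the laboratory data), and the trapezoidal integration is an assumed way of computing the time-weighted average rather than the procedure used in this study.

```python
import numpy as np

def time_weighted_average(times_min, ssc_mg_l):
    """Time-weighted average concentration via trapezoidal integration."""
    t = np.asarray(times_min, dtype=float)
    c = np.asarray(ssc_mg_l, dtype=float)
    # Integrate concentration over time, then divide by the elapsed time.
    area = np.sum(0.5 * (c[1:] + c[:-1]) * np.diff(t))
    return area / (t[-1] - t[0])

# HYPOTHETICAL event: concentration peaks with stage at 75 min, then declines.
times = [0, 25, 50, 75, 100, 125, 150]              # minutes
ssc = [2000, 6000, 9000, 10000, 6000, 3000, 1000]   # mg/L

full_event = time_weighted_average(times, ssc)
# A rising stage sampler stops collecting at peak stage (first four samples here).
rising_only = time_weighted_average(times[:4], ssc[:4])
bias_pct = 100.0 * (rising_only - full_event) / full_event

print(f"full event: {full_event:.0f} mg/L, rising limb only: {rising_only:.0f} mg/L")
print(f"bias from omitting the falling limb: {bias_pct:+.1f}%")
```

With these made-up numbers the rising-limb average exceeds the full-event average, which is the direction of bias described above; the magnitude depends entirely on the shape of the hydrograph and sedigraph at a given site.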
3.1.3 PASS sampler
The time-weighted average suspended sediment concentration of the samples collected using both discrete and PASS sampling methods differed by only 9% ± 5% (Table 2). The suspended sediment concentration of the sample water expelled (i.e., water not retained) by the PASS sampler was 150 mg/L, which is equivalent to the sampler retaining 98.5 ± 1% of the total sediment sampled. The modifications made to the PASS sampler, therefore, have not hindered its ability to collect a representative sample of time-weighted average suspended sediment concentration and particle size distribution.
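The retention figure can be related to the expelled-water concentration with a simple mass balance. The sketch below assumes that the volume of water entering the sampler equals the volume expelled, and it uses a hypothetical inflow concentration of 10,000 mg/L, which is not stated in this section and was chosen only because it reproduces the reported 98.5% retention.

```python
def retention_percent(inflow_ssc_mg_l, expelled_ssc_mg_l):
    """Percentage of sampled sediment mass retained.

    Assumes the water volume entering the sampler equals the volume expelled,
    so the mass ratio reduces to a concentration ratio.
    """
    return 100.0 * (1.0 - expelled_ssc_mg_l / inflow_ssc_mg_l)

expelled = 150.0     # mg/L, reported for the expelled sample water
inflow = 10_000.0    # mg/L, ASSUMED for illustration; not given in the text

retained = retention_percent(inflow, expelled)
print(f"retained: {retained:.1f}% of the sampled sediment")
```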
The particle size distribution statistics (i.e., d10, d50, and d90) of the suspended sediment collected using the PASS and discrete sampling methods reveal generally good agreement between the two methods (Table 2). The distributions of fine particles (< 10 µm) were almost identical, whereas the distributions of larger (heavier) particles differed increasingly with particle size (Figure 3). This difference is likely due to the heterogeneity of sand particles in suspension within the agitation vessel during the test. The continuous collection of sediment by the PASS sampler should incorporate this heterogeneity into the final measurement more accurately than discrete sampling, which likely explains the difference in the coarser sediment particle size fractions collected by the PASS and discrete sampling methods (Figure 3).
Overall, our data suggest that the PASS sampler is capable of collecting a time-integrated sediment sample that is comparable in suspended sediment concentration and particle size distribution to that collected by isokinetic manual sampling approaches, under controlled laboratory conditions.
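For readers unfamiliar with the percentile statistics used above, the following Python sketch shows one common way of deriving d10, d50, and d90 from a cumulative particle size distribution. The size classes and cumulative percentages are hypothetical, and the log-linear interpolation is an assumption; particle size analysers typically report these statistics directly, and this is not necessarily how the values in Table 2 were obtained.

```python
import numpy as np

def psd_percentile(sizes_um, cumulative_pct_finer, percentile):
    """Particle size (µm) below which `percentile` % of the sample is finer."""
    # Interpolate on log10(size) because size classes are usually log-spaced.
    log_size = np.interp(percentile, cumulative_pct_finer, np.log10(sizes_um))
    return 10.0 ** log_size

# HYPOTHETICAL cumulative distribution (upper bound of each size class).
sizes = np.array([1, 4, 16, 63, 250, 1000, 2000], dtype=float)   # µm
cum_finer = np.array([5, 20, 50, 75, 90, 98, 100], dtype=float)  # % finer

d10, d50, d90 = (psd_percentile(sizes, cum_finer, p) for p in (10, 50, 90))
print(f"d10 = {d10:.2f} µm, d50 = {d50:.1f} µm, d90 = {d90:.0f} µm")
```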
3.1.4 Turbidity logger
Turbidity measurements and discrete sample suspended sediment concentrations had a strong linear relationship (R² = 0.97), indicating that a predictive relationship between turbidity and suspended sediment could be used to estimate SSC from turbidity data (SI-5). Suspended sediment concentrations of the simulated RS sampler samples also had a strong linear relationship with turbidity (R² = 0.94); however, this was based on only three paired measurements, which is not sufficient to derive a predictive relationship between turbidity and suspended sediment concentration (Rasmussen et al., 2009). Suspended sediment concentrations of the samples collected with the autosampler showed a more variable relationship with turbidity measurements (R² = 0.87) (SI-5). The time-weighted average suspended sediment concentration derived from turbidity data corrected with manually collected discrete samples compared well to the PASS (within 11%) and RS samples (within 26%) (SI-4). These results suggest the turbidity logger may be a good surrogate for the other monitoring methods, provided a significant relationship between suspended sediment concentration and turbidity can be obtained under field conditions.
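A turbidity-based estimate of this kind can be sketched as an ordinary least-squares fit of paired measurements. The paired values below are hypothetical and the function name is illustrative; the sketch shows only the general form of a linear calibration and its coefficient of determination, not the regression reported in SI-5.

```python
import numpy as np

def fit_turbidity_ssc(turbidity_ntu, ssc_mg_l):
    """Least-squares line SSC = slope * turbidity + intercept, with R squared."""
    x = np.asarray(turbidity_ntu, dtype=float)
    y = np.asarray(ssc_mg_l, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    predicted = slope * x + intercept
    ss_res = np.sum((y - predicted) ** 2)    # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)     # total sum of squares
    return slope, intercept, 1.0 - ss_res / ss_tot

# HYPOTHETICAL paired measurements (turbidity logger vs. discrete samples).
turbidity = [200, 600, 1000, 1400, 1800]   # NTU
ssc = [1100, 2900, 5200, 6800, 9100]       # mg/L

slope, intercept, r2 = fit_turbidity_ssc(turbidity, ssc)
estimated_ssc = slope * 1200 + intercept   # SSC estimate for a 1200 NTU reading
print(f"SSC = {slope:.3f} * NTU + {intercept:.1f}, R^2 = {r2:.3f}")
```

A fit based on only a few pairs, such as the three RS sampler measurements mentioned above, can return a high R² without supporting a reliable predictive relationship.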

3.2 Field evaluation of gully monitoring methods

The two gullies at the field site were investigated over two wet seasons (2017/2018 and 2018/2019). During this time several flow events of different intensities were monitored (SI-6). Due to the remote locations of gullies used in this study, samples were often only able to be retrieved after multiple flow events had occurred, rather than after individual flow events. As such, there were only a limited number of single flow events that could be used to directly compare the performance of the various monitoring methods.
3.2.1 Autosampler
The autosampler collected samples in gully-2 with suspended sediment concentrations and particle size distributions that were similar to the other methods. The lack of suspended sand in gully-2 (commonly less than 2% by sample volume) meant that samples were representative despite the sampling unit being elevated (>1.5 m) relative to the intake (Table 3 and Figure 4). In contrast, samples collected using the autosampler from gully-1 had similar characteristics to those observed in the laboratory test, where suspended sediment concentration and particle size distribution were different to the PASS and RS samples when a relatively large amount of suspended sand was present (>20%) (Table 3). For example, during a short and intense flow event during the Jan-18 to Feb-18 sampling period in gully-1, samples collected by the autosampler underestimated the time-weighted average suspended sediment concentration by ~30% compared to the PASS sampler (Table 3, SI-6). Conversely, flow events that had relatively lower proportions of suspended sand (<10%) compared well to PASS sampler and RS sampler estimates. These differences in sample suspended sediment concentration and particle size distribution are consistent with observations from the laboratory test where the autosampler was unable to collect representative samples of the coarser sediment fractions due to the vertical displacement between the sampler position and its inlet. Additionally, the autosampler had several operational issues (e.g., insect infestation, sample intake blockages, and programming malfunctions) that limited the number of samples it collected in these specific field settings.
3.2.2 Rising stage sampler
The remote location of the study site meant the RS sampler arrays (i.e., six samplers) were only collected three times during the study period. This highlights the challenge of gaining sufficient samples for more than a small number of flow events from a gully using this method compared to the autosampler and PASS sampler, which can sample multiple flow events per deployment.
Based on the results of the laboratory evaluation, samples collected using the RS sampler were expected to be more representative of actual suspended sediment concentrations than samples collected by the autosampler. This was valid for most samples; however, under the field conditions prevailing at the study site, some of the RS samplers were observed to accumulate large quantities of water (25-35% of the 1 L sampler volume) due to condensation. This phenomenon was unpredictable and resulted in suspended sediment samples being diluted by unknown amounts of water, thus potentially introducing significant error to the calculated SSC. Condensation in RS samplers has been noted in previous studies (Edwards et al., 1999); however, these comparatively large accumulations of condensate are likely caused by the high ambient daytime air temperatures and relative humidity, followed by cooler night-time temperatures (a change of ~18°C), at the study site. This is likely to be an issue at many sites located in tropical regions and should be considered when designing monitoring programs in such places.
Unfortunately, upon return to a remote site following a flow event, there is no way of knowing which, if any, of the individual samples collected by the RS samplers were affected by condensation, or to what degree. Considering this, it is best to interpret the RS sample suspended sediment concentration data with approximately 25-30% uncertainty. The RS samples had suspended sediment concentrations and particle size distributions in the range of the autosampler and PASS sampler samples (Table 3, Figure 4), although it is possible that some of the suspended sediment concentrations could be outside of that range if condensation is considered. RS samples demonstrated the variability in particle size distribution under different water depth conditions well. For example, during a flow event in gully-1, the particle size distribution shifted between being dominated by finer and coarser particles as the water level increased (e.g., sample d50 and d90 ranged from 6.24 to 11.8 µm and from 59.9 to 116 µm, respectively) (SI-7). This ability to obtain information on suspended sediment particle size dynamics is a strength of the RS sampler approach.
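The size of the dilution error can be sketched with a simple volume balance. The example assumes a full sampler bottle in which condensate makes up a known fraction of the final volume and the sediment mass is unchanged; the measured concentration is a hypothetical value, and in practice the condensate fraction of an individual sample is unknown, which is why an uncertainty range is applied instead of a correction.

```python
def undiluted_ssc(measured_ssc_mg_l, condensate_fraction):
    """SSC before dilution, assuming condensate is a known fraction of the
    final sample volume and the sediment mass is unchanged."""
    if not 0.0 <= condensate_fraction < 1.0:
        raise ValueError("condensate_fraction must be in [0, 1)")
    return measured_ssc_mg_l / (1.0 - condensate_fraction)

measured = 3000.0   # mg/L, HYPOTHETICAL measured concentration

for fraction in (0.25, 0.35):   # condensate range observed in the field
    original = undiluted_ssc(measured, fraction)
    underestimate = 100.0 * (original - measured) / original
    print(f"condensate {fraction:.0%}: undiluted SSC {original:.0f} mg/L, "
          f"measured value is {underestimate:.0f}% too low")
```

Under these assumptions the measured concentration is low by the same percentage as the condensate fraction, which is consistent in magnitude with the uncertainty suggested above.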
Overall, suspended sediment concentration (provided the sampler is not compromised by condensation) and sediment particle size data of the RS samples compared well with the PASS sampler in both gully types. The development of a falling stage sampler has been recently reported, although no assessment of its limitations or capabilities has been done to date (DPI, 2017). Such a sampler could address a major limitation of using RS samplers for monitoring sediment transport processes in gullies.
3.2.3 PASS sampler
The particle size distributions of the samples collected from gully-2 by the autosampler, RS sampler, and PASS sampler were all very similar for all flow events (Table 3 and Figure 4). The average particle size distributions of the samples collected by the autosampler and PASS sampler were often within the uncertainty of their respective particle size distribution statistics (d10, d50, d90) (Table 3). These data confirm the observations of the laboratory test in that the PASS sampler collects a sample comparable to the other methods for both time-weighted average suspended sediment concentration and particle size distribution of fine suspended sediment (< 63 µm).
The PASS sampler, RS sampler, and autosampler data did not agree as well for samples collected from gully-1, where the higher percentage of suspended sand present during flows resulted in more variable suspended sediment concentrations and particle size distributions (Table 3, Figure 4). Despite this, the range of time-weighted average suspended sediment concentrations of PASS samples compared relatively well with the other methods for flow events with less suspended sand (e.g., flow events sampled between November 2017 and January 2018) (Table 3). The coarser sediment particle sizes (i.e., the d90) measured for the PASS samples were typically more than double those measured for the RS and autosampler samples, which indicates that the latter methods likely underrepresented the coarser suspended sediment fraction in gully-1. The time-weighted average design of the PASS sampler means it cannot provide information on suspended sediment dynamics during a flow event. However, the PASS sampler is well suited for investigating long-term trends in suspended sediment concentration and particle size distribution (e.g., over several wet seasons), and for assessing the effectiveness of gully remediation works. Comparison of the laboratory and field data from the PASS sampler with those from the autosampler and RS sampler indicates that the PASS sampler provides the most representative time-integrated suspended sediment data of the three methods, because its data were most consistent with manually collected samples.
3.2.4 Turbidity logger
The turbidity logger can provide a high frequency of suspended sediment concentration measurements over extended time periods (e.g., months), provided there are sufficient comparable physical samples collected to ensure accurate calibration of the method (Rasmussen et al., 2009). There were some instances, at gully-2, where turbidity measurements could have been corrected to suspended sediment concentration measurements, using samples collected by the autosampler (R² > 0.83 (SI-8)). However, this characteristic was not reflected in the measurements collected from gully-1, where the relationship between the autosampler sample suspended sediment concentrations and the turbidity logger measurements was poor (R² = 0.17 (SI-8)).
The lack of a relationship between turbidity and SSC at gully-1 was likely due to the higher proportion of sand at this site. The turbidity measurement method is based on detecting light that is emitted by the instrument and scattered by particles back to the instrument detector. Rasmussen et al. (2009) found that the presence of fine to very coarse sand (125-2000 µm) can often negatively bias turbidity measurements because the larger particles do not reflect light in a manner that is consistent with that used to calibrate the instrument. This measurement characteristic often leads to an underestimation of the turbidity-suspended sediment concentration relationship (Bent et al., 2003; Clark et al., 2009; Fowler et al., 2009).
Without site-specific calibration, turbidity measurements are unlikely to be suitable for even semi-quantitative investigations of suspended sediment dynamics in gully systems. This is evidenced by the lack of significant difference between the turbidity measurements of the loggers located in the two studied gullies (SI-9), despite very different suspended sediment concentration ranges and PSDs (Table 3; Figure 4). For example, the mean turbidity of gully-1 (1250 (± 1173) NTU) and gully-2 (1501 (± 994) NTU), for the 2017/2018 wet season, were not significantly different, yet the SSCs measured by the other methods differed by ~4 to 7-fold between these gullies (Table 3). This emphasises the importance of collecting representative suspended sediment concentration samples in order to calibrate the turbidity measurement to a surrogate suspended sediment concentration. Turbidity measurements alone do not provide useful information and thus should only be relied upon as a complementary addition to other monitoring methods (e.g., RS or PASS samplers).
3.2.5 Comparison to manual sampling
The collection of manual samples from gullies is often difficult due to the remote location of the sites, safety concerns, and the unpredictability of flow events. However, samples were able to be collected from a single flow event in gully-1. Seven samples were manually collected during this event using a DH-48 sampler, and one time-integrated sample was collected over the same period by a PASS sampler deployed in the gully. There was little difference between the average particle size distributions (Table 4, Figure 5) or between the time-weighted average suspended sediment concentrations of the manually collected samples (6067 mg L⁻¹) and the PASS sample (6082 mg L⁻¹). While these data are preliminary, they further support the ability of the PASS sampler to collect representative samples of time-weighted average suspended sediment concentration and particle size distribution in challenging field settings.
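The size of the difference between the two reported concentrations can be checked with a short calculation. The function name is illustrative, and using the manual samples as the reference value is an assumption that follows the convention adopted for the laboratory comparison.

```python
def relative_difference_percent(test_value, reference_value):
    """Difference of a test value from a reference value, as a percentage."""
    return 100.0 * (test_value - reference_value) / reference_value

manual_twa = 6067.0   # mg/L, manually collected samples (reference)
pass_twa = 6082.0     # mg/L, PASS sample

difference = relative_difference_percent(pass_twa, manual_twa)
print(f"PASS vs. manual: {difference:+.2f}%")
```

The two values differ by roughly a quarter of one percent, which is small relative to the 9% difference observed between the same two methods in the laboratory evaluation.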