4、Practical case study
4.1 Problem statement
To further illustrate the applicability of the subsampling ANOVA
methods, the proposed approaches are applied for parameter sensitivity
analyses in hydrological simulation through the conceptual model
GR4J,
as shown in Figure 4.
The
study area is Zengjiang River which is one tributary of Dongjiang River
located in the Pear River Delta, China as shown in Figure 5. The
meteorological data (daily evaporation and daily precipitation) are
collected from Qilinzui Hydrological Station (as shown in Figure 5)
during the period of 2009-2015 which were obtained from Hydrological
Data of Pearl River Basin, Annual Hydrology Report, P. R. China. The
total drainage area above the Qilinzui Hydrological Station is 2866
km2, accounting for 91% of the Zengjiang River basin
(3160 km2). The mean annual temperature and
precipitation are 21.6°C and 2188 mm, respectively. More details about
Zengjiang River basin can be found in (Tao
et al., 2011).
GR4J model is a rainfall-runoff model which is based on four free
parameters from daily rainfall data.
In
GR4J, the production components include an interception of raw rainfall
and potential evapotranspiration by an interception reservoir of null
capacity, a soil moisture accounting procedure to calculate effective
rainfall and a water exchange term to model water losses to or gains
from deep aquifers. Its routing module includes two flow components with
constant volumetric split (10–90%), two unit hydrographs, and a
non-linear routing store (as shown in Figure 4). The descriptions and
initial fluctuating ranges of GR4J model parameters are presented in
Table 1. For more details of
GR4J
model, please refer to the literature
(Perrin et al., 2003). The initial
fluctuating ranges of GR4J model parameters are wide considering the
structure varies in different basins. However, for a specific watershed,
the appropriate parameter range should be obtained through the
calibration process that produce an acceptable level of model
performance (Freer et al., 1996,
Pianosi et al., 2016,
Shin et al., 2013). It’s reported that
the parameters sensitivities were strongly influenced by the range of
parameter values used (Shin et al., 2013,
Wang et al., 2013), therefore it is
important to obtain an appropriate parameter range corresponding to
considered model performance before SA
(Shin et al., 2013,
Saltelli et al., 2019). Therefore, in
this study, the model calibration based on the MH algorithm is used
prior to SA in order to constraint the input variability space. The
details about MH algorithm are presented in supporting materials.
Nash–Sutcliffe model efficiency (NSE) is used to assess the predictive
power of model results which involves standardization of the residual
variance. Here, the objective functions adopted can be represented as
follows (Nash and Sutcliffe, 1970,
Legates and Jr, 1999):
where \(Q_{\text{sim}}\ \)is the simulated runoff, \(Q_{\text{obs}}\) is
the observed runoff, \(\overset{\overline{}}{Q_{\text{obs}}\ }\) is the
mean value of the observed runoff and \(n\) is the sample size.
For the MH algorithm parameterization, the markov chain with 10,000
iterations
for each parameter are examined and the model performance of each
iteration are presented in Figure 6. The NSE is greater than 0.74 and
remained stable after a number of iterations. In this study, the first
50% of the samples in markov chain are ruled out as a warm-up period.
The last 50% of the samples passed the Heidelberger and Welch
Convergence Diagnostics (Heidelberger and
Welch, 1983). The posterior distributions are obtained from the last
5000 parameter sets and the posterior PDFs of each parameter are
presented in Figure 7. All parameters ranges appear in a relatively
small interval which is different from their initial value. The
posterior PDFs of four parameters are approximately normal distribution
characteristics, indicating that the parameters in GR4J are well
identified after a number of iterations even with a wide range of prior
densities. The predictive intervals of stream obtained by the last 5000
parameter sets are presented in Figure 8. It can be observed that the
obtained predictive intervals can generally match the observations,
except for some overestimations in high-flow periods.
With the appropriate parameter range, the next work is to evaluate how
much each parameter contributes to the stream uncertainty. Sensitivities
of parameters in a rainfall–runoff model structure are specific to a
site, and cannot be assumed from previous work in other catchments
(Van Griensven et al., 2006,
Shin et al., 2013). In this study, the
proposed subsampling ANOVA methods are applied for analyzing parameters
sensitivities of GR4J model in Zengjiang River basin. Based on the
calibration of GR4J, the ranges of the four parameters are determined as
presented in Table 2. Similar to Section 3.1, different subsampling
ANOVA approaches,
including
single-subsampling ANOVA (5222, 2522, 2252 and 2225),
multiple-subsampling ANOVA (5522, 5252, 5225, 2552, 2525, 2255, 5552,
5525, 5255 and 2555), and full-subsampling ANOVA with different
parameters level (2222, 3333, 4444 and 5555) are defined and applied
here.
4.2 Influence of subsampled parameter
With only one parameter subsampling, the contributions of individual and
interactive effects for the four parameters in GR4J model are shown in
Figure 9. There are several findings as follows. Firstly, taking sobol’s
results as the reference results, X1 makes the largest
independent contribution to GR4J model uncertainty in Zenjiang River and
followed by the interactive effects of the four parameters.
X1(the first parameter) represents the maximum capacity
of the production store. The high sensitivity of parameter
X1 indicates that runoff generation in Zengjiang basin
is highly affected by the
maximum
capacity of the production store. The maximum capacity of the production
store (X1) increases to handle with an overestimation of
rainfall and decreases to handle with underestimation, thus adapting its
capacity to hold and evaporate different amounts of water
(Oudin et al., 2006). Secondly,
the
parameters sensitivities obtained change significantly with different
subsampling scheme. For example, the contributions of X1
are 0.109, 0.230, 0.275 and 0.205 for the four single-subsampling
schemes (where X1, X2,
X3, and X4 are subsampled
separately).
The lowest sensitivity value for X1 obtained in 5222,
which X1 decomposed into five levels and take
subsampling. The other three parameters have the same basic behaviors as
the X1. Therefore,
similar with section 3.2, the subsampling procedure also lead to a lower
sensitivity value for one parameter which is subsampled in. Thirdly, the
ranking of parameter sensitivity is influenced by different
single-subsampling schemes. In order to check if hierarchy is kept by
the methods more clearly, the bar plots which is close to Figure 9,but
grouped by simulations and not by parameter are presented in
supplementary materials Figure S4-S6. For instance, the sensitivity
order is Interactions >X3 >
X4 > X1 >
X2 in 5222 scheme, while in 2252 scheme, the sensitivity
order is Interactions >X1 >
X3 > X4 >
X2. Such results indicate that the single-subsampling
ANOVA approach generate unreliable sensitivity values, which are highly
influenced by the parameter to be subsampled. In multiple-subsampling
ANOVA, more than one parameters in GR4J model are subsampled. The
contributions of individual and interactive effects for GR4J model
parameters under different multiple-subsampling schemes are presented in
Figure 10.
It can be found that, for each parameters, the red bar values are
significantly lower than the blue bar
values.
The mean values of the red bars for
X1, X2, X3 and
X4 are 0.184, 0.033, 0.124 and 0.078, respectively.
Meanwhile the
mean
values for the blue bars for X1, X2,
X3 and X4 are 0.306, 0.098, 0.264 and
0.225, respectively. For each parameter, the mean value without
subsampling (blue bars) is more than twice as much as the mean value
with subsampling (red bars). This indicates that, similar with the
single-subsampling scheme, the multiple-subsampling ANOVA approaches
also generate unreliable results, and the subsampling-procedure would
significantly reduce the resulting individual sensitivity value. In
other words, the difference of sampling densities among parameters has
great influence on quantification of parametric sensitivities in
hydrologic modeling.
4.3 Influence of subsampled parameter level in
full-subsampling
In the full-subsampling ANOVA approach, all the four parameters in GR4J
model are subsampled. However, different levels for each parameter can
be chosen before the subsampling procedure. Similar with the analytic
case in section 3, four levels (2 to 5) are going to be chosen for each
parameter in GR4J. The contributions of individual and interactions for
GR4J model parameters under different parameter levels in
full-subsampling ANOVA are presented in Figure 11. As the parameter
levels increase from 2222 to 5555, the sensitivity of
X1, X2 and X4 gradually
increase from 20.1%, 3.7% and 4.7% to 31.0%, 7.6% and 15.8%,
respectively. At the same time, the contribution of X3
and the interaction gradually decrease from 21.7% to 17.8% and 48.9%
to 25.9%. The results indicate that the parameters levels will affect
the individual and interactive sensitivities in the full-subsampling
ANOVA approach. In details, the sensitivity of the most sensitive
parameter and interaction generally decrease, while that of the other
parameters increase with the parameter level increased. The obtained
results would vary most significantly when the parameter level increases
from 2 to 3. At the same time, the variation of the obtained results is
relatively small and the order of parameters sensitivity would not
change when the parameter levels are higher than three.
4.4 Compared with sobol’s
The sensitivity values for the four parameters in GR4J model from the
subsampling ANOVA (i.e. single-subsampling, multiple-subsampling and
full-subsampling) and sobol’s are discussed above. To compare the
subsampling ANOVA and sobol’s methods further, the deviations for
parameter sensitivity values and the calculation cost are presented in
Figure 12 and Table 3. There are significant differences between the
deviations obtained with different
subsampling ANOVA methods. The
large deviations (> 0.08) are obtained with 2225, 2255 and
2525, meanwhile small deviations (< 0.01) are obtained with
3333, 4444, and 5555. Except for 2222, other full-subsampling schemes
perform very well in the sensitivity analysis of GR4J model parameters.
Therefore, in order to get reliable parameter sensitivity results, the
three or more parameter levels in the full-subsampling ANOVA approach
are recommended.
In this GR4J model study, 3 million random samples are applied in
sobol’s in order to get a stable result of the parameters sensitivities.
So totally, the GR4J model need to run 3000000*(5+2) times, which is a
very large computational requirement. However, the subsampling ANOVA
methods can significantly reduce the calculation requirements while
achieve similar calculation accuracy in GR4J model. For example, in
full-sampling ”4444” where the deviation is only 0.0006, the model only
need to run 256 times (256=4*4*4*4). Similar with section 3.4, for the
four parameters conceptual model, 1296 sets variance results can be
obtained through subsampling process where 1296= and . The final
sensitivity results obtained by averaging and homogenizing the 1296 sets
of variance. Through reducing the number of model runs, the subsampling
ANOVA methods are effective and feasible sensitivity analysis methods
with relatively low computational requirements.