Analytical case
3.1. Problem statement
A simple model with three unknown parameters is employed to illustrate
the proposed subsampling ANOVA approaches. It is expressed as follows:
where X1, X2 and X3 are independent variables uniformly distributed
within [0, 1]. This simplified model was proposed by Chen et al.
(2019). Its purpose here is to explore how the sensitivity indices of
the model parameters change with different subsampling methods in
ANOVA-based sensitivity analysis. In our study, each approach is
labelled with a three-digit code, one digit per parameter. The digit
“5” means that five levels are first selected equidistantly within the
initial parameter range and are then subsampled (see Section 2.1),
which yields 10 ($C_5^2 = 10$) combinations of level pairs for
two-level ANOVA. Similarly, “2” means that only two levels (the maximum
and minimum values) of the parameter are selected from the range,
without subsampling. For example, “522” means that five levels of X1
are selected equidistantly from its range before subsampling, while
only two levels each of X2 and X3 are selected. In the same way, we
define 252, 225, 552, 525, 255, 222, 333, 444 and 555 for the other
ANOVA approaches. In 522, 252 and 225, only one of the three parameters
is subsampled; these represent single-subsampling ANOVA. In 552, 525
and 255, two of the three parameters are subsampled; these represent
multiple-subsampling ANOVA. Finally, 222, 333, 444 and 555 represent
full-subsampling ANOVA with different numbers of parameter levels.
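The three-digit codes can be read mechanically. The following is a
minimal sketch of that reading, assuming that equidistant levels include
both endpoints of the range; the function names (`level_values`,
`level_pairs`, `scheme_summary`) are illustrative and do not come from
the original study.

```python
from itertools import combinations
from math import comb, prod


def level_values(n_levels, lower=0.0, upper=1.0):
    """Equidistant levels on [lower, upper], endpoints included (assumption)."""
    step = (upper - lower) / (n_levels - 1)
    return [lower + i * step for i in range(n_levels)]


def level_pairs(n_levels):
    """All pairs of distinct levels of one parameter, as used in two-level ANOVA."""
    return list(combinations(level_values(n_levels), 2))


def scheme_summary(code):
    """For a code such as '522', return (model runs, number of 2*2*2 subsamples)."""
    levels = [int(ch) for ch in code]
    runs = prod(levels)                             # one run per grid point
    subsamples = prod(comb(n, 2) for n in levels)   # one level pair per parameter
    return runs, subsamples


if __name__ == "__main__":
    for code in ("522", "552", "222", "444", "555"):
        print(code, scheme_summary(code))
```

Under these assumptions a digit of 2 contributes a single level pair
(the minimum and maximum), so only digits above 2 multiply the number
of subsamples.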
3.2 Influence of subsampled parameter
Figure 1 presents the sensitivity indices of the three parameters and
their interactions under different subsampling ANOVA approaches.
Figure 1(a) shows single-subsampling ANOVA and Figure 1(b) shows
multiple-subsampling ANOVA. Firstly, the sensitivities differ between
parameters. In detail, the sensitivity ranges of X1, X2, X3 and the
interactions are 4.1%-41.2%, 25.1%-78.5%, 7.5%-47.3% and 7.0%-15%,
respectively. In most cases, X2 is the most sensitive parameter.
Secondly, each parameter’s individual sensitivity varies significantly
with the subsampling scheme. For single-subsampling ANOVA, the minimum
value (red bar) of X1’s sensitivity is obtained in 522, where only X1
is subsampled. Similarly, the minimum values (red bars) of X2’s and
X3’s sensitivities are obtained in 252 and 225, respectively. These
results indicate that the individual sensitivity of a parameter drops
sharply when it is the subsampled parameter in single-subsampling
ANOVA. For multiple-subsampling ANOVA in Figure 1(b), the maximum value
(blue bar) of X1’s sensitivity is obtained in 255, where X1 is the only
non-subsampled parameter. Similarly, the maximum values of X2’s and
X3’s sensitivities are obtained in 525 and 552, which indicates that in
multiple-subsampling ANOVA the individual sensitivity increases for the
non-subsampled parameter. Thirdly, the black bars in Figure 1 represent
the individual and interaction sensitivity indices obtained by Sobol’s
method. Compared with Sobol’s results, the subsampling process reduces
the subsampled parameter’s individual sensitivity and increases the
non-subsampled parameter’s individual sensitivity. Lastly, the
subsampling process changes not only the values of the parameter
sensitivities but also their ordering (as shown in supporting material
Figures S1-S3). For example, the order of sensitivity under the 522
scheme is X2 > X3 > interaction > X1, while the 252 scheme yields a
different order: X3 > X1 > X2 > interaction. This also indicates that
the results of both single- and multiple-subsampling schemes are
biased. Consequently, the full-subsampling ANOVA approach is employed
in the following part to diminish the deviation.
3.3 Influence of parameter levels
In the full-subsampling ANOVA approach, different numbers of levels can
be chosen for each parameter from its variation range. In this study,
four scenarios are tested, with each parameter having 2, 3, 4 or 5
levels (i.e. 222, 333, 444 and 555, respectively). Figure 2 shows the
influence of the number of parameter levels on the individual and
interaction sensitivities. The sensitivities of the three parameters
change with the number of levels. As the number of levels increases
from 222 to 555, the individual sensitivities of X1 and X3 gradually
increase from 11.7% and 19.4% to 19.1% and 24.1%, respectively. At the
same time, the interaction sensitivity gradually decreases from 18.1%
to 5.5%. The individual sensitivity of X2, which has the largest
contribution, remains relatively stable, ranging from 50.9% to 52.2%.
The results show that for the full-subsampling ANOVA method, the
individual and interaction sensitivities are affected by the number of
levels of the subsampled parameters. Increasing the number of levels
slightly increases the sensitivity of the less sensitive parameters and
decreases the interaction sensitivity. It is also worth noting that the
order of the parameter sensitivities changes when the number of levels
increases from 2 to 3. When 3 or more levels are chosen, the variation
in the results is relatively small and the order of the parameter
sensitivities remains consistent with that of Sobol’s method. Overall,
the full-subsampling ANOVA approach with 3 or more levels is suggested
to diminish the deviation.
3.4 Comparison with Sobol’s method
To evaluate the accuracy of the different subsampling ANOVA approaches,
Sobol’s method is used as a benchmark. It is widely used in
hydrological modelling (Zhang et al., 2013; Wang et al., 2018; Song et
al., 2015; Sobol’, 2010) as an effective approach to globally
characterize single-parameter and multiple-parameter interactive
sensitivities (Tang et al., 2007). In this study, the sensitivity
indices calculated by Sobol’s method are taken as base values, and the
deviation of each subsampling ANOVA approach is evaluated by comparing
the sensitivity indices calculated by that approach with those
calculated by Sobol’s method. All the sensitivity indices calculated by
subsampling ANOVA and Sobol’s method are available in the supporting
material, and the deviations between the subsampling ANOVA and Sobol’s
methods are presented in Figure 3.
The deviations between the results of subsampling ANOVA and Sobol’s
method vary from 0.0008 to 0.114 with different subsampling schemes and
numbers of parameter levels. A lower deviation indicates that the
calculated individual and interaction sensitivities are more accurate.
For the single-subsampling and multiple-subsampling ANOVA approaches,
the corresponding deviations range from 0.024 to 0.114. As expected,
significantly better performance (deviations ranging from 0.001 to
0.016) is obtained with the full-subsampling ANOVA method. Moreover,
the deviations are lower than 0.002 if 3 or more parameter levels are
chosen in the full-subsampling ANOVA. Such deviations indicate that the
sensitivity indices obtained through the single- and
multiple-subsampling ANOVA methods are biased. The negligible bias of
the full-subsampling ANOVA method shows that the parameter
sensitivities are very close to the “true value” when the number of
levels of the subsampled parameters is 3 or more. Therefore, to obtain
more reliable parameter sensitivity results, the full-subsampling
scheme with 3 or more parameter levels is necessary when applying
subsampling ANOVA methods.
Many studies point out that Sobol’s method is computationally expensive
(Tang et al., 2008; Tian, 2013; Reusser et al., 2011). To illustrate
the computational advantages of the subsampling ANOVA methods, the
number of model runs and the number of variance calculations required
by the subsampling ANOVA methods and by Sobol’s method are presented in
Table 1. In general, N*(M+2) model evaluations are required for Sobol’s
method, where N is the random sample size and M is the number of
parameters; for more details about Sobol’s method, please refer to
Sobol’ (1990) and Nossent et al. (2011). In this case study, different
sample sizes N were tested in Sobol’s method to obtain a stable
sensitivity analysis result. We found that the results remained
relatively stable when N was larger than 2000. For this simple
three-parameter model, the number of model runs is therefore
2000*(3+2) = 10000, which is a barely acceptable computing requirement.
The subsampling ANOVA methods can significantly reduce the
computational requirements while achieving an accuracy close to that of
Sobol’s method. For example, in the full-subsampling scheme “444”, the
model needs to run only 64 times (64 = 4*4*4), which yields 64 sets of
model responses. Through the resampling process, 216 sets
(216 = 6*6*6, where 6 = $C_4^2$) of 2*2*2 combinations can be obtained,
and each combination gives one set of variance results. Thus, 216 sets
of variance results are obtained. The final sensitivity results are
obtained by averaging and normalizing the 216 sets of variance results.
The number of model runs determines the computing requirements. By
reducing the number of model runs, the subsampling ANOVA methods are
effective and feasible sensitivity analysis methods with relatively low
computational requirements. Reducing the required number of model runs
is very important, especially for models with few parameters but high
computational demand.
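The run counts and the resampling step described above can be sketched
as follows. This is a hedged illustration rather than the procedure of
Section 2.1: `toy_model` is a hypothetical placeholder (not the model of
Chen et al., 2019), all interaction terms are lumped into one share
computed as the total sum of squares minus the three main effects, and
"averaging and normalizing" is assumed to mean accumulating the sums of
squares over all subsamples and rescaling them to sum to one.

```python
from itertools import combinations, product


def sobol_model_runs(n_samples, n_params):
    """Model evaluations quoted in the text for Sobol's method: N*(M+2)."""
    return n_samples * (n_params + 2)


def toy_model(x1, x2, x3):
    # Hypothetical placeholder response, NOT the model used in the study.
    return x1 + 2.0 * x2 + x3 + x1 * x3


def two_level_anova(y):
    """Sums of squares of a 2*2*2 design, where y[i][j][k] is the response.

    Returns [SS_1, SS_2, SS_3, SS_interactions]; the last entry lumps all
    interaction terms together (total minus main effects).
    """
    cells = list(product((0, 1), repeat=3))
    grand = sum(y[i][j][k] for i, j, k in cells) / 8.0
    total = sum((y[i][j][k] - grand) ** 2 for i, j, k in cells)
    main = []
    for axis in range(3):
        ss = 0.0
        for level in (0, 1):
            vals = [y[c[0]][c[1]][c[2]] for c in cells if c[axis] == level]
            ss += 4.0 * (sum(vals) / 4.0 - grand) ** 2  # 4 cells per level
        main.append(ss)
    return main + [total - sum(main)]


def full_subsampling_anova(model, n_levels):
    """Return (model runs, number of subsamples, normalized shares)."""
    levels = [i / (n_levels - 1) for i in range(n_levels)]
    # The model is run once per grid point; subsampling reuses these runs.
    runs = {p: model(*p) for p in product(levels, repeat=3)}
    pairs = list(combinations(levels, 2))
    totals = [0.0, 0.0, 0.0, 0.0]
    n_sub = 0
    for pa, pb, pc in product(pairs, repeat=3):
        y = [[[runs[(a, b, c)] for c in pc] for b in pb] for a in pa]
        totals = [t + s for t, s in zip(totals, two_level_anova(y))]
        n_sub += 1
    norm = sum(totals)
    return len(runs), n_sub, [t / norm for t in totals]


if __name__ == "__main__":
    print("Sobol runs:", sobol_model_runs(2000, 3))
    n_runs, n_sub, shares = full_subsampling_anova(toy_model, 4)
    print("444 runs:", n_runs, "subsamples:", n_sub)
    print("shares (X1, X2, X3, interactions):", shares)
```

The point of the sketch is that the 216 variance decompositions are
computed from the 64 stored responses, so the extra subsamples cost
arithmetic only and no additional model runs.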