Fig. 7. Observed (black) and simulated CMIP5 and CMIP6 SST anomalies
(relative to 1901-1950) for the North Atlantic (NA, left column), the
Global Tropics (GT, middle column), and the North Atlantic Relative
Index (NARI, right column) when forced with ALL (blue, top row), AA
(magenta, second row), NAT (brown/red, third row), and GHG (green,
bottom row). The CMIP6 MMMs are presented with solid curves while the
CMIP5 MMMs are presented with dotted curves. Both are surrounded by
shaded areas marking the bootstrapped confidence interval. Panels (a)
and (c) additionally display a 20-year running mean of the sum of
simulated NA and NARI over the individual forcing simulations for CMIP6
(burgundy dashed curve) with associated bootstrapping confidence
interval (burgundy shaded area). Including NA in the sum makes little
difference. For NA and GT under AA and NAT (middle two rows and left two
columns), the orange curve displays detrended observations, calculated
by subtracting simulated GHG-forced SST (bottom row) from observations
in that ocean basin. The yellow shaded area is the confidence interval
when bootstrapping the MMM of CMIP6 piC simulations, and represents the
magnitude of noise in the CMIP6 MMMs. A horizontal black dashed line
marks zero anomaly, which represents the average SST from 1901-1950. The
y-axis labels show the number of institutions used for each subset of
forcing agents in CMIP6 (N, see Table S2), and the subplot titles
display the correlation (r) and sRMSE between the MMM and observations
for CMIP6.
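As an illustration of how curves and shading of this kind can be built, the sketch below computes anomalies relative to the 1901-1950 baseline, a multi-model mean (MMM), and a percentile bootstrap interval around it. It is a minimal sketch and not the procedure used to produce Fig. 7: the function names, the choice to resample ensemble members with replacement, the 1000 resamples, and the 95% level are all assumptions made for the example.

```python
import random
from statistics import mean

def anomalies(years, sst, base=(1901, 1950)):
    """Subtract the mean over the (inclusive) base period from every value."""
    ref = mean(v for y, v in zip(years, sst) if base[0] <= y <= base[1])
    return [v - ref for v in sst]

def multi_model_mean(series):
    """Average equal-length time series year by year."""
    return [mean(vals) for vals in zip(*series)]

def bootstrap_interval(series, n_boot=1000, alpha=0.05, seed=0):
    """Percentile interval of the MMM, resampling members with replacement.

    Assumption: members are treated as exchangeable; the resampling unit
    used in the paper (e.g. institutions) may differ.
    """
    rng = random.Random(seed)
    n = len(series)
    boots = [
        multi_model_mean([series[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    ]
    lo_i = int((alpha / 2) * (n_boot - 1))
    hi_i = int((1 - alpha / 2) * (n_boot - 1))
    lower, upper = [], []
    for t in range(len(series[0])):
        column = sorted(b[t] for b in boots)  # bootstrap MMMs at time t
        lower.append(column[lo_i])
        upper.append(column[hi_i])
    return lower, upper
```

In this sketch each member series would first be passed through `anomalies`, and the list of anomaly series would then be passed to `multi_model_mean` and `bootstrap_interval`.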
Observed NARI (panel c, black) shows strong multi-decadal variability
throughout the century. In the ALL simulations (top row, blue), the
temporal evolution of NARI (c) matches the observations with some skill
(r = 0.40, sRMSE = 0.92 for CMIP6), but fails to capture the full
magnitude of observed cooling in the 1970s and 1980s or, more prominently,
any multi-decadal variability prior to 1960. Moreover, its GT and NA
components reproduce neither the observed, roughly linear warming trend
in GT (b) nor the marked multi-decadal variability in NA (a) very well.
In both the CMIP5 and CMIP6 ALL simulations, GT (b, blue) is anomalously
cold relative to observations between 1960 and 2000, when simulated AA
cooling (e, magenta) is strongest and not yet compensated by GHG warming
(k, green). This leads us to question whether the match between
simulated and observed NARI in this period arises from
compensating errors. For NA, the match between observations and the
ALL-forced response is better in the later part of the record, but worse
in the first half. During the period prior to 1960, according to both
CMIP ensembles, GHG warming (j, green) masks AA cooling (d, magenta) to
produce a roughly constant temperature in the ALL simulations (a, blue).
The simulated cold episode in 1964 is due to the eruption of Agung in
1963 (g, brown and red), and it is only after the mid-1960s that
increased GHG warming overtakes stagnating AA cooling to produce
pronounced warming in fairly good accord with observations. Much of the
observed variability in NA (a, black) thus does not seem to be a
response to external radiative forcing.
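The two skill scores quoted for each panel can be sketched as follows. This is a hedged example rather than the paper's implementation: it assumes that sRMSE denotes the root-mean-square error divided by the standard deviation of the observations, a normalization this section does not define, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev

def pearson_r(x, y):
    """Pearson correlation between two equal-length series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def srmse(sim, obs):
    """RMSE of sim against obs, scaled by the standard deviation of obs.

    Assumption: this scaling is one plausible reading of "sRMSE".
    """
    rmse = sqrt(mean((s - o) ** 2 for s, o in zip(sim, obs)))
    return rmse / pstdev(obs)
```

Under this assumed normalization, a simulation that is constant at the observed mean scores exactly 1, so values near or above 1 indicate that the simulated curve adds little beyond a flat line.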
The AA forcing had appeared to explain observed low-frequency Sahel
precipitation variability in H20, but we now see that it might be the
right result for the wrong reason. AA (second row, magenta) produce
low-frequency NARI variability (f), but this simulated NARI is a poor
match to observations (f, r = 0.10, sRMSE = 1.04 for CMIP5; r = 0.07,
sRMSE = 1.09 for CMIP6; a performance statistically worse than noise). The
difference between simulations and observations is even more stark in
NARI’s constituent ocean basins. We can attempt to compare AA-forced NA
and GT to an observed “GHG-residual” (that is, the observation minus
the GHG-forced MMM, presented in orange instead of black), which
represents our best estimate of the sum of observed oceanic IV and the
observed responses to aerosols. This index shows marked, roughly
stationary low-frequency variability in NA (d, orange), which contrasts
with the more monotonic behavior of the simulated NA index (magenta). In
particular, the AA simulations display monotonic cooling throughout the
century, with an especially steep decline in NA SST between ~1940 and
1980. Though legislation to curb pollution reduced AA loading in the
Northern Hemisphere after 1970 (Hirasawa et al. 2020), simulated NA
does not warm at all before 2010. Overall, the effect of reducing AA
emissions in both CMIP ensembles is to halt the cooling of NA, not to
cause actual warming. This is consistent with estimates of the
hemispheric difference in total absorbed solar radiation in AA
simulations in CMIP6, which level off, but do not decrease, after 1970
(Menary et al. 2020).
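The GHG-residual described above is a pointwise difference, which the following minimal sketch makes explicit. The function name is hypothetical, and the sketch assumes that the observed series and the forced MMM are anomalies on the same yearly time axis and relative to the same baseline.

```python
def forced_residual(obs, forced_mmm):
    """Observations minus a forced multi-model mean, year by year.

    With the GHG-forced MMM this gives the "GHG-residual"; with the
    ALL-forced MMM it would give an "ALL-residual".
    """
    if len(obs) != len(forced_mmm):
        raise ValueError("series must cover the same years")
    return [o - f for o, f in zip(obs, forced_mmm)]
```

Because the subtraction removes only the simulated forced response, any error in that response is carried into the residual along with internal variability and the response to other forcings.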
Could internal SST variability (\(\overrightarrow{o}\)) explain the
difference between the simulated response to forcing and observations in
these ocean basins? In Figure 8, we present the mean PS of SST for piC
simulations from each CMIP6 model (colder-than-observed models are in
blue and warmer-than-observed models are in red). We compare these PS to
the PS for observed SST (solid black), the GHG-residual (dotted-dashed
black), and/or the ALL-residual (dotted black), avoiding time series
with dramatic trends (see subplot legends). Simulated IV in most of the
CMIP6 models used in this study does not match residual or observed
low-frequency variability in NA (a), GT (b), or NARI (c). In CMIP5, SSTs
are colder and IV at all frequencies is larger than in CMIP6, but no
model shows an increase in spectral power at low frequencies for any SST
index (not shown). There are, however, three CMIP6 models for which
low-frequency IV in NA is not inconsistent with model physics:
CNRM-ESM2-1 p1 (pink), IPSL-CM6A-LR p1 (blue), and CNRM-CM6-1 p1 (grey).
Certainly, the simulated SST response to forcing, simulated oceanic
internal variability, or both are not well represented in the CMIP
ensembles, and this is the primary reason that coupled CMIP simulations
cannot reproduce observed 20th-century Sahel rainfall.
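For readers who wish to reproduce a comparison of spectral power like the one discussed above, the sketch below estimates a power spectrum with a raw periodogram. It is an assumption-laden illustration: the spectral estimator used for Fig. 8 is not specified in this section, the function name is hypothetical, and no tapering, smoothing, or averaging across segments is applied.

```python
import math
from statistics import mean

def periodogram(series, dt=1.0):
    """Raw periodogram of a mean-removed series sampled every dt years.

    Returns (freqs, power) for the positive Fourier frequencies
    k / (n * dt), k = 1 .. n // 2. Low-frequency variability appears
    as power at the smallest k.
    """
    n = len(series)
    m = mean(series)
    x = [v - m for v in series]
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        # Direct discrete Fourier transform at frequency index k.
        re = sum(v * math.cos(2 * math.pi * k * t / n) for t, v in enumerate(x))
        im = -sum(v * math.sin(2 * math.pi * k * t / n) for t, v in enumerate(x))
        freqs.append(k / (n * dt))
        power.append((re * re + im * im) / n)
    return freqs, power
```

The direct transform costs O(n^2) operations, which is acceptable for century-scale annual series; a fast Fourier transform would be the natural substitute for long piC runs.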