Neurochemical correlates of reinforcement-based cognitive biases
as potential targets for pharmacological treatment of alcohol addiction
Influential neurocomputational models emphasize dopamine (DA) as a
neurochemical correlate of reinforcement learning (Bromberg-Martin,
Matsumoto & Hikosaka, 2010; Eisenegger et al., 2014; Frank, Seeberger
& O’Reilly, 2004; Samson, Frank & Fellous, 2010). Indeed, the role
of DA in the mediation of positive reinforcement and resulting cognitive
biases is hard to overestimate. In a classic study, Frank and colleagues
(2004) demonstrated that lowered availability of DA (as in Parkinson’s
patients off medication) was associated with worse learning from
positive than from negative outcomes. Importantly, this effect could
be reversed by pharmacological boosting of dopaminergic
neurotransmission. As suggested by Frank (2005), the opposite effects of
low and high DA availability reflected DA-induced shifts in the balance
between the BAS/Go/direct and BIS/NoGo/indirect pathways of the basal
ganglia with low DA shifting the balance towards the NoGo pathway
(impairing learning from positive feedback/reward relative to learning
from punishment) and high DA shifting the balance towards the Go pathway
(improving learning from positive feedback/reward relative to learning
from punishment). Several other studies reported that dopaminergic
therapy impairs learning from punishment (Bodi et al., 2009; Cools,
Barker, Sahakian & Robbins, 2001; Cools, Frank, Gibbs, Miyakawa, Jagust
& D’Esposito, 2009; Moustafa, Cohen, Sherman & Frank, 2008; Swainson,
Rogers, Sahakian, Summers, Polkey & Robbins, 2000). In the study by
Pessiglione and colleagues (2006), enhancement of DA activity by
administration of L-3,4-dihydroxyphenylalanine (L-DOPA) increased the
frequency with which subjects chose high-probability gain but not the
frequency with which they chose low-probability loss. It has been
proposed that pharmacological boosting of DA neurotransmission increases
tonic levels of DA within the striatum, which in turn occludes the DA
dips that signal negative feedback, thereby differentially affecting
reward- versus punishment-based learning (Frank, 2005; Grace & Rosenkranz, 2002). The
role of DA in regulating feedback sensitivity has been recently
confirmed in a preclinical study by Grospe and collaborators (2018). In
that study, 6-hydroxydopamine-induced dopaminergic lesions within the
rat dorsomedial striatum resulted in significantly increased negative
reinforcement sensitivity. Notably, the sensitivity to reinforcement and
associated cognitive biases also seem to be regulated at the receptor
level. The important role of dopamine D2 receptor gene
polymorphism (DRD2-TAQ-IA) in reinforcement learning has been
demonstrated by Klein and colleagues (2007) using a neuroimaging
paradigm. In a probabilistic learning task, A1 allele carriers, who have
reduced dopamine D2 receptor densities, learned to avoid actions with
negative consequences less efficiently than non-carriers. A1
subjects have also been shown to be impaired in the ability to sustain a
newly rewarded response after a reversal and demonstrated a generally
decreased tendency to stick with a rewarded response (Jocham, Klein,
Neumann, von Cramon, Reuter & Ullsperger, 2009). Cox and
collaborators (2015), using positron emission tomography (PET) with two
selective DA receptor radioligands ([11C]SCH23390 and
[11C]raclopride), demonstrated that individual differences in
dopaminergic D1 and D2 receptor binding
determine the effectiveness of learning from positive and negative
reinforcement, respectively. Studies in humans were complemented by a
series of elegant studies using animal models. A study by Groman and
collaborators (2016), using PET in rats performing a three-choice
spatial PRL task, demonstrated a role for dopamine D3 receptors in
reinforcement learning. In that study, greater midbrain
dopamine D3 receptor availability (indicated by
[11C]-(+)-PHNO binding) was associated with a lower sensitivity to
positive reinforcement, resulting in a lower rate of learning. The role
of dopamine D3 receptors in sensitivity to reinforcement
was further confirmed following administration of a dopamine
D3 receptor agonist, pramipexole, which impaired
the performance of rats in a very similar way (Groman et al., 2016).
These results suggest that, in addition to dopamine D1/D2 receptors,
dopamine D3 receptor dysregulation may also underlie abnormal reinforcement
sensitivity, and they imply that these receptors may be a novel
target for AUD treatment. In 2009, Sharot and colleagues demonstrated that
administration of L-DOPA during the imaginative construction of positive
future life events subsequently enhanced estimates of the hedonic
pleasure to be derived from these same events (Sharot, Shiner, Brown,
Fan & Dolan, 2009). This study was supported by a report in 2012
(Sharot, Guitart-Masip, Korn, Chowdhury & Dolan, 2012), which revealed
that administration of L-DOPA increased optimism bias by impairing the
ability to update beliefs in response to undesirable information about
the future. The latter converged with the above-mentioned observations
from patients with Parkinson’s disease, where enhanced DA levels led to
impaired learning from unwanted outcomes (Frank, Seeberger & O’Reilly,
2004). These findings provided evidence that DA modulates subjective
hedonic expectations and impacts belief formation by reducing negative
expectations regarding the future.
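The asymmetry between learning from reward and learning from punishment described above is often expressed computationally as a value update with separate learning rates for positive and negative prediction errors. The Python sketch below is purely illustrative and is not the basal ganglia network model of Frank (2005); the function names, the parameter values, and the mapping of "low DA" and "high DA" states onto the two learning rates are assumptions introduced here for the example.

```python
def update_value(value, outcome, alpha_gain, alpha_loss):
    """One delta-rule update with separate learning rates.

    alpha_gain is applied to positive prediction errors (better than
    expected), alpha_loss to negative ones (worse than expected).
    Both names are hypothetical and chosen for this illustration.
    """
    prediction_error = outcome - value
    alpha = alpha_gain if prediction_error >= 0 else alpha_loss
    return value + alpha * prediction_error


def learn(outcomes, alpha_gain, alpha_loss, initial_value=0.0):
    """Apply update_value across a sequence of outcomes."""
    value = initial_value
    for outcome in outcomes:
        value = update_value(value, outcome, alpha_gain, alpha_loss)
    return value


# The same alternating sequence of gains (+1) and losses (-1) for both learners.
outcomes = [1.0, -1.0] * 20

# Assumption: a "low DA" state is mimicked by a larger learning rate for
# losses, a "high DA" state by a larger learning rate for gains.
low_da_value = learn(outcomes, alpha_gain=0.1, alpha_loss=0.3)
high_da_value = learn(outcomes, alpha_gain=0.3, alpha_loss=0.1)

print(f"low DA value estimate:  {low_da_value:.3f}")
print(f"high DA value estimate: {high_da_value:.3f}")
```

Under these assumed parameters, an identical history of gains and losses leaves the "low DA" learner with a negative value estimate and the "high DA" learner with a positive one, which is one simple way to picture a DA-dependent bias in outcome expectations.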
Along with DA, serotonin (5-HT) is the second neurotransmitter crucially
implicated in reinforcement learning. Published reports (Bari et al.,
2010; Chamberlain, Muller, Blackwell, Clark, Robbins & Sahakian, 2006;
Cools, Robinson & Sahakian, 2008; Fischer & Ullsperger, 2017; Rygula
et al., 2015; Sachs, Rodriguiz, Tran, Iyer, Wetsel & Caron, 2015) have
suggested that increasing 5-HT neurotransmission leads to a reduced
sensitivity to aversive outcomes, whereas reducing 5-HT transmission, by
way of either acute tryptophan depletion (ATD), a procedure that has
been used extensively to study the effect of low 5-HT levels in the
human brain, pre-synaptic receptor stimulation (acting to temporarily
down-regulate 5-HT transmission), or up-regulation of the serotonin
transporter (SERT), leads to an increased sensitivity to aversive
outcomes. Indeed, a study by Chamberlain and colleagues (2006)
demonstrated that a low, acute dose of the selective serotonin reuptake
inhibitor (SSRI) citalopram, which has been postulated to affect
pre-synaptic serotonin 5-HT1A autoreceptors, increases
the sensitivity to negative feedback in the PRL task. Similar effects
were reported in a study by Cools and collaborators (2008), where ATD
enhanced the ability of subjects to predict punishment without affecting
reward predictions. The results of the above-mentioned studies were
complemented by the report by den Ouden and collaborators (2013), who
studied the role of 5-HT (and DA) in reinforcement sensitivity as a
function of two polymorphisms in the genes encoding the 5-HT and DA
transporters (SERT: 5HTTLPR plus rs25531; DAT1 3′UTR VNTR). The results
of this study revealed that allelic variation in SERT predicted
behavioural adaptation following punishment. Specifically, L′
homozygosity, which has been linked with increased SERT binding and
decreased levels of extracellular 5-HT (Willeit & Praschak-Rieder,
2010), was associated with increased negative reinforcement learning
(den Ouden et al., 2013). The role of serotonin in RBCBs has been
supported by ample evidence from research using animal models. Bari and
collaborators (2010) demonstrated that different manipulations
of 5-HT neurotransmission in rats resulted in altered sensitivity to
positive and negative reinforcement. In this study, acute administration
of a high dose of the SSRI citalopram decreased negative feedback
sensitivity by lowering the ratio of lose-shift behaviours. In contrast,
acute administration of a low dose of this drug, which was postulated to
temporarily silence 5-HT system activity via inhibitory serotonin
5-HT1A autoreceptor activation in the raphe nuclei,
similar to the above-mentioned study by Chamberlain and collaborators
(2006), increased sensitivity to negative reinforcement. A similar
effect was reported following global 5-HT depletion (Bari et al., 2010).
Ineichen and colleagues (2012) demonstrated, in an automated
two-choice operant spatial discrimination paradigm, that genetic
reduction in SERT function, investigated in heterozygous mutant mice from
a SERT null mutant strain, led to a decreased sensitivity to negative
feedback, an effect similar to that observed by Bari and
collaborators (2010) following acute administration of the higher dose
of citalopram. Both of the mentioned manipulations also caused increased
ratios of win-stay behaviours, indicating increased sensitivity to
positive reinforcement (Ineichen et al., 2012), an effect confirmed
recently by Wilkinson and colleagues (2020). The results of the study by
Rygula and collaborators (2014) revealed the important role of 5-HT in
modulating cognitive judgement bias in rats. In that study, the SSRI
citalopram at a low dose of 1 mg/kg significantly biased animals towards
positive interpretation of the ambiguous cues, while at higher doses (5
and 10 mg/kg), the animals interpreted the ambiguous cues more
negatively. Interestingly, a study from 2017 further demonstrated that
the effects of acute 5-HT manipulations on the interpretation of
ambiguity might depend on the basal valence of cognitive judgement bias
(Golebiowska & Rygula, 2017). In that study, acute administration of
escitalopram caused a ‘pessimistic’ shift in the interpretation of
ambiguous cues in animals classified as ‘optimistic’ and had no
significant effects on those previously classified as ‘pessimistic’.
Taken together, published reports suggest that increasing 5-HT
transmission leads to a reduced sensitivity to aversive outcomes,
whereas reducing 5-HT transmission, by way of either ATD, pre-synaptic
receptor stimulation, or up-regulation of SERT, leads to an increased
sensitivity to aversive outcomes. These results strongly suggest 5-HT as
a potential target and serotonergic manipulations as effective treatment
strategies in modulating alcohol-drinking outcome expectations.
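The win-stay and lose-shift measures referred to in the animal studies above can be computed from a trial-by-trial record of choices and outcomes. The Python sketch below uses a common definition (the proportion of rewarded trials followed by the same choice, and the proportion of unrewarded trials followed by a different choice); the cited studies may have used different scoring rules, and the function name and the example data are hypothetical.

```python
def win_stay_lose_shift(choices, rewarded):
    """Return (win_stay, lose_shift) proportions for one session.

    choices  : sequence of chosen options, one per trial
    rewarded : sequence of booleans, True if that trial was rewarded
    The final trial is not scored because no following choice exists.
    """
    wins = stays_after_win = 0
    losses = shifts_after_loss = 0
    for i in range(len(choices) - 1):
        repeated = choices[i + 1] == choices[i]
        if rewarded[i]:
            wins += 1
            stays_after_win += int(repeated)
        else:
            losses += 1
            shifts_after_loss += int(not repeated)
    # Undefined ratios (no wins or no losses) are reported as NaN.
    win_stay = stays_after_win / wins if wins else float("nan")
    lose_shift = shifts_after_loss / losses if losses else float("nan")
    return win_stay, lose_shift


# Made-up five-trial session with a left (L) and a right (R) option.
example_choices = ["L", "L", "R", "L", "L"]
example_rewarded = [True, False, True, False, False]
win_stay, lose_shift = win_stay_lose_shift(example_choices, example_rewarded)
print(win_stay, lose_shift)
```

In this made-up session, one of the two scored wins is followed by a repeated choice and one of the two scored losses is followed by a switch, so both proportions are 0.5. On this definition, a manipulation that lowers lose-shift would be read as reduced sensitivity to negative feedback, and one that raises win-stay as increased sensitivity to positive feedback.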
To summarize this section, psychopharmacologic manipulation of DA and
5-HT neurotransmitter systems implicated in RBCBs may have the potential
to provide insights into the development and maintenance of alcohol
addiction and should be considered as targets for pharmacological
treatment of AUD (Figure 3). Of particular interest is the potential to
experimentally vary reinforcement sensitivity and outcome expectations
and to do so in a way that builds a direct paradigmatic bridge with the
relevant alcohol literature. Further challenges include identification
of the specific interactions between neurochemical correlates of RBCBs
and brain processes involved in alcohol addiction.