Preclinical and translational studies using state-of-the-art
animal models that are useful for elucidating the role of RBCBs in
various aspects of AUD and its treatment

Creating preclinical models is essential for elucidating the
reinforcement-based cognitive mechanisms that contribute to
addiction-related behaviours. However, the main difficulty with
modelling RBCBs using non-human animals is capturing these inherently
complex processes using relatively simple behavioural protocols. One of
the most promising preclinical tools for investigating biased
sensitivity to reinforcement in animal models is the PRL paradigm
mentioned in the previous sections (Figure 4). In the
rat version of the PRL task, the animals are trained to press levers
(Rychlik, Bollen & Rygula, 2017; Rygula & Popik, 2016) or nose-poke
holes (Bari et al., 2010) in an operant conditioning chamber to receive
a food reward. Choosing one of them (the ‘correct’ lever or hole) is
associated with a high probability of receiving a reward (a drop of
sucrose solution or reward pellet) and a low probability of receiving a
punishment (mild electric foot shock or lack of reward and time out).
Conversely, choosing the other one (the ‘incorrect’ lever or hole) is
associated with a high probability of receiving the punishment and only
a small chance of receiving the reward. The probabilities of receiving
the reward and punishment upon choosing the ‘correct’ lever or hole are
usually set to 80% and 20%, respectively, and vice versa for the
‘incorrect’ lever or hole. The animals have to adjust their behaviour by
responding to the appropriate levers or holes to maximize reward and
minimize punishment delivery while disregarding occasional misleading
positive or negative feedback. After several consecutive choices of the
‘correct’ lever or hole, the reversal criterion is reached, and the
probabilities reverse; that is, the previously ‘correct’ lever or hole
becomes ‘incorrect’, and vice versa. As in human studies, the
sensitivity of the animals to positive and negative reinforcement is
assessed by analysing behaviour according to the outcome of each
preceding trial using the WSLS analysis, where the proportion of win-stay
behaviours indexes sensitivity to positive reinforcement while the
proportion of lose-shifts indicates sensitivity to negative
reinforcement.

In the aforementioned landmark study, using a PRL task
analogous to that used in humans, Bari and colleagues (2010)
demonstrated that different manipulations of 5-HT neurotransmission in
rats resulted in significant changes in reinforcement sensitivity. These
results were further complemented by a study in a non-human primate, the
marmoset, employing a preclinical and translational version of the PRL
task, which demonstrated that 5-HT depletions within the orbitofrontal
cortex and amygdala produced impairments in overall reinforcement
sensitivity rates, including temporal learning from both rewarding and
negative outcomes (Rygula et al., 2015). A study by Rychlik and
collaborators (2017) demonstrated, using a preclinical version of the
PRL paradigm, that the non-monoaminergic compound ketamine selectively
down-regulates sensitivity to negative outcomes in animals. In this
study, acute treatment with ketamine significantly and persistently
decreased the ratio of lose-shift behaviours in a manner similar to that
observed following the administration of higher doses of the SSRI
citalopram (Bari et al., 2010). These results were recently supported by
a study by Wilkinson and collaborators (2020), who also observed an
interaction between ketamine treatment and feedback type with a trend
towards decreased negative feedback sensitivity in the preclinical PRL
paradigm. Ketamine was also reported to have robust positive effects on
the interpretation of ambiguous cues (Hales, Houghton & Robinson,
2017). Given the similarity in the effects of citalopram and ketamine,
it was hypothesized that the effects of the latter were mediated
indirectly by 5-HT neurotransmission (Rychlik, Bollen & Rygula, 2017).
Indeed, recent reports have demonstrated that in addition to affecting
glutamatergic neurotransmission via N-methyl-D-aspartate receptors
(NMDARs), ketamine also potentiates 5-HT release in the prefrontal
cortex via amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid receptors
(AMPARs) in the raphe nucleus (Nishitani et al., 2014).

As demonstrated
by recent studies (Drozd, Rychlik, Fijalkowska & Rygula, 2019; Noworyta
& Rygula, 2021; Rygula & Popik, 2016), the preclinical version of the
PRL task may be used not only for investigating neuroanatomical or
neurochemical bases of reinforcement sensitivity but also for evaluating
how this sensitivity interacts with other cognitive processes and
pharmacological treatment outcomes. In a study by Rygula and Popik
(2016), the PRL task was employed to investigate how ‘optimistic’ and
‘pessimistic’ animals incorporated feedback (both rewarding and
punishing) in their decisions in a changing and uncertain environment.
The results of this study demonstrated the interrelation and
co-existence of two cognitive biases (the rats classified as
‘pessimistic’ were significantly more sensitive to negative feedback
than their ‘optimistic’ conspecifics) that may predict vulnerability to
various psychiatric disorders, including AUD. A study by
Noworyta-Sokolowska and colleagues (2019) demonstrated for the first
time in rodents that sensitivity to negative and positive reinforcement
could be considered a stable and enduring behavioural trait. It also
showed that these traits were independent of each other and that trait
sensitivity to positive reinforcement was associated with cognitive
flexibility. These results have been supported by computational
modelling (Noworyta-Sokolowska, Kozub, Jablonska, Rodriguez Parkitna,
Drozd & Rygula, 2019).

Several preclinical studies in rats have
investigated the neuroanatomical background of biased sensitivity to
reinforcement. Dalton and colleagues (2014), using a
preclinical version of the PRL paradigm, found that the nucleus
accumbens shell and core facilitate reward-seeking in a distinct yet
complementary manner, with the shell guiding response selection to the
actions more likely to yield the reward and the core simply promoting
the approach towards reward-associated stimuli. In 2016, the same group
demonstrated that inactivation of the medial orbitofrontal cortex
rendered animals less sensitive to either positive or negative feedback,
while lateral orbitofrontal cortex activity was implicated in behaviours
following violations of reward expectancies signalled by negative
feedback (Dalton, Wang, Phillips & Floresco, 2016). In the same study,
inactivation of the prelimbic cortex increased sensitivity to positive
feedback and reduced sensitivity to negative feedback. Since the animals
tended to select the recently rewarded choice more often, regardless of
whether the previous choice was correct or not, this surprising effect
has been proposed to result from a form of ‘reward myopia’ (Dalton,
Wang, Phillips & Floresco, 2016). Drozd and colleagues (2019)
reported the reinforcement-modulating effects of the antidepressant
drugs mirtazapine and agomelatine in the PRL task, and these results
were supported a year later in a study by Wilkinson and collaborators
(2020). Recent research using a preclinical version of the PRL paradigm
identified four phenotypes of sensitivity to negative and positive
reinforcement in rats and reported statistically significant differences
between the investigated phenotypes in the effects of acute treatment
with the SSRI escitalopram on anxiety (Noworyta & Rygula, 2021). These
results demonstrated that trait sensitivity to reinforcement could have
important implications for the effectiveness of pharmacological
interventions, including in AUD.

Finally, as mentioned above,
the fear of negative outcomes has a powerful influence on decisions
regarding drinking. A study by Clarke and colleagues (2014) in marmoset
monkeys demonstrated, using an approach-avoidance conflict task, that
inactivation of anterior orbitofrontal or ventrolateral prefrontal
cortices increased general negative bias in decision making via two
distinct cognitive mechanisms—elevated uncertainty and attentional
disruption, respectively. The differentiation of the component neural
mechanisms underlying punishment processing revealed in that study
provided important insight into the efficacy of cognitive-behavioural
therapy in AUD, which may be more successful in a patient poor at
predicting than in one deficient in attentional control.
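
The PRL schedule and the WSLS analysis described earlier in this section can be illustrated with a minimal Python sketch. It is not the protocol or analysis code of any of the cited studies. The 80%/20% reward probabilities and the reversal after consecutive ‘correct’ choices follow the description above; the reversal criterion of 8 choices, the simulated agent (a noisy win-stay/lose-shift chooser), and all function and parameter names are assumptions made purely for illustration. Punishment is simplified here to the absence of reward.

```python
import random


def simulate_prl(n_trials=200, p_correct=0.8, criterion=8, p_explore=0.1, seed=0):
    """Simulate one two-lever PRL session; returns a list of (choice, rewarded).

    criterion (consecutive 'correct' choices before reversal) and the
    agent's p_explore are illustrative assumptions, not published values.
    """
    rng = random.Random(seed)
    correct = 0                  # index of the currently 'correct' lever
    streak = 0                   # consecutive choices of the 'correct' lever
    choice = rng.randrange(2)    # first choice is random
    trials = []
    for _ in range(n_trials):
        # 'Correct' lever pays off with p_correct, the other with 1 - p_correct.
        p_reward = p_correct if choice == correct else 1.0 - p_correct
        rewarded = rng.random() < p_reward
        trials.append((choice, rewarded))

        # Reverse the contingencies once the criterion is reached.
        streak = streak + 1 if choice == correct else 0
        if streak >= criterion:
            correct = 1 - correct
            streak = 0

        # Hypothetical agent: stay after a win, shift after a loss,
        # but do the opposite with probability p_explore.
        stay = rewarded
        if rng.random() < p_explore:
            stay = not stay
        choice = choice if stay else 1 - choice
    return trials


def wsls(trials):
    """Return (win-stay ratio, lose-shift ratio) for a list of (choice, rewarded)."""
    wins = stays_after_win = losses = shifts_after_loss = 0
    # Pair each trial with the one that follows it.
    for (prev_choice, prev_rewarded), (choice, _) in zip(trials, trials[1:]):
        if prev_rewarded:
            wins += 1
            stays_after_win += choice == prev_choice
        else:
            losses += 1
            shifts_after_loss += choice != prev_choice
    win_stay = stays_after_win / wins if wins else float("nan")
    lose_shift = shifts_after_loss / losses if losses else float("nan")
    return win_stay, lose_shift


if __name__ == "__main__":
    ws, ls = wsls(simulate_prl())
    print(f"win-stay: {ws:.2f}, lose-shift: {ls:.2f}")
```

In this sketch, a manipulation that lowers sensitivity to negative feedback would appear as a lower lose-shift ratio while the win-stay ratio stays roughly unchanged. Real data sets would also need per-animal and per-session handling that is omitted here.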