Preclinical and translational studies using state-of-the-art animal models that are useful for elucidating the role of RBCBs in various aspects of AUD and its treatment
Creating preclinical models is essential for elucidating the reinforcement-based cognitive mechanisms that contribute to addiction-related behaviours. However, the main difficulty with modelling RBCBs using non-human animals is capturing these inherently complex processes using relatively simple behavioural protocols. One of the most promising preclinical tools for investigating biased sensitivity to reinforcement in animal models is the PRL paradigm introduced in the previous sections (Figure 4). In the rat version of the PRL task, the animals are trained to press levers (Rychlik, Bollen & Rygula, 2017; Rygula & Popik, 2016) or nose-poke holes (Bari et al., 2010) in an operant conditioning chamber to receive a food reward. Choosing one option (the ‘correct’ lever or hole) is associated with a high probability of receiving a reward (a drop of sucrose solution or reward pellet) and a low probability of receiving a punishment (mild electric foot shock or lack of reward and time out). Conversely, choosing the other option (the ‘incorrect’ lever or hole) is associated with a high probability of receiving the punishment and only a small chance of receiving the reward. The probabilities of receiving the reward and punishment upon choosing the ‘correct’ lever or hole are usually set to 80% and 20%, respectively, and vice versa for the ‘incorrect’ lever or hole. The animals have to adjust their behaviour by responding to the appropriate lever or hole to maximize reward and minimize punishment delivery while disregarding occasional misleading positive or negative feedback. After several consecutive choices of the ‘correct’ lever or hole, the reversal criterion is reached, and the probabilities reverse; that is, the previously ‘correct’ lever or hole becomes ‘incorrect’, and vice versa.
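The trial structure described above can be illustrated with a minimal simulation sketch. It is not taken from any of the cited studies: the function names, the simple win-stay/lose-shift agent, the session length of 200 trials and the reversal criterion of 8 consecutive ‘correct’ choices are all assumptions made for illustration (the text specifies only the 80%/20% contingencies and ‘several’ consecutive choices). A non-rewarded trial stands in for punishment.

```python
import random


def wsls_agent(prev, rng):
    """Hypothetical agent: repeat a rewarded choice, switch after a non-reward."""
    if prev is None:
        return rng.randrange(2)  # first trial: pick a lever at random
    choice, rewarded = prev
    return choice if rewarded else 1 - choice


def run_prl_session(choose, n_trials=200, p_reward_correct=0.8,
                    reversal_criterion=8, seed=0):
    """Simulate one PRL session with two levers (0 and 1).

    `reversal_criterion` (assumed value) is the number of consecutive
    'correct' choices after which the contingencies are swapped.
    Returns the list of (choice, rewarded) pairs and the reversal count.
    """
    rng = random.Random(seed)
    correct = 0       # lever that is currently 'correct'
    streak = 0        # consecutive 'correct' choices so far
    reversals = 0
    prev = None       # (choice, rewarded) from the preceding trial
    trials = []
    for _ in range(n_trials):
        choice = choose(prev, rng)
        is_correct = choice == correct
        # 80% reward on the 'correct' lever, 20% on the 'incorrect' one
        p_reward = p_reward_correct if is_correct else 1 - p_reward_correct
        rewarded = rng.random() < p_reward
        trials.append((choice, rewarded))
        streak = streak + 1 if is_correct else 0
        if streak >= reversal_criterion:
            correct = 1 - correct  # reversal: 'correct' becomes 'incorrect'
            streak = 0
            reversals += 1
        prev = (choice, rewarded)
    return trials, reversals
```

Calling `run_prl_session(wsls_agent)` returns the trial-by-trial record and the number of completed reversals; any other decision rule can be substituted for `wsls_agent` as long as it accepts the preceding trial and a random number generator.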
As in human studies, the sensitivity of the animals to positive and negative reinforcement is assessed by analysing behaviour according to the outcome of each preceding trial using the WSLS analysis, where the proportion of win-stay behaviours indexes sensitivity to positive reinforcement and the proportion of lose-shift behaviours indexes sensitivity to negative reinforcement. In the aforementioned landmark study, using a PRL task analogous to that used in humans, Bari and colleagues (2010) demonstrated that different manipulations of 5-HT neurotransmission in rats resulted in significant changes in reinforcement sensitivity. These results were further complemented by a study in a non-human primate, the marmoset, employing a preclinical and translational version of the PRL task, which demonstrated that 5-HT depletions within the orbitofrontal cortex and amygdala produced impairments in overall reinforcement sensitivity rates, including temporal learning from both rewarding and negative outcomes (Rygula et al., 2015). Using a preclinical version of the PRL paradigm, Rychlik and collaborators (2017) demonstrated that the non-monoaminergic compound ketamine selectively down-regulates sensitivity to negative outcomes in animals. In this study, acute treatment with ketamine significantly and persistently decreased the ratio of lose-shift behaviours in a manner similar to that observed following the administration of higher doses of the SSRI citalopram (Bari et al., 2010). These results were recently supported by a study by Wilkinson and collaborators (2020), who also observed an interaction between ketamine treatment and feedback type, with a trend towards decreased negative feedback sensitivity in the preclinical PRL paradigm. Ketamine was also reported to have robust positive effects on the interpretation of ambiguous cues (Hales, Houghton & Robinson, 2017).
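The WSLS analysis can be sketched as a short calculation over a trial-by-trial record. The example below is illustrative only: the function name, the `(choice, rewarded)` data layout and the sample data are assumptions, and a non-rewarded trial is treated as a ‘loss’ regardless of whether the actual protocol uses a time out or a foot shock.

```python
def win_stay_lose_shift(trials):
    """Compute win-stay and lose-shift proportions.

    `trials` is a list of (choice, rewarded) pairs in session order.
    Win-stay: fraction of rewarded trials followed by the same choice.
    Lose-shift: fraction of non-rewarded trials followed by a different choice.
    Returns None for a proportion whose denominator is zero.
    """
    wins = losses = win_stays = lose_shifts = 0
    # Pair every trial with the one that follows it
    for (prev_choice, prev_rewarded), (choice, _) in zip(trials, trials[1:]):
        if prev_rewarded:
            wins += 1
            win_stays += choice == prev_choice
        else:
            losses += 1
            lose_shifts += choice != prev_choice
    win_stay = win_stays / wins if wins else None
    lose_shift = lose_shifts / losses if losses else None
    return win_stay, lose_shift


# Hypothetical six-trial record: (lever chosen, reward delivered)
example = [(0, True), (0, False), (1, True), (1, True), (0, False), (0, False)]
win_stay, lose_shift = win_stay_lose_shift(example)
```

In this record, two of the three rewarded trials are followed by the same choice and one of the two non-rewarded trials is followed by a switch, so the win-stay proportion is 2/3 and the lose-shift proportion is 1/2. A lower lose-shift proportion after a treatment would be read as reduced sensitivity to negative feedback.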
Given the similarity in the effects of citalopram and ketamine, it was hypothesized that the effects of the latter were mediated indirectly by 5-HT neurotransmission (Rychlik, Bollen & Rygula, 2017). Indeed, recent reports have demonstrated that in addition to affecting glutamatergic neurotransmission via N-methyl-D-aspartate receptors (NMDARs), ketamine also potentiates 5-HT release in the prefrontal cortex via α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid receptors (AMPARs) in the raphe nucleus (Nishitani et al., 2014). As demonstrated by recent studies (Drozd, Rychlik, Fijalkowska & Rygula, 2019; Noworyta & Rygula, 2021; Rygula & Popik, 2016), the preclinical version of the PRL task may be used not only for investigating the neuroanatomical or neurochemical bases of reinforcement sensitivity but also for evaluating how this sensitivity interacts with other cognitive processes and pharmacological treatment outcomes. In a study by Rygula and Popik (2016), the PRL task was employed to investigate how ‘optimistic’ and ‘pessimistic’ animals incorporated feedback (both rewarding and punishing) in their decisions in a changing and uncertain environment. The results of this study demonstrated the interrelation and co-existence of two cognitive biases (the rats classified as ‘pessimistic’ were significantly more sensitive to negative feedback than their ‘optimistic’ conspecifics) that may predict vulnerability to various psychiatric disorders, including AUD. A study by Noworyta-Sokolowska and colleagues (2019) demonstrated for the first time in rodents that sensitivity to negative and positive reinforcement could be considered a stable and enduring behavioural trait. It also showed that these traits were independent of each other and that trait sensitivity to positive reinforcement is associated with cognitive flexibility. These results have been supported by computational modelling (Noworyta-Sokolowska, Kozub, Jablonska, Rodriguez Parkitna, Drozd & Rygula, 2019).
Several preclinical studies in rats have investigated the neuroanatomical background of biased sensitivity to reinforcement. Using a preclinical version of the PRL paradigm, Dalton and colleagues (2014) found that the nucleus accumbens shell and core facilitate reward-seeking in a distinct yet complementary manner, with the shell guiding response selection to the actions more likely to yield the reward and the core simply promoting the approach towards reward-associated stimuli. In 2016, the same group demonstrated that inactivation of the medial orbitofrontal cortex rendered animals less sensitive to either positive or negative feedback, while lateral orbitofrontal cortex activity was implicated in behaviours following violations of reward expectancies signalled by negative feedback (Dalton, Wang, Phillips & Floresco, 2016). In the same study, inactivation of the prelimbic cortex increased sensitivity to positive feedback and reduced sensitivity to negative feedback. Since the animals tended to select the recently rewarded choice more often, regardless of whether the previous choice was correct or not, this surprising effect has been proposed to result from a form of ‘reward myopia’ (Dalton, Wang, Phillips & Floresco, 2016). Drozd and colleagues (2019) reported the reinforcement-modulating effects of the antidepressant drugs mirtazapine and agomelatine in the PRL task, and these results were supported a year later in a study by Wilkinson and collaborators (2020). Recent research using a preclinical version of the PRL paradigm identified four phenotypes of sensitivity to negative and positive reinforcement in rats and reported statistically significant differences between the investigated phenotypes in the effects of acute treatment with the SSRI escitalopram on anxiety (Noworyta & Rygula, 2021).
These results demonstrated that trait sensitivity to reinforcement could have important implications for the effectiveness of pharmacological interventions, including in AUD. Finally, as mentioned above, the fear of negative outcomes has a powerful influence on decisions regarding drinking. Using an approach-avoidance conflict task in marmoset monkeys, Clarke and colleagues (2014) demonstrated that inactivation of the anterior orbitofrontal or ventrolateral prefrontal cortex increased general negative bias in decision making via two distinct cognitive mechanisms: elevated uncertainty and attentional disruption, respectively. The differentiation of the component neural mechanisms underlying punishment processing revealed in that study provided important insight into the efficacy of cognitive-behavioural therapy in AUD, which may be more successful in a patient who is poor at prediction than in one who is deficient in attentional control.