Neurochemical correlates of reinforcement-based cognitive biases as potential targets for pharmacological treatment of alcohol addiction
Influential neurocomputational models emphasize dopamine (DA) as a neurochemical correlate of reinforcement learning (Bromberg-Martin, Matsumoto & Hikosaka, 2010; Eisenegger et al., 2014; Frank, Seeberger & O’Reilly, 2004; Samson, Frank & Fellous, 2010). Indeed, the role of DA in the mediation of positive reinforcement and resulting cognitive biases is hard to overestimate. In a classic study, Frank and colleagues (2004) demonstrated that lowered availability of DA (as in Parkinson’s patients off medication) was associated with worse learning from positive than from negative outcomes. Importantly, this effect could be reversed by pharmacological boosting of dopaminergic neurotransmission. As suggested by Frank (2005), the opposite effects of low and high DA availability reflect DA-induced shifts in the balance between the BAS/Go/direct and BIS/NoGo/indirect pathways of the basal ganglia: low DA shifts the balance towards the NoGo pathway (impairing learning from positive feedback/reward relative to learning from punishment), whereas high DA shifts it towards the Go pathway (improving learning from positive feedback/reward relative to learning from punishment). Several other studies reported that dopaminergic therapy impairs learning from punishment (Bodi et al., 2009; Cools, Barker, Sahakian & Robbins, 2001; Cools, Frank, Gibbs, Miyakawa, Jagust & D’Esposito, 2009; Moustafa, Cohen, Sherman & Frank, 2008; Swainson, Rogers, Sahakian, Summers, Polkey & Robbins, 2000). In the study by Pessiglione and colleagues (2006), enhancement of DA activity by administration of L-3,4-dihydroxyphenylalanine (L-DOPA) increased the frequency with which subjects chose high-probability gain but not the frequency with which they chose low-probability loss. 
It has been proposed that pharmacological boosting of DA neurotransmission increases tonic levels of DA within the striatum, which in turn occludes the DA dips that signal negative feedback, thereby differentially affecting reward-based versus punishment-based learning (Frank, 2005; Grace & Rosenkranz, 2002). The role of DA in regulating feedback sensitivity was recently confirmed in a preclinical study by Grospe and collaborators (2018). In that study, 6-hydroxydopamine-induced dopaminergic lesions within the rat dorsomedial striatum resulted in significantly increased negative reinforcement sensitivity. Notably, the sensitivity to reinforcement and associated cognitive biases also seem to be regulated at the receptor level. The important role of the dopamine D2 receptor gene polymorphism (DRD2-TAQ-IA) in reinforcement learning has been demonstrated by Klein and colleagues (2007) using a neuroimaging paradigm. In this study, in a probabilistic learning task, A1 allele carriers with reduced dopamine D2 receptor densities learned to avoid actions with negative consequences less efficiently than non-carriers. A1 carriers have also been shown to be impaired in the ability to sustain a newly rewarded response after a reversal and demonstrated a generally decreased tendency to stick with a rewarded response (Jocham, Klein, Neumann, von Cramon, Reuter & Ullsperger, 2009). Cox and collaborators (2015), using positron emission tomography (PET) with two selective DA receptor radioligands ([11C]SCH23390 and [11C]raclopride), demonstrated that individual differences in dopaminergic D1 and D2 receptor binding determine the effectiveness of learning from positive and negative reinforcement, respectively. Studies in humans were complemented by a series of elegant studies using animal models. A study by Groman and collaborators (2016), using PET in rats performing a three-choice spatial PRL task, demonstrated a role for dopamine D3 receptors in reinforcement learning. 
In that study, greater midbrain dopamine D3 receptor availability (indicated by [11C]-(+)-PHNO binding) was associated with a lower sensitivity to positive reinforcement, resulting in a lower rate of learning. The role of dopamine D3 receptors in sensitivity to reinforcement was further confirmed following administration of a dopamine D3 receptor agonist, pramipexole, which impaired the performance of rats in a very similar way (Groman et al., 2016). These results suggest that, in addition to dopamine D1/D2 receptors, dopamine D3 receptor dysregulation may also underlie abnormal reinforcement sensitivity, and they imply that these receptors may be a novel target for AUD treatment. Sharot and colleagues demonstrated that administration of L-DOPA during the imaginative construction of positive future life events subsequently enhanced estimates of the hedonic pleasure to be derived from these same events (Sharot, Shiner, Brown, Fan & Dolan, 2009). This finding was supported by a 2012 report (Sharot, Guitart-Masip, Korn, Chowdhury & Dolan, 2012), which revealed that administration of L-DOPA increased optimism bias by impairing the ability to update beliefs in response to undesirable information about the future. The latter converged with the above-mentioned observations from patients with Parkinson’s disease, where enhanced DA levels led to impaired learning from unwanted outcomes (Frank, Seeberger & O’Reilly, 2004). These findings provided evidence that DA modulates subjective hedonic expectations and impacts belief formation by reducing negative expectations regarding the future.
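The proposed link between DA availability and asymmetric learning from positive versus negative feedback can be illustrated with a toy simulation. The sketch below is not the neural network model of Frank (2005); it swaps in a simple value-update rule with separate learning rates for positive and negative prediction errors. The names `da_level`, `base_rate`, `learning_rates`, `update_value`, and `simulate`, as well as the linear mapping from DA level to learning rates, are assumptions introduced here purely for illustration.

```python
import random


def learning_rates(da_level, base_rate=0.2):
    """Map a hypothetical DA level in [0, 1] to (alpha_pos, alpha_neg).

    Assumption (illustrative only): higher DA scales up learning from
    positive prediction errors and scales down learning from negative
    ones, loosely mimicking occlusion of DA dips by high tonic DA.
    """
    if not 0.0 <= da_level <= 1.0:
        raise ValueError("da_level must lie in [0, 1]")
    alpha_pos = base_rate * 2 * da_level
    alpha_neg = base_rate * 2 * (1.0 - da_level)
    return alpha_pos, alpha_neg


def update_value(value, outcome, alpha_pos, alpha_neg):
    """One value update with a learning rate chosen by the error sign."""
    delta = outcome - value  # prediction error
    alpha = alpha_pos if delta > 0 else alpha_neg
    return value + alpha * delta


def simulate(da_level, p_reward=0.8, n_trials=200, seed=0):
    """Learn the value of a single option rewarded with probability p_reward.

    Outcomes are +1 (reward) or -1 (punishment). Returns the final value.
    """
    rng = random.Random(seed)
    alpha_pos, alpha_neg = learning_rates(da_level)
    value = 0.0
    for _ in range(n_trials):
        outcome = 1.0 if rng.random() < p_reward else -1.0
        value = update_value(value, outcome, alpha_pos, alpha_neg)
    return value


if __name__ == "__main__":
    for da in (0.2, 0.5, 0.8):
        print(f"DA level {da:.1f}: learned value {simulate(da):+.3f}")
```

Under these assumptions, the same sequence of outcomes yields a more optimistic value estimate at a high simulated DA level than at a low one, because punishments are weighted less and rewards more. The quantitative mapping is arbitrary and should not be read as an estimate of any reported effect.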
Along with DA, serotonin (5-HT) is the second neurotransmitter crucially implicated in reinforcement learning. Published reports (Bari et al., 2010; Chamberlain, Muller, Blackwell, Clark, Robbins & Sahakian, 2006; Cools, Robinson & Sahakian, 2008; Fischer & Ullsperger, 2017; Rygula et al., 2015; Sachs, Rodriguiz, Tran, Iyer, Wetsel & Caron, 2015) have suggested that increasing 5-HT neurotransmission leads to a reduced sensitivity to aversive outcomes. Conversely, reducing 5-HT transmission leads to an increased sensitivity to aversive outcomes, whether this is achieved by acute tryptophan depletion (ATD; a procedure that has been used extensively to study the effect of low 5-HT levels in the human brain), by pre-synaptic receptor stimulation (acting to temporarily down-regulate 5-HT transmission), or by up-regulation of the serotonin transporter (SERT). Indeed, a study by Chamberlain and colleagues (2006) demonstrated that a low, acute dose of the selective serotonin reuptake inhibitor (SSRI) citalopram, which has been postulated to affect pre-synaptic serotonin 5-HT1A autoreceptors, increases the sensitivity to negative feedback in the PRL task. Similar effects were reported in a study by Cools and collaborators (2008), where ATD enhanced the ability of subjects to predict punishment without affecting reward predictions. The results of the above-mentioned studies were complemented by the report by den Ouden and collaborators (2013), who studied the role of 5-HT (and DA) in reinforcement sensitivity as a function of two polymorphisms in the genes encoding the 5-HT and DA transporters (SERT: 5-HTTLPR plus rs25531; DAT1 3′UTR VNTR). The results of this study revealed that allelic variation in SERT predicted behavioural adaptation following punishment. 
Specifically, L′ homozygosity, which has been linked with increased SERT binding and decreased levels of extracellular 5-HT (Willeit & Praschak-Rieder, 2010), was associated with increased negative reinforcement learning (den Ouden et al., 2013). The role of serotonin in RBCBs has been supported by ample evidence from research using animal models. Bari and collaborators (2010) demonstrated that different manipulations of 5-HT neurotransmission in rats resulted in altered sensitivity to positive and negative reinforcement. In this study, acute administration of a high dose of the SSRI citalopram decreased negative feedback sensitivity, as reflected in a lower ratio of lose-shift behaviours. In contrast, acute administration of a low dose of this drug increased sensitivity to negative reinforcement; as in the above-mentioned study by Chamberlain and collaborators (2006), the low dose was postulated to temporarily silence 5-HT system activity via inhibitory serotonin 5-HT1A autoreceptor activation in the raphe nuclei. A similar effect was reported following global 5-HT depletion (Bari et al., 2010). Ineichen and colleagues (2012) demonstrated, in an automated two-choice operant spatial discrimination paradigm, that a genetic reduction in SERT function (studied in heterozygous mutant mice from a SERT null mutant strain) led to a decreased sensitivity to negative feedback, an effect similar to that observed by Bari and collaborators (2010) following acute administration of the higher dose of citalopram. Both of the mentioned manipulations also caused increased ratios of win-stay behaviours, indicating increased sensitivity to positive reinforcement (Ineichen et al., 2012), an effect confirmed recently by Wilkinson and colleagues (2020). The results of the study by Rygula and collaborators (2014) revealed the important role of 5-HT in modulating cognitive judgement bias in rats. 
In that study, the SSRI citalopram at a low dose of 1 mg/kg significantly biased animals towards positive interpretation of the ambiguous cues, while at higher doses (5 and 10 mg/kg), the animals interpreted the ambiguous cues more negatively. Interestingly, a study from 2017 further demonstrated that the effects of acute 5-HT manipulations on the interpretation of ambiguity might depend on the basal valence of cognitive judgement bias (Golebiowska & Rygula, 2017). In that study, acute administration of escitalopram caused a ‘pessimistic’ shift in the interpretation of ambiguous cues in animals classified as ‘optimistic’ and had no significant effects on those previously classified as ‘pessimistic’.
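Several of the rodent studies discussed above quantify feedback sensitivity in terms of win-stay and lose-shift ratios. As a minimal sketch of how such ratios might be computed from trial-by-trial data, the function below counts how often a choice is repeated after a rewarded trial and how often it is changed after an unrewarded one. The function name `win_stay_lose_shift`, the data layout, and the example session are hypothetical and are not taken from any of the cited studies, whose exact scoring procedures may differ.

```python
def win_stay_lose_shift(choices, rewarded):
    """Return (win_stay_ratio, lose_shift_ratio) for one session.

    choices  -- sequence of chosen options, one per trial (e.g. "L"/"R")
    rewarded -- sequence of booleans, True if that trial was rewarded

    A ratio is None when the session has no trials of the relevant type.
    The final trial is not scored because no choice follows it.
    """
    if len(choices) != len(rewarded):
        raise ValueError("choices and rewarded must have the same length")

    wins = win_stays = losses = lose_shifts = 0
    for t in range(len(choices) - 1):
        stayed = choices[t + 1] == choices[t]
        if rewarded[t]:
            wins += 1
            win_stays += stayed  # repeated a rewarded choice
        else:
            losses += 1
            lose_shifts += not stayed  # abandoned an unrewarded choice

    win_stay_ratio = win_stays / wins if wins else None
    lose_shift_ratio = lose_shifts / losses if losses else None
    return win_stay_ratio, lose_shift_ratio


if __name__ == "__main__":
    # Hypothetical six-trial session, for illustration only.
    session_choices = ["L", "L", "R", "R", "R", "L"]
    session_rewarded = [True, True, False, True, False, False]
    ws, ls = win_stay_lose_shift(session_choices, session_rewarded)
    print(f"win-stay ratio: {ws:.2f}, lose-shift ratio: {ls:.2f}")
```

In this framing, a lower lose-shift ratio corresponds to reduced sensitivity to negative feedback, and a higher win-stay ratio to increased sensitivity to positive feedback.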
Taken together, published reports suggest that increasing 5-HT transmission leads to a reduced sensitivity to aversive outcomes, whereas reducing 5-HT transmission, by way of either ATD, pre-synaptic receptor stimulation, or up-regulation of SERT, leads to an increased sensitivity to aversive outcomes. These results strongly suggest 5-HT as a potential target and serotonergic manipulations as effective treatment strategies in modulating alcohol-drinking outcome expectations.
To summarize this section, psychopharmacologic manipulation of DA and 5-HT neurotransmitter systems implicated in RBCBs may have the potential to provide insights into the development and maintenance of alcohol addiction and should be considered as targets for pharmacological treatment of AUD (Figure 3). Of particular interest is the potential to experimentally vary reinforcement sensitivity and outcome expectations and to do so in a way that builds a direct paradigmatic bridge with the relevant alcohol literature. Further challenges include identification of the specific interactions between neurochemical correlates of RBCBs and brain processes involved in alcohol addiction.