A, Overall task structure. Participants completed trials of the same valence (reward learning or loss learning) with stimuli remaining consistent throughout a block. Once participants had learned stimulus contingencies for a block, a new block began with different stimuli and the other valence (loss or reward). The task ended when participants had at least 25 correct trials and 50 total trials for both reward learning and loss learning (median number of trials completed, 50). B, Schematic depiction of reward learning and loss learning trials. On each trial, participants were presented with 2 abstract stimuli. After participants chose a stimulus, the chosen option was highlighted for a brief period and then an outcome (monetary reward or loss) was shown. Participants learned which option led to the better (75% probability of high reward or low loss) outcome. The task involved blocks consisting of trials with all reward (top) and all loss (bottom) outcomes. C, Reward learning and loss learning performance by symptom severity among all 101 participants. Performance was quantified as proportion of choices that were the better option. Over time, participants showed learning (running mean over 3 trials; averaged over all blocks by valence). Top panels comprise reward learning blocks while bottom panels comprise loss learning blocks. Behavior is separated by anhedonia, negative affect, and anxious arousal symptom severity, with participants with symptoms in the lowest tercile marked by a solid line, the middle tercile by a dashed line, and the highest tercile by a dotted line. The lines indicate mean scores, and the shaded areas indicate SEs.
A, Behaviorally, in reward learning in participants with depression, greater anhedonia is associated with lower learning rate and higher outcome sensitivity. Violin plots indicate posterior distribution of the association between the behavioral learning parameter and symptom measure. Anhedonia refers to the Mood and Anxiety Symptom Questionnaire anhedonia scale, negative affect to the Mood and Anxiety Symptom Questionnaire general distress scale, and anxious arousal to the Mood and Anxiety Symptom Questionnaire anxious arousal scale. Posterior distributions are from regressions testing each scale separately for associations with parameters; results were similar when testing all scales in the same analysis. B, Neurally, across all participants, anhedonia moderated the association between striatal prediction error (PE) responses and expected value. Low, medium, and high anhedonia were based on tercile split for the right striatum region of interest activation to PE (x-axis) and expected value (y-axis). Lines indicate regression lines for each group to illustrate moderation of the PE to expected value association by anhedonia. C, Behaviorally, in loss learning across all participants, greater negative affect was associated with more negative outcome shift. Violin plots indicate posterior distribution of the association between behavioral learning parameter and symptom measure. Posterior distributions are from regressions testing each scale separately for associations with parameters; results were similar when testing all scales in the same analysis. D, Neurally, negative affect was negatively associated with subgenual anterior cingulate cortex (sgACC) signaling of PE at the time of outcome receipt. Dots indicate individual participants’ negative affect severity vs sgACC region of interest activation to PE; the regression line indicates the overall negative association.
A, Changes in symptoms from pretreatment to posttreatment for 28 patients. Total symptoms (sum of Mood and Anxiety Symptom Questionnaire anhedonia, general distress, and anxious arousal scales), anhedonia (Mood and Anxiety Symptom Questionnaire anhedonia scale), negative affect (Mood and Anxiety Symptom Questionnaire general distress scale), and anxious arousal (Mood and Anxiety Symptom Questionnaire anxious arousal scale) all decreased on average from pretreatment to posttreatment with cognitive behavioral therapy, but with large heterogeneity in treatment response across participants. Individual lines of symptom change are colored by the degree of overall symptom improvement (green: high improvement; orange: low improvement) to illustrate consistent rates of improvement across individual subscales. B, Association between changes in reinforcement learning parameters and symptom changes with cognitive behavioral therapy. Increases in learning rate were correlated with symptom improvement for reward learning, and increases in outcome shift were correlated with symptom improvement for loss learning. Violin plots indicate posterior distribution of the association between changes in behavioral learning parameter and percentage change in symptoms.
eFigure 1. Association of overall performance with MDD diagnosis and symptom severity.
eFigure 2. Model fit and parameter recovery.
eFigure 3. Posterior distributions and individual means of parameters.
eFigure 4. Association of model parameters with model-agnostic summaries of behavior.
eFigure 5. Association between anhedonia and reward learning parameters by diagnosis.
eFigure 6. Behavioral performance and neural reward signals by depression status and overall depression severity.
eFigure 7. Differences in processing of loss outcomes by negative affect.
eFigure 8. Stability of parameter estimates over time for control participants.
eFigure 9. Parameter changes with time for participants with depression.
eFigure 10. Diagram of flow of participants through study, including optional CBT portion.
eFigure 11. Neural responses associated with symptom improvement.
eFigure 12. Schematic depiction of effects of outcome sensitivity and outcome shift on valuation.
eTable 1. Reward prediction error, MDD group (n = 69).
eTable 2. Reward expected value, MDD group (n = 69).
eTable 3. Reward prediction error, controls without depression (n = 32).
eTable 4. Reward expected value, controls without depression (n = 32).
eTable 5. Loss outcome value correlated with negative affect (MASQ Mixed Distress subscale; n = 101).
eTable 6. Loss outcome value, low negative affect participants (n = 52).
eTable 7. Loss outcome value, high negative affect participants (n = 49).
eTable 8. Depression diagnosis, specifier, severity, medications, and comorbid diagnoses for participants.
eTable 9. Exploratory follow-up analyses of associations between symptom change and reinforcement learning parameters in participants with depression who completed CBT.
Brown VM, Zhu L, Solway A, et al. Reinforcement Learning Disruptions in Individuals With Depression and Sensitivity to Symptom Change Following Cognitive Behavioral Therapy. JAMA Psychiatry. 2021;78(10):1113–1122. doi:10.1001/jamapsychiatry.2021.1844
Are depression symptoms associated with features of reinforcement learning, and if so, is treatment-related symptom change associated with learning changes?
In this mixed cross-sectional–cohort study including 101 participants, participants with and without depression completed a probabilistic learning task during functional magnetic resonance imaging; participants with depression were reassessed after cognitive behavioral therapy (CBT). Computational model–based analyses of behavioral choices and neural data identified associations of learning with symptoms during reward learning and loss learning, respectively; symptom improvement following CBT was associated with normalization of learning parameters.
Mapping reinforcement learning processes to symptoms of depression reveals mechanistic features of these symptoms and points to possible learning-based therapeutic processes and targets.
Major depressive disorder is prevalent and impairing. Parsing neurocomputational substrates of reinforcement learning in individuals with depression may facilitate a mechanistic understanding of the disorder and suggest new cognitive therapeutic targets.
To determine associations among computational model–derived reinforcement learning parameters, depression symptoms, and symptom changes after treatment.
Design, Setting, and Participants
In this mixed cross-sectional–cohort study, individuals performed reward and loss variants of a probabilistic learning task during functional magnetic resonance imaging at baseline and follow-up. A volunteer sample with and without a depression diagnosis was recruited from the community. Participants were assessed from July 2011 to February 2017, and data were analyzed from May 2017 to May 2021.
Main Outcomes and Measures
Computational model–based analyses of participants’ choices assessed a priori hypotheses about associations between components of reward-based and loss-based learning with depression symptoms. Changes in both learning parameters and symptoms were then assessed in a subset of participants who received cognitive behavioral therapy (CBT).
Of 101 included adults, 69 (68.3%) were female, and the mean (SD) age was 34.4 (11.2) years. A total of 69 participants with a depression diagnosis and 32 participants without a depression diagnosis were included at baseline; 48 participants (28 with depression who received CBT and 20 without depression) were included at follow-up (mean [SD] of 115.1 [15.6] days). Computational model–based analyses of behavioral choices and neural data identified associations of learning with symptoms during reward learning and loss learning, respectively. During reward learning only, anhedonia (and not negative affect or arousal) was associated with model-derived learning parameters (learning rate: posterior mean regression β = −0.14; 95% credible interval [CrI], −0.12 to −0.03; outcome sensitivity: posterior mean regression β = 0.18; 95% CrI, 0.02 to 0.37) and neural learning signals (moderation of association between striatal prediction error and expected value signals: t97 = −2.10; P = .04). During loss learning only, negative affect (and not anhedonia or arousal) was associated with learning parameters (outcome shift: posterior mean regression β = −0.11; 95% CrI, −0.20 to −0.01) and disrupted neural encoding of learning signals (association with subgenual anterior cingulate prediction error signals: r = −0.28; P = .005). Symptom improvement following CBT was associated with normalization of learning parameters that were disrupted at baseline (reward learning rate: posterior mean regression β = 0.15; 90% CrI, 0.001 to 0.41; loss outcome shift: posterior mean regression β = 0.42; 90% CrI, 0.09 to 0.77).
Conclusions and Relevance
In this study, the mapping of reinforcement learning components to symptoms of major depression revealed mechanistic features associated with these symptoms and points to possible learning-based therapeutic processes and targets.
Major depressive disorder affects approximately 7% of people in the US each year1 and is among the highest causes of disability in the world.2 However, characterizing and treating major depressive disorder is hampered by significant symptom heterogeneity.3 Recent paradigms4 suggest that moving beyond diagnostic status to focus on associations of major depression’s central impairments of anhedonia and negative affect5 with neurocomputational substrates of reinforcement learning6,7 may more precisely identify disrupted processes in individuals with depression and novel treatment targets. To that end, we sought to investigate the association of computational model–derived learning impairments with canonical depression symptoms and tested the translational relevance of these impairments by examining their responsiveness to symptom change after cognitive behavioral therapy (CBT).
According to computational formalizations of reinforcement learning, expectations about the outcomes of choices are updated based on prediction errors.7-9 This framework separates learning into computationally derived components associated with behaviorally and neurobiologically distinguishable processes (eg, outcome valuation vs expectation updating10,11). Computational model–based analyses differentiate and quantify these learning processes as model parameters that may then be associated with symptoms at the individual level. For depression, this approach has the potential to identify sources of disrupted responsivity to rewards and losses, including altered value updating following feedback, relative valuation of positive or negative feedback, and changes in overall valuation of outcomes.6
Such learning processes may be affected by both stimulus valence (eg, learning from rewards vs losses) and depression symptoms.6,12-21 With regard to the canonical symptoms of depression, anhedonia (ie, reduced experience of pleasure) affects reward learning more than depression as a whole,10,22-25 while negative affect, characterized by subjective distress and negative cognitions, may be associated with altered loss and error processing.12,13,18,20 This link between symptom clusters and neurobehavioral alterations is consistent with other work showing symptom, not diagnosis, effects.26-28 Initial findings combining these literatures13,17,18,29 suggest valence-dependent roles of learning anomalies in depression, but to our knowledge, no study has fully examined which reward learning and loss learning processes are associated with the core depressive symptoms of anhedonia and negative affect.
Demonstrating sensitivity to symptom change is critical to establishing the translational relevance of biobehavioral markers of psychiatric illness.30,31 Some evidence suggests that successful depression treatment may normalize reward responses and, in youth, reduce overresponsivity to punishments,32-34 but how these changes are associated with baseline impairments and whether they map onto mechanistic learning processes are unclear. To address these issues, we examined participants with depression who engaged in CBT, an efficacious psychotherapy theorized to reduce symptoms in part through changing learning,35-37 and tested whether symptom improvement following CBT was associated with changes in learning components. Given previous work indicating correlated decreases in all symptom measures after CBT,38,39 these analyses focused on learning parameter changes associated with overall symptom change rather than specific symptom subscales following CBT.
To summarize, we examined participants with and without a depression diagnosis performing reward and loss variants of a learning task while undergoing functional magnetic resonance imaging; a subset of the participants with depression was retested after completing CBT. We hypothesized that distinct processes in reward and loss learning, captured by computational model–derived parameters measuring aspects of updating and valuation and their corresponding neural signals, would be associated with symptoms of anhedonia and negative affect, respectively. Moreover, we posited that changes in these reward and loss learning parameters would be correlated with symptom improvement after treatment.
A total of 101 participants were recruited via community advertisements from southwest Virginia and Houston, Texas. The Baylor College of Medicine and Virginia Tech institutional review boards approved study procedures, and all participants provided written informed consent after receiving a complete description of the study. A total of 69 participants with depression had a primary DSM-IV40 diagnosis of major depressive disorder or dysthymia, assessed with the Structured Clinical Interview for DSM-IV41; 32 nonpsychiatric control participants had no history of any DSM disorder. Participants completed a battery of measures, including the Mood and Anxiety Symptom Questionnaire (MASQ),42 a validated self-report measure of symptom clusters of anhedonia (anhedonic depression subscale), negative affect (general distress subscale), and arousal (anxious arousal subscale), which were the primary symptoms of interest, as well as the Beck Depression Inventory-II43 to assess overall depression severity, the Wechsler Test of Adult Reading44 to estimate verbal IQ, and a demographics questionnaire. The eMethods and eTable 8 in the Supplement contain further details about participants and measures. This study followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guideline.
Participants completed reward and loss variants of a probabilistic operant learning task (Figure 1A) with the goal of learning which of 2 options was more likely to lead to a higher outcome (larger reward in reward learning blocks or smaller loss in loss learning blocks; Figure 1B)45 while undergoing functional magnetic resonance imaging. The task was presented in pseudorandomized blocks of trials consisting of all reward outcomes or all loss outcomes (learning curves by baseline symptoms shown in Figure 1C). The eMethods in the Supplement contains further task design details.
Model-based analyses used reinforcement learning models to test hypotheses about potential sources of learning disruptions in participants with depression (updating, relative valuation of more negative or positive outcomes, and overall valuation changes6,10,46). The best-fitting model separated learning by valence (reward vs loss) and included 3 free parameters for both reward learning and loss learning: learning rate (α), which indexed the degree of updating based on prediction error; outcome sensitivity (ρ), which multiplicatively scaled more extreme outcome values, resulting in differential valuation of large vs small outcome values; and outcome shift (τ), which linearly shifted all outcome values, resulting in an overall positive or negative valuation bias. Model validation showed that parameters could be independently estimated, were associated with model-agnostic behavior, had good split-half and test-retest reliability, and were stable over time in participants without depression (eMethods and eFigures 2 to 4 and 12 in the Supplement).
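The roles of the 3 parameters can be illustrated with a minimal sketch. The exact functional form of the fitted model is given in the Supplement and is not reproduced here; the sketch below assumes, for illustration only, that the subjective outcome is ρ × outcome + τ and that the value of the chosen option is updated by α times the prediction error. The function name `run_learner`, the starting value `q_init`, and the example inputs are hypothetical, and the choice rule is omitted.

```python
from typing import List, Sequence, Tuple


def run_learner(
    choices: Sequence[int],
    outcomes: Sequence[float],
    alpha: float,
    rho: float,
    tau: float,
    q_init: float = 0.0,
) -> Tuple[List[float], List[float]]:
    """Return trialwise chosen expected values and prediction errors.

    Assumed (illustrative) form, not taken from the article:
        subjective outcome = rho * outcome + tau
        delta_t            = subjective outcome - Q_t(chosen)
        Q_{t+1}(chosen)    = Q_t(chosen) + alpha * delta_t
    """
    q = [q_init, q_init]          # expected values of the 2 options
    q_chosen: List[float] = []    # Q_t of the chosen option, before updating
    deltas: List[float] = []      # prediction error at outcome
    for choice, outcome in zip(choices, outcomes):
        subjective = rho * outcome + tau   # outcome sensitivity and shift
        delta = subjective - q[choice]     # prediction error
        q_chosen.append(q[choice])
        deltas.append(delta)
        q[choice] += alpha * delta         # learning rate scales the update
    return q_chosen, deltas


# Hypothetical 3-trial example: option 0 chosen twice, then option 1.
values, errors = run_learner([0, 0, 1], [1.0, 0.0, 1.0],
                             alpha=0.5, rho=2.0, tau=-0.5)
```

Under these assumptions, a lower α slows how quickly expected values track outcomes, a larger ρ widens the gap between large and small outcomes, and a more negative τ lowers the valuation of every outcome by the same amount.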
Parametric regressors of interest in first-level imaging analyses used prediction error δt at outcome and chosen expected value Qt at stimulus onset. Analyses focused on meta-analytically–defined47 regions of interest (ROIs) in ventral striatum and ventromedial prefrontal cortex/subgenual anterior cingulate cortex, primary brain regions implicated in reinforcement learning.48 Functional magnetic resonance imaging data collection, preprocessing, and further analysis information are contained in the eMethods in the Supplement.
A total of 28 participants with depression who elected to engage in a 12-week course of standard, manual-guided CBT49 were assessed following CBT completion with the same procedures described above (eFigure 10 in the Supplement for a diagram of participant flow). These analyses assessed associations between symptom improvement, well-established with CBT,35,50 and learning parameters. Further details on CBT, CBT-related analyses, and symptom-independent parameter change (including stability of parameters in 20 nonpsychiatric control participants) can be found in eMethods and eFigures 8 and 9 in the Supplement.
Associations between symptoms and learning parameters were estimated within models fit to participants’ choices. Models were fit using hierarchical Bayesian estimation51 with data from all participants in the pertinent analysis; significant associations between symptoms and learning parameters were defined as a 95% posterior credible interval (CrI) of the regression coefficient for the association of symptom with learning parameter excluding 0, analogous to a frequentist α of .05. The posterior mean, the posterior mean divided by the posterior standard deviation (approximate standardized regression β), and 95% CrIs are reported. To control for false-positive rates, all associations were assessed at baseline (and similarly for treatment analyses) using hierarchical modeling52; additional Bayesian error control53 approaches restricted experimentwise error to 5%. Simulation-based power analyses, assuming 80% power, indicated we were powered to detect small effect sizes in regression analyses between symptom severity measures and learning parameters involving all 101 participants and medium effect sizes in analyses of 69 participants with depression only. See the eMethods in the Supplement for modeling details.
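The decision rule described above can be sketched as a summary of posterior draws for one regression coefficient. This is an illustration rather than the analysis code: it assumes an equal-tailed credible interval (the article does not state whether equal-tailed or highest-density intervals were used), and the function names and example draws are hypothetical.

```python
import statistics
from typing import Dict, List, Sequence


def quantile(ordered: Sequence[float], p: float) -> float:
    """Linearly interpolated quantile of an already sorted sequence."""
    pos = p * (len(ordered) - 1)
    low = int(pos)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (pos - low) * (ordered[high] - ordered[low])


def summarize_posterior(samples: List[float], level: float = 0.95) -> Dict:
    """Posterior mean, mean/SD, equal-tailed CrI, and whether the CrI excludes 0."""
    ordered = sorted(samples)
    tail = (1.0 - level) / 2.0
    lower = quantile(ordered, tail)
    upper = quantile(ordered, 1.0 - tail)
    mean = statistics.fmean(samples)
    sd = statistics.stdev(samples)
    return {
        "mean": mean,
        "standardized_mean": mean / sd,   # approximate standardized beta
        "cri": (lower, upper),
        "excludes_zero": lower > 0.0 or upper < 0.0,
    }


# Hypothetical draws; real draws would come from the fitted hierarchical model.
draws = [0.5 + i * 0.01 for i in range(101)]
summary = summarize_posterior(draws, level=0.95)
```

Passing `level=0.90` gives the interval width used for the directional treatment analyses.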
To test the association of neural activation with symptom measures, regressor-related blood oxygenation level–dependent activity (ROI values) was correlated with symptom measures using linear regressions and Pearson correlations. Following previous literature, for reward learning, the moderation of expected value and prediction error–related activity by symptoms was evaluated by testing the interaction of prediction error neural signal and symptoms on expected value neural signal in striatal ROIs.25 Frequentist analyses used an α level of .05 from 2-tailed tests.
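The moderation test can be sketched as an ordinary least squares regression of the expected value signal on the prediction error signal, the symptom score, and their product, with the t statistic taken from the interaction term. The sketch assumes this 4-coefficient specification with no additional covariates; with 101 participants it leaves 97 residual degrees of freedom, which is consistent with the t97 reported in the Results. Function names and example data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def interaction_test(
    ev: Sequence[float], pe: Sequence[float], symptom: Sequence[float]
) -> Tuple[float, float, int]:
    """Return (interaction coefficient, t statistic, residual df)."""
    rows = [[1.0, p, s, p * s] for p, s in zip(pe, symptom)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ev)) for i in range(k)]
    coef = solve_linear(xtx, xty)
    resid = [y - sum(c * x for c, x in zip(coef, r)) for r, y in zip(rows, ev)]
    df = len(rows) - k
    sigma2 = sum(e * e for e in resid) / df
    # Last column of (X'X)^-1 gives the variance factor of the interaction term.
    inv_last = solve_linear(xtx, [0.0, 0.0, 0.0, 1.0])
    se = math.sqrt(sigma2 * inv_last[3])
    return coef[3], coef[3] / se, df


# Hypothetical data: 5 prediction error levels crossed with 4 symptom levels.
pe = [float((i % 5) - 2) for i in range(20)]
sym = [(i // 5) - 1.5 for i in range(20)]
noise = [0.01 * (-1) ** i for i in range(20)]
ev = [1 + 2 * p + 0.5 * s - 1.5 * p * s + e for p, s, e in zip(pe, sym, noise)]
b_int, t_int, df = interaction_test(ev, pe, sym)
```

A negative interaction coefficient indicates that the association between the 2 neural signals weakens as symptom severity increases.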
Effects of symptom change following CBT (time × symptom analysis) were assessed as the association between changes in learning parameters and symptom changes (improvement pretreatment to posttreatment, similar to a mixed-effects analysis with 2 time points) in the participants with depression, controlling for initial symptom severity. Like baseline analyses, the CrI (set to 90% for significance to reflect directionality of hypotheses), the mean value of this association, and the standardized mean value as a measure of effect size are reported. As symptom subscale changes with CBT are typically highly intercorrelated,38,39 primary analyses focused on changes in learning parameters against changes in overall symptoms; significant associations were then examined with exploratory analyses within anhedonia, negative affect, and arousal subscales. Frequentist analyses involving symptom change used an α level of .05 and 1-tailed tests based on directional hypotheses. Analyses were carried out in R version 3.6.0 (The R Foundation) and Stan version 2.19 (using the rstan package in R).
Of 101 included adults, 69 (68.3%) were female, and the mean (SD) age was 34.4 (11.2) years. A total of 69 participants with a depression diagnosis and 32 participants without a depression diagnosis were included at baseline; 48 participants (28 with depression who received CBT and 20 without depression) were included at follow-up (mean [SD] of 115.1 [15.6] days). Clinical and demographic data are reported in Table 1. As expected, participants with a depression diagnosis had elevated symptoms but did not differ from participants without depression on estimated IQ, age, or self-reported gender.
We tested associations of computational model–derived learning parameters (learning rate, outcome sensitivity, and outcome shift) with symptom severity (MASQ subscales of anhedonia, negative affect, and anxious arousal, tested in separate regressions) during reward and loss learning (Table 2). Follow-up analyses simultaneously assessed all 3 MASQ subscales in the same analysis and tested depression diagnosis and Beck Depression Inventory-II score as a measure of overall depression severity (eResults in the Supplement). Analyses were carried out across all participants and within participants with a diagnosis of depression only.
During reward learning, in participants with depression, greater anhedonia was associated with reduced learning rate, indicating slower updating of reward values with increased anhedonia, and with greater outcome sensitivity parameter values (learning rate: posterior mean regression β = −0.14; 95% CrI, −0.12 to −0.03; outcome sensitivity: posterior mean regression β = 0.18; 95% CrI, 0.02 to 0.37; Figure 2A). These associations were apparent with anhedonia and absent with negative affect, arousal, or depression diagnosis. No measures were associated with changes in the outcome shift parameter. Results were similar when assessing all MASQ scales in the same analysis (eResults in the Supplement). When including participants without depression, these associations were not present (association of anhedonia and learning rate: mean, −0.05; 95% CrI, −0.01 to 0.07; standardized mean, 0.95; outcome sensitivity: mean, −0.03; 95% CrI, −0.15 to 0.09; standardized mean, −0.50). Model-agnostic results (eResults and eFigures 1 and 5 in the Supplement) were also consistent with an association between reward learning and anhedonia in participants with clinical levels of depression.
Reward prediction error and expected value signals did not vary with any symptom measure or depression diagnosis (meta-analytically defined striatal or ventromedial prefrontal cortex ROIs or exploratory whole-brain analysis; eFigure 6 and eTables 1 to 4 in the Supplement). Following previous work, to assess if anhedonia disrupted associations among otherwise intact striatal signals,25,54 we investigated associations between prediction error (at outcome) and expected value (at choice) in ventral striatum. In line with this previous work, anhedonia moderated the association between expected value and prediction error signals (interaction: t97 = −2.10; P = .04; Figure 2B), which was not associated with learning rate differences.
During loss learning, a different pattern of associations emerged such that negative affect severity was associated with more negative outcome shift parameter values, indicating more negative valuation of losses (outcome shift: posterior mean regression β = −0.11; 95% CrI, −0.20 to −0.01; Figure 2C). This association was present in all participants, regardless of depression diagnosis, and was not observed with anhedonia, arousal, or depression diagnosis. Associations were similar when assessing all MASQ scales simultaneously. No symptom subscales were associated with loss learning rate or outcome sensitivity parameters.
Prediction error activity in the subgenual anterior cingulate cortex ROI was negatively associated with negative affect (r = −0.28; P = .005; Figure 2D), with no differences in striatal activity or expected value signals. Exploratory follow-up analyses (eMethods, eFigure 7, and eTables 5 to 7 in the Supplement) suggested reduced subgenual anterior cingulate cortex representation of outcome value in participants with high negative affect drove this association. Expected value and the association between expected value and prediction error were unrelated to symptom measures during loss learning.
The association of baseline symptoms with learning parameters suggested the translational potential of reinforcement learning processes beyond descriptive characterizations of depression. We thus sought to assess whether these altered learning processes were associated with symptom changes following CBT in participants with depression.
As expected, after CBT, participants showed large mean decreases in all symptoms (Figure 3A). Consistent with the literature,55 participants showed heterogeneous degrees of change, enabling investigation of individual differences in symptom change (Figure 3A). As within-participant changes in symptom measures were highly correlated (eg, correlation between change in anhedonia and negative affect: r = 0.62; P < .001), analyses focused on overall improvement (summing anhedonia + negative affect + arousal scales; if significant, exploratory analyses, reported in eTable 9 in the Supplement, focused on individual subscale change) as associated with changes in learning parameters (outcome sensitivity, outcome shift, and learning rate during reward learning and loss learning).
As described above, at baseline, reward learning rate was negatively correlated and reward outcome sensitivity positively correlated with anhedonia. Increases in reward learning rate following CBT were significantly associated with overall symptom improvement, including improved anhedonia (reward learning rate: posterior mean regression β = 0.15; 90% CrI, 0.001 to 0.41; Figure 3B; Table 2). Changes in reward outcome sensitivity were not significantly associated with overall symptom improvement. Changes in reward outcome shift, unrelated to symptoms at baseline, were also not associated with symptom change.
During loss learning at baseline, the outcome shift learning parameter showed a negative association with negative affect. Increases in loss outcome shift after CBT were significantly associated with overall symptom improvement, including improved negative affect (loss outcome shift: posterior mean regression β = 0.42; 90% CrI, 0.09 to 0.77; Figure 3B; Table 2). Changes in loss learning rate and outcome sensitivity, which were unrelated to symptoms at baseline, were also not associated with changes in overall symptoms.
For reward learning, at baseline, anhedonia moderated associations between prediction error and expected value signaling in striatum; following CBT, participants with depression with high anhedonia showed a significant change in the correlation between ventral striatum signaling to prediction error and expected value from pretreatment to posttreatment (Fisher r to z = 1.65; 1-tailed P = .05; eFigure 11A in the Supplement). In participants without depression, this correlation was stable across time (z = 0.46; P = .65), and the overall interaction across all participants was significant (interaction of baseline anhedonia with the association between changes in expected value and prediction error signals in ventral striatum: t43 = 1.85; 1-tailed P = .04), indicating a shift only in participants with high anhedonia. For loss learning, changes in prediction error signaling in subgenual anterior cingulate cortex (related to negative affect at baseline) were not associated with improvements in negative affect (eFigure 11B in the Supplement).
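The Fisher r-to-z comparison can be sketched as follows. The tail-area functions are standard; the comparison function uses the textbook formula for 2 independent correlations, which is an assumption here, because pretreatment and posttreatment correlations come from the same participants and the article does not state which variant was applied. Function names and the example inputs are hypothetical.

```python
import math


def one_tailed_p(z: float) -> float:
    """Upper-tail area of the standard normal distribution at z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def two_tailed_p(z: float) -> float:
    """Two-tailed area of the standard normal distribution at |z|."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def compare_correlations(r1: float, n1: int, r2: float, n2: int) -> float:
    """z statistic for the difference between 2 correlations.

    Assumes independent samples; a dependent-samples variant would be
    needed to account for repeated measurement of the same participants.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)   # Fisher r-to-z transform
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / se


# Hypothetical correlations and sample sizes, for illustration only.
z_stat = compare_correlations(0.5, 28, 0.1, 28)
p_one = one_tailed_p(z_stat)
```

The reported pairs (z = 1.65 with 1-tailed P = .05; z = 0.46 with P = .65) are consistent with these normal tail areas, the second when read as a 2-tailed value.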
Here, we used a computational model of reinforcement learning to distinguish among learning processes in participants with and without depression and showed, across neural and behavioral levels, associations of anhedonia with reduced updating but greater differentiation of rewards (captured by reward learning rate and outcome sensitivity parameters, respectively) and of negative affect with more negative valuation of losses (captured by the loss outcome shift parameter). Broad symptom improvement after CBT, including improved anhedonia and negative affect, was associated with normalization of reward learning rate and loss outcome shift disruptions, respectively.
Similar to other studies with large patient samples,25,54,56 for reward learning, we found no support for a reduction in valuation with anhedonia but rather a moderation of correlations between neural expected value and prediction error signals in ventral striatum. These results suggest that highly anhedonic individuals paradoxically process large rewards as more rewarding but then fail to update future reward expectations and are consistent with previous findings of increased immediate responsivity but reduced long-term effects of rewards in individuals with depression.57,58 During loss learning, participants higher in negative affect showed more negative valuation of outcomes and no learning rate variation, suggesting that maladaptive overresponsivity to losses observed in individuals with depression12,15,20 may be due to valuing negative feedback more negatively and not to overadjusting following negative feedback. Overall, our findings of altered learning processes at baseline show that depression impairs aspects of reward learning and loss learning but in potentially distinct ways: poor reward learning reflects slow updating, while disrupted loss learning results from pessimistic valuation of outcomes. Of interest, the increased reward outcome sensitivity in participants with depression with greater anhedonia is discrepant from previous reports10,59 that included participants lower in anhedonia or smaller patient samples not permitting dimensional analyses. Future studies are warranted to fully delineate the nature and impact of associations between anhedonia and outcome sensitivity in depression.
Of clinical importance is whether reinforcement learning processes are sensitive to treatment, which would indicate a potential causal relationship between learning changes and symptom improvement. We indeed found that symptom change following CBT was correlated with remediation of altered learning parameters and neural responses. Specifically, greater symptom improvement was accompanied by increased reward learning rate, a normalized association between neural signals of expected value and prediction error during reward learning, and a more positive outcome shift during loss learning. The learning changes have conceptual overlap with CBT’s focus on challenging negative evaluations and reflecting on outcomes of pleasurable activities.49 These associations suggest that model-derived learning parameters go beyond describing alterations at baseline and are sensitive to treatment-induced changes in symptoms. The data may thus inform the future development and testing of symptom-based or parameter-based therapies that directly target behavioral or neural circuits involved in reinforcement learning. Model-derived learning parameters may also be used to improve outcomes and tailor extant treatments to individual symptom presentations (eg, focusing on updating reward expectations in patients high in anhedonia vs more positive valuation of negative outcomes in patients high in negative affect).
The limitations of this work warrant attention. First, although our sample size was comparatively large, even larger samples would ensure adequate power to detect smaller effects, particularly with more conservative Bayesian multilevel analyses,53,60 and to statistically dissociate changes in specific symptom clusters after treatment. Second, while comparable with other work of this scope, our exclusion rate owing to issues with scanning or behavioral data was relatively high and may affect the generalizability of results. Indeed, although participants with depression were not excluded more often than those without depression, excluded participants did have lower estimated IQ. In addition, larger, more clinically heterogeneous samples may be needed to detect symptom subscale–specific changes or reward outcome sensitivity changes following CBT. Future work may also clarify the specificity and sensitivity of learning parameters to change by comparing changes in associations of learning with symptoms between CBT and other treatments or natural variation in symptoms over time.
In individuals with depression, associations between symptoms and disrupted valuation have long been hypothesized but difficult to dissociate. By parsing components of value-based learning in a large, well-characterized sample, we show that associations of learning with symptoms in individuals with depression are present and may vary by valence and learning process. The remediation of computational model–identified learning processes associated with symptom changes after CBT suggests a mechanistic role of learning disruptions in those with depression. More broadly, this work may provide a bridge between behaviorally oriented clinicians and computational (neuro)scientists toward novel integrative ways for understanding and treating depression.
Accepted for Publication: May 21, 2021.
Published Online: July 28, 2021. doi:10.1001/jamapsychiatry.2021.1844
Open Access: This is an open access article distributed under the terms of the CC-BY License. © 2021 Brown VM et al. JAMA Psychiatry.
Corresponding Authors: Pearl H. Chiu, PhD (firstname.lastname@example.org), and Brooks King-Casas, PhD (email@example.com), Fralin Biomedical Research Institute at VTC, Virginia Tech, 2 Riverside Cir, Roanoke, VA 24016.
Author Contributions: Drs Brown and Chiu had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Brown, Wang, King-Casas, Chiu.
Acquisition, analysis, or interpretation of data: All authors.
Drafting of the manuscript: Brown, Chiu.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Brown, Zhu, Solway, Wang, Chiu.
Obtained funding: Chiu.
Administrative, technical, or material support: King-Casas, Chiu.
Study supervision: King-Casas, Chiu.
Conflict of Interest Disclosures: None reported.
Funding/Support: This research was supported in part by the University of Pittsburgh Center for Research Computing. This work was funded in part by the National Institute of Mental Health (grants MH087692 and MH106756 to Dr Chiu; grant MH122626 to Dr Brown; and grant MH115221 to Dr King-Casas) and the National Natural Science Foundation of China (grant NSFC 32071095 to Dr Zhu).
Role of the Funder/Sponsor: The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Additional Contributions: We gratefully acknowledge our study therapists (Jeanne Berger, PhD; Elizabeth Erwin, MS; Susan Marney, EdD; Rick Milan, PhD; Lou Perrott, PhD; Bruce Sellars, PsyD; and Carol Stockton, MA) and assistance with study implementation and data management from Jessica Eiseman, MS; Kat Gardner, MA; Jacob Lee, MS; Kathleen McLachlan, BS; Rob McNamara, PhD; Jennifer Nguyen, BS; Riley Palmer, BA; Andre Plate, BS; Lauren Reckling, BS; and Cari Rosoff, BA. All contributors were compensated for their work.