[Skip to Content]
Access to paid content on this site is currently suspended due to excessive activity being detected from your IP address 18.206.194.83. Please contact the publisher to request reinstatement.
[Skip to Content Landing]
Figure 1.
Factors Associated With Reliable Improvement–All Sessions
Factors Associated With Reliable Improvement–All Sessions

Forest plot of logistic regression model investigating association between mean number of words per feature across treatment and reliable improvement. Standardized odds ratios and 95% confidence intervals are shown (and listed in the right column). Adjusted for total number of sessions, symptom severity, patient sex, age, medication status, presence of long-term condition, and session duration.

aP < .001.

bP < .01.

cP < .05.

Figure 2.
First-Session Factors Associated With IAPT Engagement
First-Session Factors Associated With IAPT Engagement

Forest plot of logistic regression model investigating association between mean number of words per feature in the first treatment session and patient engagement. Standardized odds ratios and 95% confidence intervals are shown (and listed in the right column). Adjusted for symptom severity, patient sex, age, medication status, presence of long-term condition, and session duration.

aP < .001.

bP < .01.

cP < .05.

Table 1.  
Factors Associated With Reliable Improvement–All Sessionsa
Factors Associated With Reliable Improvement–All Sessionsa
Table 2.  
First-Session Factors Associated With IAPT Engagementa
First-Session Factors Associated With IAPT Engagementa
1.
Kilbourne  AM, Beck  K, Spaeth-Rublee  B,  et al.  Measuring and improving the quality of mental health care: a global perspective.  World Psychiatry. 2018;17(1):30-38. doi:10.1002/wps.20482PubMedGoogle Scholar
2.
Johnsen  TJ, Friborg  O.  The effects of cognitive behavioral therapy as an anti-depressive treatment is falling: a meta-analysis.  Psychol Bull. 2015;141(4):747-768. doi:10.1037/bul0000015PubMedGoogle Scholar
3.
Holmes  EA, Ghaderi  A, Harmer  CJ,  et al.  The Lancet Psychiatry Commission on psychological treatments research in tomorrow’s science.  Lancet Psychiatry. 2018;5(3):237-286. doi:10.1016/S2215-0366(17)30513-8PubMedGoogle Scholar
4.
Waller  G, Turner  H.  Therapist drift redux: why well-meaning clinicians fail to deliver evidence-based therapy, and how to get back on track.  Behav Res Ther. 2016;77:129-137. doi:10.1016/j.brat.2015.12.005PubMedGoogle Scholar
5.
Perepletchikova  F.  On the topic of treatment integrity.  Clin Psychol (New York). 2011;18(2):148-153. doi:10.1111/j.1468-2850.2011.01246.xPubMedGoogle Scholar
6.
Essock  SM, Covell  NH, Weissman  EM.  Inside the black box: the importance of monitoring treatment implementation.  Schizophr Bull. 2004;30(3):613-615. doi:10.1093/oxfordjournals.schbul.a007107PubMedGoogle Scholar
7.
Beck  AT.  Cognitive therapy: nature and relation to behavior therapy.  Behav Ther. 1970;1:184-200. doi:10.1016/S0005-7894(70)80030-2PubMedGoogle Scholar
8.
Ellis  A.  Reason and Emotion in Psychotherapy. New York: Lyle Stuart; 1962.
9.
Perepletchikova  F, Treat  TA, Kazdin  AE.  Treatment integrity in psychotherapy research: analysis of the studies and examination of the associated factors.  J Consult Clin Psychol. 2007;75(6):829-841. doi:10.1037/0022-006X.75.6.829PubMedGoogle Scholar
10.
Barth  J, Munder  T, Gerger  H,  et al.  Comparative efficacy of seven psychotherapeutic interventions for patients with depression: a network meta-analysis.  PLoS Med. 2013;10(5):e1001454. doi:10.1371/journal.pmed.1001454PubMedGoogle Scholar
11.
Blagys  MD, Hilsenroth  MJ.  Distinctive activities of cognitive-behavioral therapy: a review of the comparative psychotherapy process literature.  Clin Psychol Rev. 2002;22(5):671-706. doi:10.1016/S0272-7358(01)00117-9PubMedGoogle Scholar
12.
Mulder  R, Murray  G, Rucklidge  J.  Common versus specific factors in psychotherapy: opening the black box.  Lancet Psychiatry. 2017;4(12):953-962. doi:10.1016/S2215-0366(17)30100-1PubMedGoogle Scholar
13.
Doss  BD, Atkins  DC.  Investigating treatment mediators when simple random assignment to a control group is not possible.  Clin Psychol Sci Pract. 2006;13:321-336. doi:10.1111/j.1468-2850.2006.00045.xGoogle Scholar
14.
Bell  EC, Marcus  DK, Goodlad  JK.  Are the parts as good as the whole? a meta-analysis of component treatment studies.  J Consult Clin Psychol. 2013;81(4):722-736. doi:10.1037/a0033004PubMedGoogle Scholar
15.
Lorenzo-Luaces  L, German  RE, DeRubeis  RJ.  It’s complicated: the relation between cognitive change procedures, cognitive change, and symptom change in cognitive therapy for depression.  Clin Psychol Rev. 2015;41:3-15. doi:10.1016/j.cpr.2014.12.003PubMedGoogle Scholar
16.
Kessler  D, Lewis  G, Kaur  S,  et al.  Therapist-delivered internet psychotherapy for depression in primary care: a randomised controlled trial.  Lancet. 2009;374(9690):628-634. doi:10.1016/S0140-6736(09)61257-5PubMedGoogle Scholar
17.
National Institute for Health and Clinical Excellence. Common mental health disorders: the NICE guideline on identification and pathways to care. London, England: RCPsych Publication; 2011.
18.
Roth  AD, Pilling  S.  Using an evidence-based methodology to identify the competences required to deliver effective cognitive and behavioural therapy for depression and anxiety disorders..  Behav Cogn Psychother. 2008;36(February):129-147. doi:10.1017/S1352465808004141Google Scholar
19.
Clark  DM.  Implementing NICE guidelines for the psychological treatment of depression and anxiety disorders: the IAPT experience.  Int Rev Psychiatry. 2011;23(4):318-327. doi:10.3109/09540261.2011.606803PubMedGoogle Scholar
20.
N.I.C.E. Principles for Best Practice in Clinical Audit. Radcliffe Medical Press; 2002. http://www.uhbristol.nhs.uk/files/nhs-ubht/best_practice_clinical_audit.pdf. Accessed January 31, 2019.
21.
Kroenke  K, Spitzer  RL, Williams  JB.  The PHQ-9: validity of a brief depression severity measure.  J Gen Intern Med. 2001;16(9):606-613. doi:10.1046/j.1525-1497.2001.016009606.xPubMedGoogle Scholar
22.
Spitzer  RL, Kroenke  K, Williams  JB, Löwe  B.  A brief measure for assessing generalized anxiety disorder: the GAD-7.  Arch Intern Med. 2006;166(10):1092-1097. doi:10.1001/archinte.166.10.1092PubMedGoogle Scholar
23.
Blackburn  I, James  IA, Milne  DL,  et al.  The revised cognitive therapy scale (CTS-R): psychometric properties.  Behav Cogn Psychother. 2001;29:431-446. doi:10.1017/S1352465801004040Google Scholar
24.
Landis  JR, Koch  GG.  The measurement of observer agreement for categorical data.  Biometrics. 1977;33(1):159-174. doi:10.2307/2529310PubMedGoogle Scholar
25.
Alford  BA, Beck  AT.  Cognitive therapy of delusional beliefs.  Behav Res Ther. 1994;32(3):369-380. doi:10.1016/0005-7967(94)90134-1PubMedGoogle Scholar
26.
Kazantzis  N, Whittington  C, Zelencich  L, Kyrios  M, Norton  PJ, Hofmann  SG.  Quantity and quality of homework compliance: a meta-analysis of relations with outcome in cognitive behavior therapy.  Behav Ther. 2016;47(5):755-772. doi:10.1016/j.beth.2016.05.002PubMedGoogle Scholar
27.
Beck  JS.  Cognitive Behavior Therapy: Basics and Beyond. 2nd ed. New York, NY: Guilford Press; 2011.
28.
Alford  BA, Beck  AT.  The Integrative Power of Cognitive Therapy. New York, NY: Guilford Press; 1997. doi:10.1891/0889-8391.11.4.309
29.
Wampold  BE.  How important are the common factors in psychotherapy? an update.  World Psychiatry. 2015;14(3):270-277. doi:10.1002/wps.20238PubMedGoogle Scholar
30.
Martin  DJ, Garske  JP, Davis  MK.  Relation of the therapeutic alliance with outcome and other variables: a meta-analytic review.  J Consult Clin Psychol. 2000;68(3):438-450. doi:10.1037/0022-006X.68.3.438PubMedGoogle Scholar
31.
Strunk  DR, Brotman  MA, DeRubeis  RJ.  The process of change in cognitive therapy for depression: predictors of early inter-session symptom gains.  Behav Res Ther. 2010;48(7):599-606. doi:10.1016/j.brat.2010.03.011PubMedGoogle Scholar
32.
Catarino  A, Bateup  S, Tablan  V,  et al.  Demographic and clinical predictors of response to internet-enabled cognitive-behavioural therapy for depression and anxiety.  BJPsych Open. 2018;4(5):411-418. doi:10.1192/bjo.2018.57PubMedGoogle Scholar
33.
Waller  G.  Evidence-based treatment and therapist drift.  Behav Res Ther. 2009;47(2):119-127. doi:10.1016/j.brat.2008.10.018PubMedGoogle Scholar
34.
Shapiro  DA, Shapiro  D.  Meta-analysis of comparative therapy outcome studies: a replication and refinement.  Psychol Bull. 1982;92(3):581-604. doi:10.1037/0033-2909.92.3.581PubMedGoogle Scholar
Limit 200 characters
Limit 25 characters
Conflicts of Interest Disclosure

Identify all potential conflicts of interest that might be relevant to your comment.

Conflicts of interest comprise financial interests, activities, and relationships within the past 3 years including but not limited to employment, affiliation, grants or funding, consultancies, honoraria or payment, speaker's bureaus, stock ownership or options, expert testimony, royalties, donation of medical equipment, or patents planned, pending, or issued.

Err on the side of full disclosure.

If you have no conflicts of interest, check "No potential conflicts of interest" in the box below. The information will be posted with your response.

Not all submitted comments are published. Please see our commenting policy for details.

Limit 140 characters
Limit 3600 characters or approximately 600 words
    Views 14,698
    Citations 0
    Original Investigation
    August 22, 2019

    Quantifying the Association Between Psychotherapy Content and Clinical Outcomes Using Deep Learning

    Author Affiliations
    • 1Clinical Science Laboratory, Ieso Digital Health, Cambridge, England
    JAMA Psychiatry. Published online August 22, 2019. doi:10.1001/jamapsychiatry.2019.2664
    Key Points

    Question  What aspects of psychotherapy content are significantly associated with clinical outcomes?

    Findings  In this quality improvement study, a deep learning model was trained to automatically categorize therapist utterances from approximately 90 000 hours of internet-enabled cognitive behavior therapy (CBT). Increased quantities of CBT change methods were positively associated with reliable improvement in patient symptoms, and the quantity of nontherapy-related content showed a negative association.

    Meaning  The findings support the key principles underlying CBT as a treatment and demonstrate that applying deep learning to large clinical data sets can provide valuable insights into the effectiveness of psychotherapy.

    Abstract

    Importance  Compared with the treatment of physical conditions, the quality of care of mental health disorders remains poor and the rate of improvement in treatment is slow, a primary reason being the lack of objective and systematic methods for measuring the delivery of psychotherapy.

    Objective  To use a deep learning model applied to a large-scale clinical data set of cognitive behavioral therapy (CBT) session transcripts to generate a quantifiable measure of treatment delivered and to determine the association between the quantity of each aspect of therapy delivered and clinical outcomes.

    Design, Setting, and Participants  All data were obtained from patients receiving internet-enabled CBT for the treatment of a mental health disorder between June 2012 and March 2018 in England. Cognitive behavioral therapy was delivered in a secure online therapy room via instant synchronous messaging. The initial sample comprised a total of 17 572 patients (90 934 therapy session transcripts). Patients self-referred or were referred by a primary health care worker directly to the service.

    Exposures  All patients received National Institute for Heath and Care Excellence–approved disorder-specific CBT treatment protocols delivered by a qualified CBT therapist.

    Main Outcomes and Measures  Clinical outcomes were measured in terms of reliable improvement in patient symptoms and treatment engagement. Reliable improvement was calculated based on 2 severity measures: Patient Health Questionnaire (PHQ-9)21 and Generalized Anxiety Disorder 7-item scale (GAD-7),22 corresponding to depressive and anxiety symptoms respectively, completed by the patient at initial assessment and before every therapy session (see eMethods in the Supplement for details).

    Results  Treatment sessions from a total of 14 899 patients (10 882 women) aged between 18 and 94 years (median age, 34.8 years) were included in the final analysis. We trained a deep learning model to automatically categorize therapist utterances into 1 or more of 24 feature categories. The trained model was applied to our data set to obtain quantifiable measures of each feature of treatment delivered. A logistic regression revealed that increased quantities of a number of session features, including change methods (cognitive and behavioral techniques used in CBT), were associated with greater odds of reliable improvement in patient symptoms (odds ratio, 1.11; 95% CI, 1.06-1.17) and patient engagement (odds ratio, 1.20, 95% CI = 1.12-1.27). The quantity of nontherapy-related content was associated with reduced odds of symptom improvement (odds ratio, 0.89; 95% CI, 0.85-0.92) and patient engagement (odds ratio, 0.88, 95% CI, 0.84-0.92).

    Conclusions and Relevance  This work demonstrates an association between clinical outcomes in psychotherapy and the content of therapist utterances. These findings support the principle that CBT change methods help produce improvements in patients’ presenting symptoms. The application of deep learning to large clinical data sets can provide valuable insights into psychotherapy, informing the development of new treatments and helping standardize clinical practice.

    Introduction

    Compared with treatment of physical conditions, the quality of care of mental health disorders remains poor, and the rate of improvement in treatment is slow.1 Outcomes for many mental disorders have stagnated or even declined since the original treatments were developed.2,3 A primary reason for the gap in quality of care is the lack of systematic methods for measuring the delivery of psychotherapy.1 As with any evidence-based intervention, to be effective, treatment needs to be delivered as intended (also known as treatment integrity),4,5 which requires accurate measurement of treatment delivered.6 However, while it is relatively simple to monitor the delivery of most medical treatments (eg, the dosage of a prescribed drug), psychotherapeutic treatments are a series of private discussions between the patient and clinician. As such, monitoring the delivery of this type of treatment to the same extent as physical medicine would require infrastructure and resources beyond the scope of most health care systems.

    The National Institute for Heath and Care Excellence and the American Psychological Association recommend cognitive behavioral therapy (CBT) as a treatment for most common mental health problems such as depression and anxiety-related disorders. Cognitive behavioral therapy refers to a class of psychotherapeutic interventions informed by the principle that mental disorders are maintained by cognitive and behavioral phenomena and that modifying these maintaining factors helps produce enduring improvements in patients’ presenting symptoms.7,8 Despite its widespread use, the Improving Access to Psychological Therapies (IAPT) program in England includes no objective measure of treatment integrity for CBT, and it has been proposed that only 3.5% of psychotherapy randomized clinical trials use adequate treatment integrity procedures.9

    Understanding how CBT works is of particular interest given that the relative effects of different psychotherapeutic interventions appear similar.10 Thus, whether treatments work through specific factors (eg, CBT change methods) or factors common to most psychotherapies (eg, therapeutic alliance) remains a core issue in the field.11,12 Studies commonly use observational coding methods (eg, ratings/transcription of recorded therapeutic conversations) to investigate the association between treatment delivered and outcomes.5 Owing to the resource-intensive nature of this method, studies typically focus on a small number of therapeutic components in a relatively small sample of patients. As with many randomized clinical trials, the results of such interventions are difficult to transfer to real-world psychotherapy13 and require sample sizes larger than typically used.14 To determine the most effective components of CBT and whether CBT works via the mechanisms proposed by the approach,15 quantifiable measures of treatment delivered need to be obtained in a natural clinical context and be gathered from a sufficiently large enough sample to draw meaningful conclusions.

    Here, we used a large-scale data set containing session transcripts from more than 14 000 patients receiving internet-enabled CBT (IECBT) (approximately 90 000 hours of therapy). In IECBT, a patient communicates with a qualified CBT therapist using a real-time text-based message system. Internet-enabled CBT has been shown to be clinically effective for the treatment of depression16 and is currently deployed within IAPT. Using a deep learning approach, we developed a model to automatically categorize therapist utterances according to the role that they play in therapy, generating a quantifiable measure of treatment delivered. We then investigated the association between the quantity of each aspect of therapy delivered and clinical outcomes.

    Methods
    Design

    Data were obtained from patients receiving IECBT for the treatment of a mental health disorder between June 2012 and March 2018. Internet-enabled CBT was delivered using a commercial package currently used in the English National Health Service, provided by Ieso Digital Health (https://www.iesohealth.com/), following internationally recognized standards for information security (ISO 27001; https://www.iesohealth.com/en-gb/legal/iso-certificates). The National Institute for Heath and Care Excellence approved disorder-specific CBT treatment protocols,17 based on Roth and Pilling CBT competences framework,18 were delivered in a secure online therapy room via instant synchronous messaging by a British Association for Behavioral and Cognitive Psychotherapies–accredited CBT therapist (see eFigure 1 in the Supplement for a realistic example of a therapy conversation). Patients self-referred or were referred by a primary health care worker directly to the service.

    The IAPT program is a large-scale initiative aimed at increasing access to evidence-based psychological therapy for common mental health disorders within the English National Health Service.19 The information captured through IAPT’s minimum data set is intended to support monitoring of implementation and effectiveness of national policy/legislation, performance analysis and benchmarking, and national audit of IAPT services. As determined by the National Health Service, and per The National Institute of Health and Care Excellence principles,20 clinical audit studies within the IAPT framework do not require additional patient consent or ethical approval.20 When registering to use the Ieso service, patients provide written informed consent as part of a privacy policy agreement, allowing the service to use their anonymized data for audit purposes and to support research, including academic publications.

    Clinical Outcomes

    Clinical outcomes were defined according to IAPT guidelines19 and were measured in terms of reliable improvement and IAPT engagement and included as binary measures (ie, 0 or 1). A patient was classed as engaged if they attended 2 or more treatment sessions. Reliable improvement was calculated based on 2 severity measures: Patient Health Questionnaire (PHQ-9)21 and Generalized Anxiety Disorder 7-item scale (GAD-7),22 corresponding to depressive and anxiety symptoms respectively, completed by the patient at initial assessment and before every therapy session (see eMethods in the Supplement for details).

    Therapy Feature Categories

    We defined a total of 24 feature categories (Box), informed by the CBT competences framework18 and the Revised Cognitive Therapy Scale.23 A research psychologist (M.P.E.) annotated 290 therapy session transcripts, under the guidance of a qualified clinical therapist (S.B.), tagging each therapist text-message utterance as belonging to 1 (or more) of 19 features, with 5 features tagged using regular expressions (see eTable 1 in the Supplement for a full description). A deep learning model (see eMethods in the Supplement) was trained on the annotated utterances and then used to automatically classify all utterances in the full data set into 1 or more of 24 feature categories. Model accuracy is detailed in eTable 2 in the Supplement. To obtain a measure of interrater agreement, a second psychologist (S.B.) annotated a subsample of the transcripts. The interrater reliability was κ = 0.54 (a value of 0.4-0.6 is considered moderate agreement, with zero equaling chance agreement24; eTable 3 in the Supplement).

    Box Section Ref ID
    Box.

    Feature Categories Used in Transcript Annotation

    Therapy Feature Categories
    • Hello

    • Mood check

    • Obtain update

    • Bridge

    • Risk check

    • Set agenda

    • Review homework

    • Set goals

    • Formulation

    • Change methods

    • Perceptions of change

    • Setting homework

    • Planning for the future

    • Elicit feedback

    • Summarize session

    • Give feedback

    • Arrange next session

    • Goodbye

    • Socratic questioninga

    • Therapeutic thanksa

    • Therapeutic empathya

    • Therapeutic praisea

    • Collaborationa

    • Other

    a Features tagged using regular expressions.

    Statistical Analysis

    Using the output of the model, the mean number of words for each feature, averaged across all sessions, was calculated for each case. The final treatment session was excluded because outcome measures are taken prior to the commencement of each treatment session. The initial sample comprised a total of 90 934 session transcripts taken from 17 572 patients, with a reliable improvement rate of 63.4% and IAPT engagement rate of 87.3%.

    All analyses were performed in R (the R Foundation). Cases with missing start or end PHQ-9 or GAD-7 scores (n = 1338) were excluded from the analysis. We performed 3 multivariable logistic regression analyses. First, a multivariable logistic regression was performed to investigate the association between session features and reliable improvement. Predictor variables were the mean number of words for each feature across sessions plus patient demographics: starting PHQ-9 and GAD-7 scores, sex (male, female, or unstated/unknown), age, whether the patient had a long-term physical condition (yes, no, or unstated/unknown), and whether the patient was taking psychotropic medication at the start of treatment (prescribed not taking, prescribed taking, not prescribed, or unstated/unknown). The number of sessions completed and the mean duration of sessions were also included. Cases with a mean of fewer than 50 patient words were excluded (n = 16), leaving a total of 13 073 patients (at a clinical caseness threshold and engaged in treatment) in the analysis.

    We also investigated the association between first-session features and IAPT engagement. Predictor variables were the number of each therapy feature in the first session, patient demographics, and duration of first session. Sessions with a total of fewer than 50 patient words were excluded (n = 121) making a total of 14 899 patients, at caseness.

    Details of a logistic regression analysis investigating the association between first-session features and outcomes can be found in eResults and eTable 5 in the Supplement. Details of diagnoses for patients included in the analysis can be found in eTable 4 in the Supplement. Patient demographic information is shown in Tables 1 and 2 and eTable 5 in the Supplement.

    For all analyses, continuous predictor variables were scaled and centered to the mean. Statistical significance was defined as P less than .05 two-tailed, uncorrected. Multicollinearity analyses revealed that variance inflation factors were smaller than 2 for all predictor variables, confirming that regression models were not affected by the presence of multicollinearity.

    Results
    Factors Associated With Reliable Improvement Across Treatment

    Figure 1 shows the standardized odds ratios (ORs) for each therapy feature included in the multivariable logistic regression (Table 1). The results revealed increased quantities of “therapeutic praise” (OR, 1.21; 95% CI, 1.15-1.27), “planning for the future” (OR, 1.12; 95% CI, 1.06-1.19), “perceptions of change” (OR, 1.11; 95% CI, 1.06-1.16), “change methods” (OR, 1.11; 95% CI, 1.06-1.17), “set agenda” (OR, 1.08; 95% CI, 1.02-1.14), “elicit feedback” (OR, 1.06; 95% CI, 1.02-1.11), “give feedback” (OR, 1.05; 95% CI, 1.00-1.10), and “review homework” (OR, 1.04; 95% CI, 1.00-1.09) were all associated with greater odds of reliable improvement. By contrast, increases in nontherapy-related content (“other” [OR, 0.89; 95% CI, 0.85-0.92], “hello” [OR, 0.92; 95% CI, 0.88-0.96], and “goodbye” [OR, 0.95; 95% CI, 0.91-0.99]), along with “therapeutic empathy” (OR, 0.84; 95% CI, 0.81-0.88), “risk check” (OR, 0.85; 95% CI, 0.81-0.89), and “bridge” (OR, 0.95; 95% CI, 0.91-0.98) were negatively associated with improvement.

    Patient variables of starting GAD-7 score (OR, 1.29; 95% CI, 1.23-1.34), not being prescribed medication (OR, 1.23; 95% CI, 1.06-1.41), patient age (OR, 1.16; 95% CI, 1.12-1.22), and total number of treatment sessions (OR, 1.22; 95% CI, 1.17-1.27) were also associated with increased odds of improvement. Starting PHQ-9 score (OR, 0.95; 95% CI, 0.91-0.99), the presence of a long-term medical condition (OR, 0.72; 95% CI, 0.66-0.88), and longer session durations (OR, 0.95; 95% CI, 0.91-0.99) were associated with reduced odds of improvement.

    Factors Associated With IAPT Engagement in First Treatment Session

    Figure 2 shows the standardized ORs for each session feature included in the multivariable logistic regression (Table 2). We found that “change methods” (OR, 1.20; 95% CI. 1.12-1.27), “elicit feedback” (OR, 1.09; 95% CI. 1.03-1.16), “set homework” (OR, 1.09; 95% CI. 1.03-1.16),” arrange next session” (OR, 1.17; 95% CI. 1.10-1.24), “therapeutic thanks” (OR, 1.13; 95% CI. 1.06-1.20), and “formulation” (OR, 1.10; 95% CI, 1.04-1.17) were associated with increased odds of IAPT engagement. By contrast, nontherapy-related content (“other” and “hello”) showed a negative association (“other” OR, 0.88; 95% CI, 0.84-0.92; “hello” OR, 0.93; 95% CI, 0.88-0.99), as did “therapeutic empathy” (OR, 0.93; 95% CI, 0.88-0.97), “Socratic questioning” (OR, 0.94; 95% CI, 0.89-0.99), “bridge” (OR, 0.94; 95% CI, 0.90-0.98), and “planning for the future” (OR, 0.93; 95% CI, 0.89-0.96). Patient age (OR, 1.07; CI, 1.02-1.13), not being prescribed medication (OR, 1.21; 95% CI, 1.02-1.44), being prescribed and taking medication (OR, 1.20; 95% CI, 1.01-1.47), and duration of the first session (OR, 1.26; CI, 1.20-1.33) were positively associated with IAPT engagement, while starting PHQ-9 score (OR, 0.87; CI, 0.82-0.92) was negatively associated.

    Discussion

    Improving the quality and efficacy of psychotherapy requires that treatment be delivered as intended; however, monitoring and measuring treatment delivered presents a substantial challenge. We developed a method of objectively quantifying psychotherapy using a deep learning approach to automatically categorize therapist utterances from approximately 90 000 hours of IECBT. We find that factors specific to CBT, as well as factors common to most psychotherapies, are associated with increased odds of reliable improvement in patient symptoms.

    The results revealed a positive association between the quantity of CBT change method-related content and both reliable improvement and IAPT engagement. This finding supports the key principles underlying CBT and provides validation for CBT as a treatment (ie, modifying cognitive and behavioral factors produces improvements in patient symptoms). Here, the category of “change methods” included any example of cognitive or behavioral reattribution, skill-teaching, conceptualization, or psychoeducation. Thus, further research is needed to determine the association between different types of change method and outcomes.15

    Homework in CBT is used to help patients practice skills learned in therapy and generalize these skills to the real world.25 Increased content related to reviewing homework was positively associated with symptom improvement, while setting homework in the first session was associated with increased engagement. It is unclear whether an increase in reviewing homework plays a causal role in symptom change or whether it reflects a patient who has completed homework; however, these findings accord with evidence that out-of-session homework is important in determining outcomes in CBT.26 The results show that agenda setting is also positively associated with reliable improvement. Agenda setting involves the therapist and patient deciding on the topics to be discussed during the session. However, we are unable to determine whether the agenda was adhered to in the session. The results also support the principle that giving and eliciting feedback helps both the therapist and patient develop a greater understanding of key issues and possibly strengthens the therapeutic alliance.27

    Session content related to planning for the future after therapy and discussing perceptions of change was also positively associated with improvement. A discussion of perceptions of change is only likely to occur following some degree of change; similarly, planning for a future most likely occurs when patients are close to completing treatment and/or have moved toward improvement. As such, the increased occurrence of both features is likely to be reflective of treatment progressing well. Consistent with this, neither feature was significantly associated with outcomes in the first treatment session (eTable 5 in the Supplement). By contrast, goal setting in the first session was positively associated with improvement, supporting the goal-directed nature of CBT.27 Content associated with formulation (ie, the beliefs and behavioral strategies that characterize a disorder)28 in the first session also showed a positive association with IAPT engagement (and a borderline significant association with improvement), suggesting that placing patients’ experiences within a cognitive behavioral framework early in therapy is beneficial.

    Several features were found to be negatively associated with outcomes, in particular nontherapy-related content. Content that did not fall within any of the other 23 categories (“other”) includes utterances related to technical/practical matters or nontherapeutic advice/conversations. While greetings and goodbyes are essential to the structure of a therapy session, our results indicate that, when aggregated across sessions, an excessive or disproportionate amount of time spent on such nontherapeutic aspects may reduce the quantity of active intervention. Importantly, this suggests that rather than the quantity of conversation, it is the therapeutic nature of conversation and/or the dosage of therapy delivered in a session that is associated with improvement in patient symptoms.

    Risk checking also showed a strong negative association with reliable improvement. We believe this is likely to be reflective of patients with more complex problems who report more thoughts of self-harm. The quantity of risk checks will increase if a patient confirms that they feel at risk; thus, it is important to recognize that increased risk-checking content is essential and unavoidable. An extended period focused on risk is also likely to cause a deviation in the structure of the session and a subsequent reduction in the dosage of active therapy delivered.

    A central issue in psychotherapy research is whether different approaches work through specific factors or factors that are common to most psychotherapies. Here, we find a positive association between improvement and/or IAPT engagement for each of 6 techniques identified as distinguishing CBT from psychodynamic therapy.11 Common factors, such as therapeutic alliance, are thought to play a role in all psychotherapeutic treatments29 and show a moderate association with outcomes.30 Here, we found that “therapeutic praise” was positively associated with improvement, whereas “therapeutic empathy” showed a negative association. Rather than playing a causal role in outcomes, we believe increased empathy is likely to be indicative of a patient reporting a greater number of problems. Similarly, increased praise may be reflective of a patient responding well to treatment. Further research is required to determine the causal association between therapeutic alliance and outcomes, although previous work indicates therapeutic alliance may be reflective of a change in symptoms.31

    We also investigated the association between patient variables and outcomes. Patient age (older patients showing better outcomes), absence of a long-term medical condition, not being prescribed psychotropic medication, and severity of anxiety symptoms were all positively associated with reliable improvement. By contrast, severity of depressive symptoms, the presence of a long-term medical condition, and being prescribed psychotropic medication were negatively associated. These results accord with previous work investigating treatment outcomes in a sample of approximately 3000 patients receiving IECBT.32 Both studies report a positive association between GAD-7 scores and reliable improvement. Further work is needed to determine whether this reflects a greater association of CBT with short-term symptoms of anxiety and/or whether this effect may be specific to IECBT.

    Limitations

    A limitation of our approach is that it is not possible to determine whether a therapeutic feature is applied in an appropriate manner or whether a therapist adheres to the CBT protocol. It should be noted that the model provides a measure of the association between features and outcomes across sessions rather than measuring the quality of an individual session. Thus, future work needs to build on this approach to generate a validated model of session quality/adherence, alongside further refinement of the annotation guidelines and pooling of annotations. In addition, the model does not assess how the treatment was received by patients. To partly address this, we are currently developing procedures to quantify patient utterances, enabling us to determine, for example, how use of change methods are associated with a change in patient’s cognitions and whether therapeutic empathy is positively associated with outcomes after adjusting for the number of problems expressed by the patient.

    We emphasize that our results only reveal the presence of an association between therapy content and outcomes, although some aspects of therapy (eg, change methods) are typically initiated by the therapist and appear likely to play a causal role. Further work is needed to determine the casual relationship between therapy features and outcomes by focusing on the temporal association between content and symptom change. Given the limited outcomes measures available, we are also unable to address the association between therapy content and long-term improvements in symptoms. In addition, other patient factors not included are likely to play a role in determining outcomes. Finally, it should be noted that for large data sets, the ORs and confidence intervals should be considered more informative of the clinical importance of a feature than statistical significance alone.

    Conclusions

    At present, the detailed monitoring of therapist performance requires expensive and time-consuming procedures. We believe that this work represents a first step toward a practicable approach for quality controlled behavioral health care. Such monitoring could help arrest therapist drift, ie, the failure to deliver treatments a therapist has been trained to deliver, which may be one of the biggest factors contributing to poor delivery of treatment.33 Monitoring may help reverse the lower improvement rates observed in more experienced therapists.34 We note that while a typical IAPT therapist may accrue substantial experience throughout a career (approximately 30 000 therapy hours), this data set represents an accumulation of knowledge from more than 90 000 hours of CBT. Deep leaning allows us to extract this knowledge to provide valuable insights into therapy that were previously unavailable to an individual therapist. As such, we believe this approach represents an important step in developing a data-driven understanding of mental health treatment and in improving the efficacy of psychotherapy.

    Back to top
    Article Information

    Corresponding Author: Michael P. Ewbank, PhD, Clinical Science Laboratory, Ieso Digital Health, Cowley Road, The Jeffreys Building, Milton, Cambridge CB4 0DS, England (m.ewbank@iesohealth.com).

    Accepted for Publication: July 18, 2019.

    Published Online: August 22, 2019. doi:10.1001/jamapsychiatry.2019.2664

    Open Access: This is an open access article distributed under the terms of the CC-BY-NC-ND License. © 2019 Ewbank MP et al. JAMA Psychiatry.

    Author Contributions: Dr Ewbank had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. Mr Martin created the utterance annotation software. Dr Cummins led the deep learning modeling.

    Concept and design: All authors.

    Acquisition, analysis, or interpretation of data: Ewbank, Cummins, Catarino, Martin, Blackwell.

    Drafting of the manuscript: Ewbank.

    Critical revision of the manuscript for important intellectual content: Cummins, Tablan, Bateup, Catarino, Martin, Blackwell.

    Statistical analysis: Ewbank, Cummins.

    Obtained funding: Tablan, Blackwell.

    Administrative, technical, or material support: Cummins, Tablan, Bateup, Catarino, Martin, Blackwell.

    Supervision: Tablan, Catarino, Blackwell.

    Conflict of Interest Disclosures: All authors are employees of Ieso Digital Health. All authors report a patent to Methods and Systems for Improved Therapy Delivery and Monitoring pending.

    Funding/Support: This study was funded by Ieso Digital Health.

    Role of the Funder/Sponsor: As employees of Ieso Digital Health, the authors were responsible for the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

    Meeting Presentation: This paper was presented at the World Congress of Psychiatry; August 22, 2019; Lisbon, Portugal.

    References
    1.
    Kilbourne  AM, Beck  K, Spaeth-Rublee  B,  et al.  Measuring and improving the quality of mental health care: a global perspective.  World Psychiatry. 2018;17(1):30-38. doi:10.1002/wps.20482PubMedGoogle Scholar
    2.
    Johnsen  TJ, Friborg  O.  The effects of cognitive behavioral therapy as an anti-depressive treatment is falling: a meta-analysis.  Psychol Bull. 2015;141(4):747-768. doi:10.1037/bul0000015PubMedGoogle Scholar
    3.
    Holmes  EA, Ghaderi  A, Harmer  CJ,  et al.  The Lancet Psychiatry Commission on psychological treatments research in tomorrow’s science.  Lancet Psychiatry. 2018;5(3):237-286. doi:10.1016/S2215-0366(17)30513-8PubMedGoogle Scholar
    4.
    Waller  G, Turner  H.  Therapist drift redux: why well-meaning clinicians fail to deliver evidence-based therapy, and how to get back on track.  Behav Res Ther. 2016;77:129-137. doi:10.1016/j.brat.2015.12.005PubMedGoogle Scholar
    5.
    Perepletchikova  F.  On the topic of treatment integrity.  Clin Psychol (New York). 2011;18(2):148-153. doi:10.1111/j.1468-2850.2011.01246.xPubMedGoogle Scholar
    6.
    Essock  SM, Covell  NH, Weissman  EM.  Inside the black box: the importance of monitoring treatment implementation.  Schizophr Bull. 2004;30(3):613-615. doi:10.1093/oxfordjournals.schbul.a007107PubMedGoogle Scholar
    7.
    Beck  AT.  Cognitive therapy: nature and relation to behavior therapy.  Behav Ther. 1970;1:184-200. doi:10.1016/S0005-7894(70)80030-2PubMedGoogle Scholar
    8.
    Ellis  A.  Reason and Emotion in Psychotherapy. New York: Lyle Stuart; 1962.
    9.
    Perepletchikova  F, Treat  TA, Kazdin  AE.  Treatment integrity in psychotherapy research: analysis of the studies and examination of the associated factors.  J Consult Clin Psychol. 2007;75(6):829-841. doi:10.1037/0022-006X.75.6.829PubMedGoogle Scholar
    10.
    Barth  J, Munder  T, Gerger  H,  et al.  Comparative efficacy of seven psychotherapeutic interventions for patients with depression: a network meta-analysis.  PLoS Med. 2013;10(5):e1001454. doi:10.1371/journal.pmed.1001454PubMedGoogle Scholar
    11.
    Blagys  MD, Hilsenroth  MJ.  Distinctive activities of cognitive-behavioral therapy: a review of the comparative psychotherapy process literature.  Clin Psychol Rev. 2002;22(5):671-706. doi:10.1016/S0272-7358(01)00117-9PubMedGoogle Scholar
    12.
    Mulder  R, Murray  G, Rucklidge  J.  Common versus specific factors in psychotherapy: opening the black box.  Lancet Psychiatry. 2017;4(12):953-962. doi:10.1016/S2215-0366(17)30100-1PubMedGoogle Scholar
    13.
    Doss  BD, Atkins  DC.  Investigating treatment mediators when simple random assignment to a control group is not possible.  Clin Psychol Sci Pract. 2006;13:321-336. doi:10.1111/j.1468-2850.2006.00045.xGoogle Scholar
    14.
    Bell  EC, Marcus  DK, Goodlad  JK.  Are the parts as good as the whole? a meta-analysis of component treatment studies.  J Consult Clin Psychol. 2013;81(4):722-736. doi:10.1037/a0033004PubMedGoogle Scholar
    15.
    Lorenzo-Luaces  L, German  RE, DeRubeis  RJ.  It’s complicated: the relation between cognitive change procedures, cognitive change, and symptom change in cognitive therapy for depression.  Clin Psychol Rev. 2015;41:3-15. doi:10.1016/j.cpr.2014.12.003PubMedGoogle Scholar
    16.
    Kessler  D, Lewis  G, Kaur  S,  et al.  Therapist-delivered internet psychotherapy for depression in primary care: a randomised controlled trial.  Lancet. 2009;374(9690):628-634. doi:10.1016/S0140-6736(09)61257-5PubMedGoogle Scholar
    17.
    National Institute for Health and Clinical Excellence. Common mental health disorders: the NICE guideline on identification and pathways to care. London, England: RCPsych Publication; 2011.
    18.
    Roth  AD, Pilling  S.  Using an evidence-based methodology to identify the competences required to deliver effective cognitive and behavioural therapy for depression and anxiety disorders..  Behav Cogn Psychother. 2008;36(February):129-147. doi:10.1017/S1352465808004141Google Scholar
    19.
    Clark  DM.  Implementing NICE guidelines for the psychological treatment of depression and anxiety disorders: the IAPT experience.  Int Rev Psychiatry. 2011;23(4):318-327. doi:10.3109/09540261.2011.606803PubMedGoogle Scholar
    20.
    N.I.C.E. Principles for Best Practice in Clinical Audit. Radcliffe Medical Press; 2002. http://www.uhbristol.nhs.uk/files/nhs-ubht/best_practice_clinical_audit.pdf. Accessed January 31, 2019.
    21.
    Kroenke  K, Spitzer  RL, Williams  JB.  The PHQ-9: validity of a brief depression severity measure.  J Gen Intern Med. 2001;16(9):606-613. doi:10.1046/j.1525-1497.2001.016009606.xPubMedGoogle Scholar
    22.
    Spitzer  RL, Kroenke  K, Williams  JB, Löwe  B.  A brief measure for assessing generalized anxiety disorder: the GAD-7.  Arch Intern Med. 2006;166(10):1092-1097. doi:10.1001/archinte.166.10.1092PubMedGoogle Scholar
    23.
    Blackburn  I, James  IA, Milne  DL,  et al.  The revised cognitive therapy scale (CTS-R): psychometric properties.  Behav Cogn Psychother. 2001;29:431-446. doi:10.1017/S1352465801004040Google Scholar
    24.
    Landis  JR, Koch  GG.  The measurement of observer agreement for categorical data.  Biometrics. 1977;33(1):159-174. doi:10.2307/2529310PubMedGoogle Scholar
    25.
    Alford  BA, Beck  AT.  Cognitive therapy of delusional beliefs.  Behav Res Ther. 1994;32(3):369-380. doi:10.1016/0005-7967(94)90134-1PubMedGoogle Scholar
    26.
    Kazantzis  N, Whittington  C, Zelencich  L, Kyrios  M, Norton  PJ, Hofmann  SG.  Quantity and quality of homework compliance: a meta-analysis of relations with outcome in cognitive behavior therapy.  Behav Ther. 2016;47(5):755-772. doi:10.1016/j.beth.2016.05.002PubMedGoogle Scholar
    27.
    Beck  JS.  Cognitive Behavior Therapy: Basics and Beyond. 2nd ed. New York, NY: Guilford Press; 2011.
    28.
    Alford  BA, Beck  AT.  The Integrative Power of Cognitive Therapy. New York, NY: Guilford Press; 1997. doi:10.1891/0889-8391.11.4.309
    29.
    Wampold  BE.  How important are the common factors in psychotherapy? an update.  World Psychiatry. 2015;14(3):270-277. doi:10.1002/wps.20238PubMedGoogle Scholar
    30.
    Martin  DJ, Garske  JP, Davis  MK.  Relation of the therapeutic alliance with outcome and other variables: a meta-analytic review.  J Consult Clin Psychol. 2000;68(3):438-450. doi:10.1037/0022-006X.68.3.438PubMedGoogle Scholar
    31.
    Strunk  DR, Brotman  MA, DeRubeis  RJ.  The process of change in cognitive therapy for depression: predictors of early inter-session symptom gains.  Behav Res Ther. 2010;48(7):599-606. doi:10.1016/j.brat.2010.03.011PubMedGoogle Scholar
    32.
    Catarino  A, Bateup  S, Tablan  V,  et al.  Demographic and clinical predictors of response to internet-enabled cognitive-behavioural therapy for depression and anxiety.  BJPsych Open. 2018;4(5):411-418. doi:10.1192/bjo.2018.57PubMedGoogle Scholar
    33.
    Waller  G.  Evidence-based treatment and therapist drift.  Behav Res Ther. 2009;47(2):119-127. doi:10.1016/j.brat.2008.10.018PubMedGoogle Scholar
    34.
    Shapiro  DA, Shapiro  D.  Meta-analysis of comparative therapy outcome studies: a replication and refinement.  Psychol Bull. 1982;92(3):581-604. doi:10.1037/0033-2909.92.3.581PubMedGoogle Scholar
    ×