HAM-D indicates Hamilton Rating Scale for Depression32,33; MDD, major depressive disorder; QIDS, Quick Inventory for Depressive Symptomatology27; and SCID, Structured Clinical Interview for DSM-IV Axis I Disorders.28
aPatients could be excluded for more than 1 reason.
Connolly Gibbons MB, Gallop R, Thompson D, et al. Comparative Effectiveness of Cognitive Therapy and Dynamic Psychotherapy for Major Depressive Disorder in a Community Mental Health Setting: A Randomized Clinical Noninferiority Trial. JAMA Psychiatry. 2016;73(9):904–912. doi:10.1001/jamapsychiatry.2016.1720
Is short-term dynamic psychotherapy not inferior to cognitive therapy in the treatment of major depressive disorder (MDD) in the community mental health setting?
In this randomized noninferiority trial that included 237 adults, short-term dynamic psychotherapy was statistically significantly noninferior to cognitive therapy in decreasing depressive symptoms among patients receiving services for MDD in the community mental health setting.
Short-term dynamic psychotherapy and cognitive therapy may be effective in treating MDD in the community.
Importance
Dynamic psychotherapy (DT) is widely practiced in the community, but few trials have established its effectiveness for specific mental health disorders relative to control conditions or other evidence-based psychotherapies.
Objective
To determine whether DT is not inferior to cognitive therapy (CT) in the treatment of major depressive disorder (MDD) in a community mental health setting.
Design, Setting, and Participants
From October 28, 2010, to July 2, 2014, outpatients with MDD were randomized to treatment delivered by trained therapists. Twenty therapists employed at a community mental health center in Pennsylvania were trained by experts in CT or DT. A total of 237 adult outpatients with MDD seeking services at this site were randomized to 16 sessions of DT or CT delivered across 5 months. Final assessment was completed on December 9, 2014, and data were analyzed from December 10, 2014, to January 14, 2016.
Interventions
Short-term DT or CT.
Main Outcomes and Measures
Expert blind evaluations with the 17-item Hamilton Rating Scale for Depression.
Results
Among the 237 patients (59 men [24.9%]; 178 women [75.1%]; mean [SD] age, 36.2 [12.1] years) treated by 20 therapists (19 women and 1 man; mean [SD] age, 40.0 [14.6] years), 118 were randomized to DT and 119 to CT. A mean (SD) difference between treatments was found in the change on the Hamilton Rating Scale for Depression of 0.86 (7.73) scale points (95% CI, −0.70 to 2.42; Cohen d, 0.11), indicating that DT was statistically not inferior to CT. A statistically significant main effect was found for time (F1,198 = 75.92; P = .001). No statistically significant differences were found between treatments on patient ratings of treatment credibility. Dynamic psychotherapy and CT were discriminated from each other on competence in supportive techniques (t120 = 2.48; P = .02), competence in expressive techniques (t120 = 4.78; P = .001), adherence to CT techniques (t115 = −7.07; P = .001), and competence in CT (t115 = −7.07; P = .001).
Conclusions and Relevance
This study suggests that DT is not inferior to CT on change in depression for the treatment of MDD in a community mental health setting. The 95% CI suggests that the effects of DT are equivalent to those of CT.
Trial Registration
clinicaltrials.gov Identifier: NCT01207271
The effectiveness of cognitive therapy (CT)1 for major depressive disorder (MDD) has been established in controlled efficacy trials2-5 and real-world effectiveness trials.6,7 However, substantial debate is ongoing as to whether short-term dynamic psychotherapy (DT), which targets an individual’s impairing relationship conflicts, has the research base to support its dissemination as an intervention for MDD. Although DT has been and is currently practiced worldwide,8,9 the research literature across mental disorders is flooded with reviews debating whether DT has adequate evidence of effectiveness.10-20 Despite the myriad of reviews debating this issue, few trials of DT specifically for MDD have met the strict design criteria detailed by Chambless and Hollon,21 and, to our knowledge, few attempts have been made to compare DT and CT directly.
Two trials involving the treatment of MDD22,23 have demonstrated that DT plus medication is superior to medication alone, and a pilot study24 demonstrated that DT was superior to treatment as usual for treating MDD in a community mental health setting. The largest study to date25 demonstrated that DT was statistically significantly noninferior to CT for treating MDD in an outpatient setting in the Netherlands.
We present herein the results of a randomized clinical noninferiority trial that directly compared CT with DT for treating MDD in a community mental health setting. Our study builds on the previous noninferiority trial25 by including a broad assessment of functioning and quality of life, blind independent ratings of treatment fidelity, and a community mental health sample. Details of the protocol are published.26 We developed and implemented our trial with a focus on internal validity, including (1) expert, individual, and group supervision; (2) blind fidelity ratings; and (3) blind expert assessments of the primary symptom outcome.
The primary hypothesis for this trial was that DT would not be inferior to CT for treating MDD in a community mental health setting as measured by blind expert ratings of depressive symptoms. Our secondary hypothesis was that DT would not be inferior to CT across broader assessments of symptoms, functioning, and quality of life.
This trial was conducted in collaboration with NHS Human Services, a private, nonprofit organization that provides mental health services across 7 Mid-Atlantic states, primarily to publicly funded consumers. The present study took place at a single outpatient community mental health center (CMHC) providing services to approximately 4900 individuals per year. Recruitment occurred from October 28, 2010, through July 2, 2014, and final assessment was completed on December 9, 2014. Patients in the study received gift cards worth $25 to $50 for each assessment and clinicians earned $300 for workshop attendance, $25 for supervision sessions, and $150 honorariums for every 2 patients treated.26 Study procedures were conducted in compliance with the institutional review board of the University of Pennsylvania, which approved this study. Participants provided written informed consent. The full trial protocol is available in the Supplement.
Patients were recruited from those seeking services for depression at the CMHC. The Quick Inventory for Depressive Symptomatology (QIDS)27 was completed by all adult patients attending an intake assessment. Patients aged 18 to 65 years who scored at least 11 on the QIDS underwent screening by telephone, and potentially eligible patients were scheduled for a baseline assessment. A research clinical evaluator conducted the Structured Clinical Interview for the DSM-IV Axis I Disorders.28 Patients who met criteria for MDD were included in the study if they did not have (1) a diagnosis of bipolar disorder; (2) current or past diagnosis of schizophrenia, psychosis, MDD with psychotic features, or seizure disorder; (3) depression due to an organic disease; (4) substance or alcohol abuse requiring immediate referral to substance abuse treatment; (5) referral to a partial hospitalization program; or (6) suicidal thoughts judged by the clinic to require more intensive psychotherapy.
Clinicians employed by NHS Human Services were recruited through advertisement. All clinicians had a master’s degree or above. Clinicians were matched to treatment based on previous training and education, theoretical orientation, and desire to be trained in a given treatment.
The DT consisted of supportive-expressive DT.29,30 The treatment includes techniques to build a positive working alliance and expressive techniques to help patients gain self-understanding of their repetitive maladaptive relationship patterns. The treatment actively explores current relationship conflicts and includes socialization to treatment and focus on interpersonal goals.
Standard CT1,31 consisting of structured sessions focusing on behavioral activation and the exploration of depressogenic beliefs was implemented. Interventions included activity scheduling, evaluating automatic thoughts, and behavioral experiments.
Training and supervision were provided by expert supervisors with substantial experience delivering and supervising the respective treatments. The DT supervisor (K.C.C.) had 20 years of clinical experience and the CT supervisor (J.J.) had 14 years at the time of study initiation. A training workshop was followed by intensive individual supervision across the first 3 training cases. Ongoing bimonthly group supervision was provided to clinicians across training and randomization phases. Supervisors listened to digital recordings of sessions to prepare for the individual and group supervision sessions.
Nine CT clinicians and 11 DT clinicians completed the workshop and training and treated at least 1 randomized patient. Further details on training of therapists are provided by Connolly Gibbons et al.26
All assessments were administered at baseline and months 1, 2, 4, and 5 at the CMHC. The Hamilton Rating Scale for Depression (HAM-D)32,33 was used to evaluate the severity of depression. Trained clinical evaluators administered the Structured Clinical Interview for the DSM-IV Axis I Disorders and HAM-D. These evaluators were not affiliated with the clinical site and were blind to treatment condition and study hypotheses. A diagnostic supervisor provided written feedback based on a random review of 10% of audiotaped interviews and conducted a monthly group conference call to maintain reliability. Secondary outcomes included the 24-item Behavior and Symptom Identification Scale (BASIS-24),34 the Quality of Life Inventory (QOLI),35 and the Medical Outcomes Study 36-item Short Form (SF-36)36 (mental and physical component scores).37,38
The Opinions About Treatment measure39 was administered after session 2 to assess patients’ perceptions of treatment credibility. Patients were informed at randomization of their treatment assignment but were not informed of the study hypotheses. Demographic characteristics, including race and ethnicity (assessed for descriptive purposes), were self-reported by patients at baseline into categories defined before the study by investigators.
Eligible patients were randomized in a parallel design with a 1:1 allocation to 16 sessions of DT or CT delivered across 5 months using a computer-generated urn randomization algorithm40-42 based on 7 pretreatment factors, including sex,43,44 long-term relationship status,43,45,46 minority status,44,47,48 expectations of improvement,49,50 depression severity,43,45,46,49,51 use of psychotropic medications,52,53 and recurrence of depression.49,54 The study statistician (R.G.), who had no contact with study participants, generated the random assignment and conveyed the assignment to the study research assistant at baseline. The research assistant then scheduled the patient for the first study therapy appointment. All patients were invited to complete assessments regardless of the number of treatment sessions attended. All appointments took place at the CMHC.
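The general idea behind urn randomization can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm the study statistician ran: the class name `UrnRandomizer`, the parameters `alpha` and `beta`, the pooling rule, and the two factor names in the example are all assumptions made for the sketch. Each factor level keeps its own urn; a draw is weighted by the pooled ball counts for the patient's levels, and after each assignment balls for the opposite arm are added so that later draws tilt toward balance.

```python
import random
from collections import defaultdict

ARMS = ("DT", "CT")


class UrnRandomizer:
    """Simplified multi-factor urn randomization (illustrative only)."""

    def __init__(self, alpha=1, beta=1, seed=None):
        self.alpha = alpha  # balls per arm in a fresh urn (assumed value)
        self.beta = beta    # balls added for the opposite arm after a draw
        self.rng = random.Random(seed)
        # One urn per (factor, level); each urn maps arm -> ball count.
        self.urns = defaultdict(lambda: {arm: self.alpha for arm in ARMS})

    def assign(self, profile):
        """Assign one patient; profile maps factor name -> level."""
        keys = list(profile.items())
        # Pool the ball counts of every urn that matches this patient.
        pooled = {arm: sum(self.urns[k][arm] for k in keys) for arm in ARMS}
        arm = self.rng.choices(ARMS, weights=[pooled[a] for a in ARMS])[0]
        other = "CT" if arm == "DT" else "DT"
        # Make the opposite arm more likely for similar future patients.
        for k in keys:
            self.urns[k][other] += self.beta
        return arm


# Hypothetical patients described by two of the stratification factors.
randomizer = UrnRandomizer(alpha=1, beta=1, seed=0)
patients = [
    {"sex": "F", "on_medication": True},
    {"sex": "M", "on_medication": True},
    {"sex": "F", "on_medication": False},
    {"sex": "F", "on_medication": True},
]
assignments = [randomizer.assign(p) for p in patients]
```

Because the draw stays random, allocation cannot be predicted from the previous assignments, while the added balls keep the arms from drifting far apart within any factor level.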
Measures of fidelity to DT were rated on 1 early session (usually session 3) of each DT case and in a random sample of 19 CT cases. Measures of fidelity to CT were rated on 1 early session of each CT case and in a sample of 20 DT cases. To benchmark our results, we rated the CT fidelity measures on 1 early session from 15 randomly selected cases that participated in an efficacy trial of CT.2
For DT, we used a community adaptation of the Penn Adherence/Competence Scale for Supportive Expressive Dynamic Psychotherapy.24,55 Adherence to CT was assessed using the CT subscale of the Collaborative Study Psychotherapy Rating Scale56; the Cognitive Therapy Scale57 was used to assess competence. A separate pool of 4 advanced graduate student judges was used to rate fidelity to each treatment in a balanced incomplete block design. All judges were blind to the research design, settings, and interventions used in each sample.
Data were analyzed from December 10, 2014, to January 14, 2016. We conducted hierarchical linear models comparing slopes across treatment groups and including all observed data across the monthly assessments. Time was defined as the log of the number of weeks from the baseline assessment. All patients randomized to treatment were included in the analyses regardless of the number of treatment or assessment sessions attended. The hierarchical linear models included random intercept and random slope terms, with an autoregressive structure used to model the residual errors. In the hierarchical linear models analysis, those with only a baseline value contribute to the estimate of the intercept as well as the variance component attributable to the random intercept. Our primary outcome was the model-based change from baseline to end point on the 17-item HAM-D total score. We selected an a priori noninferiority margin of a difference of 2.5 points on change in the HAM-D as the smallest clinically relevant change recommended by Montgomery58 and previously implemented by Szegedi and colleagues.59 We followed Hirotsu’s unifying approach60 to include a test of noninferiority followed by a subsequent test for treatment superiority only in the case for which noninferiority is not obtained. For this multiple decision process, the α level was set a priori at .025 to account for the 2 decisions. The noninferiority of the secondary measures was evaluated using an a priori defined margin of Cohen d effect size of 0.29, which represents a small to moderate effect.
Power calculations used the formula of Julious61 to guarantee a power of 80% for assessing noninferiority and superiority while accommodating the repeated-measures design.62,63 Included in the formula were the noninferiority bound of 2.5 HAM-D points defined a priori, a pooled SD set at 8.5, α set at .025, an attrition rate of 10%, repeated assessments, and an estimated within-subject correlation of 0.40. Sample size was determined to be 230 subjects.
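As a rough orientation, the normal-approximation sample size for a simple two-arm noninferiority comparison can be computed from the inputs listed above. The sketch below assumes a true difference of 0 between treatments and a single end-point comparison; the function name `noninferiority_n_per_arm` is illustrative. It does not model the repeated assessments, the within-subject correlation of 0.40, or the 10% attrition, so it is not expected to reproduce the reported total of 230.

```python
from math import ceil
from statistics import NormalDist


def noninferiority_n_per_arm(sd, margin, alpha, power, true_diff=0.0):
    """Per-arm n for a two-arm noninferiority test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-sided significance level
    z_beta = NormalDist().inv_cdf(power)
    return 2 * sd**2 * (z_alpha + z_beta) ** 2 / (margin - true_diff) ** 2


# Inputs taken from the text: pooled SD 8.5, margin 2.5, alpha .025, power 80%.
n_raw = noninferiority_n_per_arm(sd=8.5, margin=2.5, alpha=0.025, power=0.80)
n_per_arm = ceil(n_raw)  # about 182 per arm before any design adjustments
```

The unadjusted figure is larger than the trial's per-arm target because repeated assessments of the same patient add information that a single end-point comparison ignores.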
Recruitment occurred from October 28, 2010, through July 2, 2014. The QIDS was completed by 3951 outpatients at treatment intake (Figure). The clinic intake worker excluded 851 patients based on diagnosis of psychotic disorder or immediate referral to a more intensive treatment program. Of 1110 individuals screened by telephone, 529 (47.7%) were excluded for lack of interest, failure of telephone screen criteria, inability to contact, or nonattendance at the baseline assessment. Five hundred eighty-one baseline assessments were conducted, and 237 patients (40.8%) were randomized to treatment. Of the 118 patients randomized to DT, 103 (87.3%) attended at least 1 treatment session and 104 (88.1%) received at least 1 postbaseline assessment. Of the 119 patients randomized to CT, 99 (83.2%) attended at least 1 treatment session and 105 (88.2%) received at least 1 postbaseline assessment.
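The participant-flow counts above are internally consistent, which a short arithmetic check makes explicit. The variable names and the helper `pct` are illustrative; all counts come from the paragraph above.

```python
def pct(part, whole):
    """Percentage rounded to 1 decimal place, as reported in the text."""
    return round(100 * part / whole, 1)


screened_by_phone = 1110
excluded_after_screen = 529
baseline_assessments = screened_by_phone - excluded_after_screen  # 581
randomized = 237

arms = {
    "DT": {"randomized": 118, "attended": 103, "assessed": 104},
    "CT": {"randomized": 119, "attended": 99, "assessed": 105},
}

flow = {
    "excluded_pct": pct(excluded_after_screen, screened_by_phone),  # 47.7
    "randomized_pct": pct(randomized, baseline_assessments),        # 40.8
}
for arm, counts in arms.items():
    flow[arm + "_attended_pct"] = pct(counts["attended"], counts["randomized"])
    flow[arm + "_assessed_pct"] = pct(counts["assessed"], counts["randomized"])
```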
Baseline demographic characteristics are presented in Table 1. Fifty-nine patients (24.9%) were men; 178 (75.1%) were women (mean [SD] age, 36.2 [12.1] years). Most of the patients were single and not employed full-time and had a high school diploma or less. One hundred sixteen patients (48.9%) were members of a minority group. We found no statistically significant differences between treatment groups on any of the baseline demographic variables (all P > .08). Sixty-three patients (26.6%) attended 1 or fewer sessions of psychotherapy; 122 (51.5%), 5 or fewer sessions; and 187 (78.9%), 11 or fewer sessions. We found no statistically significant difference between treatments in the number of sessions attended (t235 = 1.47; P = .14). Representative of the CMHC setting, 210 patients (88.6%) had a concurrent Axis I diagnosis; 166 (70.0%), a concurrent anxiety diagnosis; and 133 (56.1%), a concurrent alcohol or substance use diagnosis. Baseline demographics for clinicians are presented in Table 2. Of the 20 clinicians, 19 were women and 12 were white, with a mean (SD) age of 40.0 (14.6) years.
Patients rated both treatments with high credibility. Two-tailed tests for paired samples indicated no differences between treatments on ratings of treatment sensibility (t162 = 0.19; P = .85), confidence in treatment (t162 = −1.14; P = .26), or confidence recommending the treatment (t162 = −0.86; P = .39).
For the primary outcome measure, we found a mean (SD) difference in change on the HAM-D of 0.86 (7.73) scale points (Cohen d = 0.11) between CT and DT. With α set at .025, the upper bound of the 95% CI for this value is 2.42 HAM-D points. The 95% CI upper bound of 2.42 is less than our a priori noninferiority margin of 2.5 points, indicating that change in depressive symptoms for the DT group is statistically not inferior to the amount of change in depressive symptoms observed in the CT condition (Table 3). Evaluation of the 95% CI suggests that DT is equivalent to CT on change in depression. We found a statistically significant main effect for time (F1,198 = 75.92; P = .001; Cohen d = 0.55 within DT and Cohen d = 0.65 within CT on change from baseline to end point). We found no statistically significant interaction between the use of psychotropic medication and treatment group on the rate of change in the HAM-D (F1,209 = 0.12; P = .73).
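The noninferiority reading of the primary outcome reduces to a few comparisons, sketched below with the reported values. Two points are assumptions of the sketch rather than statements from the trial: that the reported Cohen d is the mean difference divided by the reported SD (the figures are consistent with this), and that "equivalent" means both CI bounds fall inside a symmetric margin of ±2.5 points.

```python
# Reported values for change on the 17-item HAM-D (CT vs DT).
mean_diff = 0.86
sd = 7.73
ci_low, ci_high = -0.70, 2.42
margin = 2.5  # a priori noninferiority margin in HAM-D points

cohen_d = mean_diff / sd              # about 0.11
midpoint = (ci_low + ci_high) / 2     # the CI is centered on the mean difference
half_width = (ci_high - ci_low) / 2   # 1.56 HAM-D points

# Noninferiority: the upper bound must stay below the margin.
noninferior = ci_high < margin

# Equivalence under the assumed symmetric margin: both bounds inside.
within_equivalence = -margin < ci_low and ci_high < margin
```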
Despite small observed effect size differences between the treatments, we cannot conclude that DT was statistically noninferior to CT on change on the BASIS-24 (Cohen d = 0.14; 95% CI upper bound, 0.35), the QOLI total score (Cohen d = 0.22; 95% CI upper bound, 0.43), or the SF-36 Mental Component score (MCS) (Cohen d = 0.15; 95% CI upper bound, 0.36) (Table 3). We found a statistically significant main effect for time on the BASIS-24 (F1,192 = 133.32; P = .001), the QOLI (F1,188 = 44.55; P = .001), and the SF-36 MCS (F1,205 = 60.52; P = .001). Superiority of CT over DT was not demonstrated for change on the BASIS-24 (F1,192 = 1.07; P = .30), the QOLI (F1,188 = 4.18; P = .04), or the SF-36 MCS (F1,205 = 0.049; P = .48). Dynamic psychotherapy was significantly noninferior to CT on the SF-36 Physical Component score (PCS) (Cohen d = −0.07; 95% CI upper bound, 0.14; P = .03); however, both treatments demonstrated significant (but slight) deterioration across time (F1,207 = 5.19; P = .02). Nineteen patients (16.1%) in DT and 26 patients (21.8%) in the CT condition demonstrated response to treatment as measured by a 50% reduction on the HAM-D score across treatment (χ21 = 1.27; P = .32).
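The two-step decision rule described in the statistical analysis (test noninferiority first, then test superiority only when noninferiority is not obtained) can be applied to the secondary outcomes as a small sketch. The function name `decide` and the result labels are illustrative; the CI upper bounds, superiority P values, margin of 0.29, and α of .025 are taken from the text. No superiority test is reported for the SF-36 PCS, so that entry is left as `None`.

```python
ALPHA = 0.025    # a priori level for the two-decision procedure
MARGIN_D = 0.29  # noninferiority margin in Cohen d units

# Reported 95% CI upper bounds and superiority-test P values.
secondary = {
    "BASIS-24": {"ci_upper": 0.35, "superiority_p": 0.30},
    "QOLI": {"ci_upper": 0.43, "superiority_p": 0.04},
    "SF-36 MCS": {"ci_upper": 0.36, "superiority_p": 0.48},
    "SF-36 PCS": {"ci_upper": 0.14, "superiority_p": None},
}


def decide(ci_upper, superiority_p, margin=MARGIN_D, alpha=ALPHA):
    """Noninferiority first; superiority only if noninferiority fails."""
    if ci_upper < margin:
        return "noninferior"
    if superiority_p is not None and superiority_p < alpha:
        return "CT superior"
    return "inconclusive"


decisions = {name: decide(**vals) for name, vals in secondary.items()}
```

With these inputs only the SF-36 PCS meets the noninferiority criterion; the other three measures are inconclusive because their upper bounds exceed 0.29 while no superiority test reaches the .025 level.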
Adherence to supportive techniques was not rated significantly higher in DT compared with CT (t120 = −0.38; P = .70) (Table 4). However, competence in the use of psychodynamic supportive techniques (t120 = 2.48; P = .02), adherence to expressive techniques (t120 = 3.89; P = .001), and competence in expressive techniques (t120 = 4.78; P = .001) were rated significantly higher in DT compared with CT. Adherence to CT techniques (t115 = −7.07; P = .001) and CT concrete techniques64 (t115 = −7.04; P = .001) were rated significantly higher in CT compared with DT, but neither adherence to CT techniques (t110 = −0.55; P = .58) nor adherence to CT concrete techniques (t110 = −1.42; P = .16) was rated significantly different from the CT efficacy sample. Competence in CT techniques was rated as significantly higher in CT compared with DT (t115 = −7.07; P = .001) but was not statistically significantly different from the CT efficacy sample (t110 = −1.21; P = .23).
Five of the 118 patients randomized to DT and 10 of the 119 patients randomized to CT experienced at least 1 serious adverse event (χ21 = 1.73; P = .19). Most serious adverse events included nonpsychiatric hospitalizations. None were judged to be related to study procedures or intervention.
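The adverse-event comparison can be reproduced from the counts alone. The sketch below assumes a Pearson χ2 test on the 2 × 2 table without a continuity correction, which matches the reported statistic; the function name `chi_square_2x2` is illustrative.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p = erfc(sqrt(chi2 / 2))
    return chi2, p


# Rows: DT, CT. Columns: at least 1 serious adverse event, none.
chi2, p = chi_square_2x2(5, 118 - 5, 10, 119 - 10)
# chi2 is about 1.73 and p about .19, in line with the reported values.
```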
The trial results indicate that short-term DT is not inferior to CT in decreasing depressive symptoms among patients receiving services for MDD in the community mental health setting. This investigation adds to the emerging literature of randomized clinical trials22-25 indicating that short-term DT is another efficacious intervention, in addition to CT, for treating MDD. We were able to discriminate DT and CT from each other on adherence and competence ratings with large effect sizes. Adherence and competence ratings for our CMHC CT group were not significantly different from those observed with expert therapists in efficacy trials.
Our secondary analyses examining noninferiority of DT compared with CT on measures of self-reported depression, functioning, and quality of life were largely inconclusive. Across these 4 secondary measures, the mean effect size for differences between treatment conditions on change across treatment was 0.11, indicating that no clinically meaningful advantage existed for CT. When this protocol was designed, we had no data on which to base the noninferiority margin for these secondary measures, and statistical power was set only for testing noninferiority on the primary outcome measure. Our obtained data demonstrated large variation on these measures in this setting, resulting in the 95% CIs for extremely small observed effects extending beyond the 0.29 margin.
Limitations of this study include the lack of a control condition, missing monthly outcome data for some patients, and the lack of a follow-up assessment. Further, these results may generalize only to the community mental health setting. Our results do, however, replicate a large randomized clinical noninferiority trial conducted in a general outpatient setting,25 suggesting that DT is not inferior to CT among patients receiving services in a broad range of community settings. These treatments were not delivered with the same intensity as in efficacy trials; however, our results represent the comparative effectiveness of these treatments in a real-world setting. We could not consider the therapist as an additional level in our hierarchical linear models structure because of a limited number of repeated assessments. The model that included therapist as a random effect yielded a variance estimate of 0, and its statistical significance could not be evaluated because of nonconvergence. Finally, the efficacy sample used to benchmark the fidelity ratings was not a randomized sample.
Our investigation indicates that when intensive expert supervision is used in community mental health settings, DT is not inferior to CT on change in depression for the treatment of MDD. Both treatments were delivered in this community mental health setting with high fidelity and could be discriminated from one another.
Correction: This article was corrected on November 30, 2016, to fix the spelling of an author’s first name.
Corresponding Author: Mary Beth Connolly Gibbons, PhD, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, 3535 Market St, Room 649, Philadelphia, PA 19104 (firstname.lastname@example.org).
Accepted for Publication: June 12, 2016.
Published Online: August 3, 2016. doi:10.1001/jamapsychiatry.2016.1720
Author Contributions: Dr Connolly Gibbons had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: Connolly Gibbons, Thompson, K. Crits-Christoph, Jacobs, P. Crits-Christoph.
Acquisition, analysis, or interpretation of data: Connolly Gibbons, Gallop, Luther, K. Crits-Christoph, Jacobs, Yin, P. Crits-Christoph.
Drafting of the manuscript: Connolly Gibbons, Gallop, Thompson, Yin, P. Crits-Christoph.
Critical revision of the manuscript for important intellectual content: Connolly Gibbons, Thompson, Luther, K. Crits-Christoph, Jacobs, Yin, P. Crits-Christoph.
Statistical analysis: Connolly Gibbons, Gallop, P. Crits-Christoph.
Obtaining funding: Thompson.
Administrative, technical, or material support: Connolly Gibbons, Thompson, Luther, K. Crits-Christoph, Jacobs, Yin, P. Crits-Christoph.
Study supervision: Connolly Gibbons, Thompson, K. Crits-Christoph, Jacobs.
Conflict of Interest Disclosures: None reported.
Funding/Support: This study was supported by award R01HS018440 from the Agency for Healthcare Research and Quality (Dr Connolly Gibbons).
Role of the Funder/Sponsor: The funding source had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Disclaimer: The content is solely the responsibility of the authors and does not necessarily represent the official views of the Agency for Healthcare Research and Quality.
Previous Presentation: These data were presented in part at the 46th Annual Meeting of the Society for Psychotherapy Research; June 25, 2015; Philadelphia, Pennsylvania.