The size of the dots reflects the study sample size.
eAppendix 1. Search string for PubMed
eAppendix 2. Algorithm for selecting one depression measure for each study
eAppendix 3. Flowchart on the selection of studies
eAppendix 4. Key characteristics of the included studies
eAppendix 5. References of the included studies
eAppendix 6. Description of sample of included studies
eAppendix 7. Sensitivity analyses: only low risk of bias
eAppendix 8. Sensitivity analyses: only CBT
eAppendix 9. Sensitivity analyses: specific target populations excluded
eAppendix 10. Sensitivity analyses: only diagnosed mood disorders
eAppendix 11. Sensitivity analyses using the multilevel meta-analysis model
eAppendix 12. Long-term effects across different age groups
eAppendix 13. Long-term effects of psychotherapies
Cuijpers P, Karyotaki E, Eckshtain D, et al. Psychotherapy for Depression Across Different Age Groups: A Systematic Review and Meta-analysis. JAMA Psychiatry. 2020;77(7):694–702. doi:10.1001/jamapsychiatry.2020.0164
Do psychotherapies for depression have comparable outcomes in age groups across the life span?
In a meta-analysis of 366 randomized clinical trials including 36 702 patients comparing psychotherapy with control conditions, psychotherapies had lower effect sizes in children and adolescents compared with adults, and no significant differences were found between middle-aged and older adults. However, conclusions are not definitive, given the low quality of many studies, the risk of publication bias, and the high heterogeneity among the studies.
There is a need to improve psychotherapies in children and adolescents.
It is not clear whether psychotherapies for depression have comparable effects across the life span. Finding out is important from a clinical and scientific perspective.
To compare the effects of psychotherapies for depression between different age groups.
Four major bibliographic databases (PubMed, PsycINFO, Embase, and Cochrane) were searched for trials comparing psychotherapy with control conditions up to January 2019.
Randomized trials comparing psychotherapies for depression with control conditions in all age groups were included.
Data Extraction and Synthesis
Effect sizes (Hedges g) were calculated for all comparisons and pooled with random-effects models. Differences in effects between age groups were examined with mixed-effects subgroup analyses and in meta-regression analyses.
Main Outcomes and Measures
Depressive symptoms were the primary outcome.
After removing duplicates, 16 756 records were screened and 2608 full-text articles were retrieved. Of these, 366 trials (36 702 patients) with 453 comparisons between a therapy and a control condition were included in the qualitative analysis, including 13 (3.6%) in children (13 years and younger), 24 (6.6%) in adolescents (≥13 to 18 years), 19 (5.2%) in young adults (≥18 to 24 years), 242 (66.1%) in middle-aged adults (≥24 to 55 years), 58 (15.8%) in older adults (≥55 to 75 years), and 10 (2.7%) in older old adults (75 years and older). The overall effect size of all comparisons across all age groups was g = 0.75 (95% CI, 0.67-0.82), with very high heterogeneity (I2 = 80%; 95% CI, 78-82). Mean effect sizes for depressive symptoms in children (g = 0.35; 95% CI, 0.15-0.55) and adolescents (g = 0.55; 95% CI, 0.34-0.75) were significantly lower than those in middle-aged adults (g = 0.77; 95% CI, 0.67-0.87). The effect sizes in young adults (g = 0.98; 95% CI, 0.79-1.16) were significantly larger than those in middle-aged adults. No significant difference was found between older adults (g = 0.66; 95% CI, 0.51-0.82) and older old adults (g = 0.97; 95% CI, 0.42-1.52). The outcomes should be considered with caution because of the suboptimal quality of most of the studies and the high levels of heterogeneity. However, most primary findings proved robust across sensitivity analyses addressing risk of bias, target populations included, type of therapy, diagnosis of mood disorder, and method of data analysis.
Conclusions and Relevance
Trials included in this meta-analysis reported effect sizes of psychotherapies that were smaller in children than in adults, probably also smaller in adolescents, and possibly somewhat larger in young adults, without meaningful differences between middle-aged adults, older adults, and older old adults.
Hundreds of randomized clinical trials (RCTs) have examined the effects of psychological treatments for depression. However, most of this research has been conducted separately in children and adolescents,1-3 in younger4 and middle-aged adults,5-8 and in older adults.9-12 Therefore, it is not known whether therapies have comparable effects across different age groups. Although some research has focused on differential effects of therapies in younger and older adults,13 to our knowledge, there is no meta-analysis focusing on psychotherapies across the age range from children and adolescents to younger and older adults.
It is important to study effects of therapies across age groups for several reasons. First, differences identified between age groups can inform clinicians about the potential of treatments across age groups and may help inform treatment selection. Second, differential effects may also indicate differences between age groups in what procedures are required for symptom reduction, in the working mechanisms of the therapies, and even in the psychological processes involved in depression.
We conducted a systematic review and meta-analysis of psychotherapies to examine whether the reported outcomes differ by age group. In the past 15 years, we have conducted a series of meta-analyses on psychotherapies for children and adolescents on the one hand,1,2,14,15 and adults and older adults on the other hand.7-9,16 The child and adolescent meta-analyses have shown relatively modest mean effect sizes, whereas the adult and older adult meta-analyses on depression treatment have shown more substantial effect sizes. However, such differences can only be fairly tested when the same methods of analysis are applied consistently across age groups. By combining our databases and methods, we are able to examine the reported outcomes of psychotherapies for depression across the life span and explore potential associations.
We used 2 existing databases of studies on psychotherapies for depression, 1 in adults and 1 in youths. The database in adults was used as the starting point of this meta-analysis because it is the larger one, and the studies identified through the second database were added to it. The adult database has been described in detail elsewhere.17 Four bibliographic databases were searched (PubMed, PsycINFO, Embase, and Cochrane) by combining terms indicative of depression and psychotherapies, with filters for RCTs. The full search string for PubMed is given in eAppendix 1 in the Supplement. For this meta-analysis, we also checked the references of studies that were excluded from the adult database because of their focus on youths. The database is continuously updated (last update: January 1, 2019). All records were screened independently by 2 researchers, and all articles that met initial inclusion criteria were retrieved as full text. The decision to include or exclude a study was also made independently by the 2 researchers, and disagreements were resolved through discussion.
The second database focused on studies of psychotherapies for depression in children and adolescents. The searches were conducted in PubMed and PsycINFO1 up to December 2017 and were updated to January 1, 2019, through the searches for the adult database.
We included (1) RCTs in which (2) a psychological treatment (3) for depression in any age group (4) was compared with a control group (waiting list, usual care, pill placebo, or other control condition such as a brochure or a general discussion group). Depression could be established with a diagnostic interview (any of the versions of the DSM or International Statistical Classification of Diseases and Related Health Problems or the Research Diagnostic Criteria) or with a score greater than a cutoff on a standardized measure. We allowed different treatment formats, including individual, group, telephone, and guided self-help (through the internet or not).16 We included studies on treatment-resistant and chronic depression because the patients in these studies do meet criteria for depression.18 We excluded studies on bipolar and psychotic depression, on self-guided therapies without any professional support, on inpatients, and maintenance studies.
In line with previous meta-analyses,4,5,8,11 we assessed the risk of bias of included studies using 4 criteria of the risk of bias assessment tool, developed by the Cochrane Collaboration19: adequate generation of allocation sequence; concealment of allocation to conditions; masking of assessors; and dealing with incomplete outcome data (this was assessed as positive when intention-to-treat analyses were conducted). All 4 items were rated as positive (the criterion was met) or negative (the criterion was not met or it was unclear). The total risk of bias score for each study was calculated as the sum of all positive scores (range 0 to 4, with 4 indicating no risk of bias). Because it is not clear whether the use of self-report measures should be rated as positive or negative,20 we used 2 ways of assessing the total risk of bias score, one in which the use of self-report was rated as positive and one in which it was rated negative. Assessment of risk of bias was conducted independently by 2 researchers, and disagreements were resolved through discussion.
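The scoring rule described above can be illustrated with a minimal sketch. The class and function names are hypothetical and are not taken from the study's materials; the sketch only assumes, as stated in the text, that each of the 4 criteria is rated positive or negative and that self-report outcomes are scored in 2 alternative ways.

```python
from dataclasses import dataclass


@dataclass
class RiskOfBiasRatings:
    """Hypothetical container for the 4 criteria rated per study."""
    sequence_generation: bool      # adequate generation of allocation sequence
    allocation_concealment: bool   # concealment of allocation to conditions
    masked_assessors: bool         # outcome assessors were masked
    intention_to_treat: bool       # incomplete outcome data handled via ITT
    self_report_outcome: bool = False


def total_score(ratings: RiskOfBiasRatings, self_report_is_positive: bool) -> int:
    """Sum of positive criteria (0 to 4; 4 indicates no risk of bias)."""
    # The masking criterion is met by masked assessors, or by self-report
    # only under the variant that rates self-report as positive.
    masking_met = ratings.masked_assessors or (
        ratings.self_report_outcome and self_report_is_positive
    )
    criteria = [
        ratings.sequence_generation,
        ratings.allocation_concealment,
        masking_met,
        ratings.intention_to_treat,
    ]
    return sum(criteria)


study = RiskOfBiasRatings(True, True, False, True, self_report_outcome=True)
lenient = total_score(study, self_report_is_positive=True)
strict = total_score(study, self_report_is_positive=False)
```

Under these assumptions the same study can count as meeting all criteria in one variant and only 3 in the other, which is why the 2 totals are reported separately.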
We examined age of the participants in different ways. First, we extracted the mean age of the sample. Second, we categorized the studies into 6 specific age categories: (1) children (mean age ≤13 years); (2) adolescents (13-18 years); (3) young adults (18-24 years); (4) middle-aged adults (24-55 years); (5) older adults (55-75 years); and (6) older old adults (75 years and older). Third, we clustered these 6 categories into 3 main age categories: youths (children and adolescents); early-middle adults (young adults and middle-aged adults); and late adults (older adults and older old adults). The other characteristics of the participants, interventions, and the study are presented in Table 1 (definitions and categories can be found elsewhere).17
For each comparison between a psychotherapy and a control condition, the effect size indicating the difference in depression between the 2 groups at posttest was calculated (Hedges g).21 Depressive symptoms were the primary outcome, and we used all measures examining depressive symptoms. Effect sizes were calculated by subtracting (at posttest) the mean score of the psychotherapy group from the mean score of the control group, divided by the pooled standard deviation at posttest, and corrected for small sample bias.21 If means and standard deviations were not reported, we used the procedures of the Comprehensive Meta-Analysis (CMA) software, version 3.3070, to calculate the effect size using dichotomous outcomes, or other statistics (such as t value or P value).
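As an illustration of the calculation described above, the following sketch computes Hedges g from posttest means and standard deviations. It uses the common approximation J = 1 - 3/(4df - 1) for the small-sample correction and the usual large-sample variance formula. The study itself used CMA, so this is an assumed equivalent rather than the authors' code, and the input numbers are invented.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (g, variance of g) for therapy (t) vs control (c) at posttest.

    Positive g means lower depression scores in the therapy group.
    """
    df = n_t + n_c - 2
    # Pooled standard deviation at posttest.
    s_pooled = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    # Control mean minus therapy mean, in pooled-SD units (Cohen d).
    d = (mean_c - mean_t) / s_pooled
    # Approximate small-sample correction factor.
    j = 1 - 3 / (4 * df - 1)
    var_d = (n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c))
    return j * d, j**2 * var_d


# Invented example: 30 patients per arm, 4-point difference, SD of 5.
g, var_g = hedges_g(mean_t=10.0, sd_t=5.0, n_t=30, mean_c=14.0, sd_c=5.0, n_c=30)
```

With these invented inputs the uncorrected difference is 0.8 pooled standard deviations, and the correction shrinks it slightly.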
If 1 study included more than 1 depression measure, we pooled the effect sizes within the study before pooling the effect sizes across studies (assuming a correlation of 1.0 between measures).22 We conducted sensitivity analyses in which we included only 1 outcome measure for each study (the algorithm for selecting measures is given in eAppendix 2 in the Supplement). All effect sizes were calculated in CMA.
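A minimal sketch of the within-study pooling step is given below. It assumes the standard formula for the variance of a mean of correlated effect sizes, with the correlation fixed at 1.0 as stated above; the function name and the example values are hypothetical.

```python
import math


def combine_outcomes(effects, variances, r=1.0):
    """Average several effect sizes from 1 study into a single effect size.

    The variance of the mean accounts for a common correlation r between
    the measures; r = 1.0 is the most conservative assumption.
    """
    m = len(effects)
    mean_effect = sum(effects) / m
    total = 0.0
    for i in range(m):
        for j in range(m):
            if i == j:
                total += variances[i]
            else:
                total += r * math.sqrt(variances[i] * variances[j])
    return mean_effect, total / m**2


# Invented example: 2 depression measures reported in the same trial.
combined_g, combined_var = combine_outcomes([0.5, 0.7], [0.04, 0.09])
```

With r = 1.0 the combined standard error equals the mean of the individual standard errors, so a study gains no precision from reporting several measures.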
We pooled the effect sizes using the “meta” and “metafor” packages in R, and ran all analyses in RStudio, version 1.1.463, for Mac (the R Foundation). We used a random-effects pooling model in all analyses. We pooled the effect sizes using the inverse variance method, with the Hartung-Knapp adjustment for the random-effects model. We calculated the I2 statistic and its 95% confidence interval to estimate heterogeneity.23 We examined the risk of publication bias through the Egger test24 and through the Duval and Tweedie trim-and-fill procedure,25 which yields an estimate of the effect size after the publication bias has been taken into account.
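The pooling approach can be sketched as follows. The analyses were run in R; this Python version is only an illustration, and it assumes the DerSimonian-Laird estimator for the between-study variance, which the text does not specify. The Hartung-Knapp step replaces the usual standard error with one based on the weighted residuals, and the confidence interval would then use a t quantile with k - 1 degrees of freedom (not computed here).

```python
import math


def random_effects_hk(effects, variances):
    """Inverse-variance random-effects pooling with Hartung-Knapp SE.

    Returns (pooled effect, Hartung-Knapp SE, tau^2, I^2 in percent).
    tau^2 uses the DerSimonian-Laird estimator (an assumption).
    """
    k = len(effects)
    df = k - 1
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran Q around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi**2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random-effects weights and pooled estimate.
    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum_w_re
    # Hartung-Knapp variance: weighted residual mean square over total weight.
    hk_var = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w_re, effects)) / (
        df * sum_w_re
    )
    return pooled, math.sqrt(hk_var), tau2, i2


# Invented example: 3 equally precise studies with very different effects.
pooled, se_hk, tau2, i2 = random_effects_hk([0.2, 0.8, 1.4], [0.01, 0.01, 0.01])
```

In this invented example nearly all of the observed variation is between studies, so I2 is close to 100%.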
We conducted subgroup analyses according to the mixed-effects model, in which effect sizes within subgroups are pooled according to the random-effects model and differences between subgroups are tested with a fixed-effects model. We conducted bivariate meta-regression analyses (with mean age as the variable associated with the effect size) and multivariate meta-regression analyses to examine the association between the mean age of the study participants and the effect size, while adjusting for other characteristics of the studies. We conducted a series of sensitivity analyses in which we included only studies with low risk of bias, only studies examining cognitive behavioral therapy, only studies with patients meeting diagnostic criteria for a mood disorder, and one in which specific target groups were excluded. In addition, we conducted a sensitivity analysis that used the multilevel meta-analysis model, which accounts for the dependency within studies.26 We also calculated the relative risk (RR) of dropping out from the study (for any reason) for all comparisons. All tests were 2-sided, and P values were considered significant when they were smaller than .05.
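The between-subgroup test can be illustrated with a short sketch. It assumes that each subgroup has already been pooled with a random-effects model, and it then compares the subgroup estimates with a fixed-effect Q statistic. The P value helper covers only the pairwise case with 1 degree of freedom. Function names and input values are hypothetical.

```python
import math


def q_between(subgroup_effects, subgroup_variances):
    """Fixed-effect Q statistic for differences between subgroup estimates.

    Inputs are the pooled effect and its variance for each subgroup.
    The statistic has (number of subgroups - 1) degrees of freedom.
    """
    w = [1.0 / v for v in subgroup_variances]
    overall = sum(wi * gi for wi, gi in zip(w, subgroup_effects)) / sum(w)
    return sum(wi * (gi - overall) ** 2 for wi, gi in zip(w, subgroup_effects))


def p_value_df1(q):
    """Upper-tail chi-square probability for 1 degree of freedom."""
    return math.erfc(math.sqrt(q / 2.0))


# Invented pairwise comparison: 2 subgroups, each with SE = 0.1.
q = q_between([0.4, 0.8], [0.01, 0.01])
p = p_value_df1(q)
```

For 2 subgroups this statistic reduces to the squared difference between the estimates divided by the sum of their variances.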
We examined 16 756 records, retrieved 2608 full-text articles, and excluded 2242. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flowchart describing the inclusion process, with reasons for exclusion, is presented in eAppendix 3 in the Supplement. Three hundred sixty-six RCTs (with 453 comparisons between a psychotherapy and a control group) met inclusion criteria. In these trials, 36 702 patients participated (19 544 in the treatment and 17 158 in the control conditions). Key characteristics of the included studies are given in eAppendix 4 and references in eAppendix 5 in the Supplement.
Of the 366 studies, 13 (3.6%) focused on children, 24 (6.6%) on adolescents, 19 (5.2%) on young adults (mostly college students), 242 (66.1%) on middle-aged adults, 58 (15.8%) on older adults, and 10 (2.7%) on older old adults. Main characteristics of the studies are presented in eAppendix 6 in the Supplement, separately for youths, early-middle adults, and late adults. A more detailed description can be found in eAppendix 7 in the Supplement.
We examined whether the studies differed for key characteristics across age groups and found that the main age groups did indeed differ significantly on recruitment method, whether participants met criteria for a depressive disorder, and country where the study was conducted (eAppendix 6 in the Supplement). Regarding the intervention characteristics, we found that treatment format and number of treatment sessions differed significantly across age groups.
The risk of bias in most of the included studies was considerable. One hundred ninety-nine of 366 studies (54.4%) reported an adequate sequence generation. One hundred sixty-four studies (44.8%) reported allocation to conditions by an independent (third) party. Most studies used self-report measures (including parent-report measures of youth symptoms) as outcome (n = 240 [65.6%]), some conducted masked outcome assessments (n = 97 [26.5%]), and the remainder did not mask outcome assessors or did not clarify whether they were masked (n = 29 [7.9%]). In 220 studies (60.1%), intention-to-treat analyses were conducted. Only 106 studies (29.0%) met all quality criteria. Another 161 studies (44.0%) met 2 or 3 criteria, and the 99 remaining studies (27.0%) met no or only 1 criterion. Only 21 studies (5.7%) met all quality criteria when self-report measures were rated as negative for blinded outcomes assessments.
The overall effect size of all comparisons across all age groups was g = 0.75 (95% CI, 0.67-0.82), with very high heterogeneity (I2 = 80%; 95% CI, 78-82). The results are presented in Table 1.
We further examined these main results in several ways (Table 1). We examined the associations of multiple comparisons in 1 trial, we excluded outliers, we limited the studies to those with low risk of bias, we conducted the analyses using an alternative approach to select outcome measures (eAppendix 2 in the Supplement), and we examined publication bias. The studies with low risk of bias resulted in a considerably smaller effect size (g = 0.51; 95% CI, 0.42-0.60), and heterogeneity was still high (I2 = 76%; 95% CI, 72-80). Adjusting for publication bias resulted in an effect size of g = 0.43 (95% CI, 0.33-0.52; I2 = 80%; 95% CI, 85-87), with 126 imputed studies.
Type of control group was associated with effect size (Q2 = 16.66; P for difference <.001). Therefore, we also examined the differences between the main age categories according to type of control group (Table 2). We did not find significant differences for the effect sizes across the main age categories within the studies using a waiting-list control group, but we did find highly significant differences when usual care or other control groups were used. In studies with usual care and other control groups, smaller effects were found for children and adolescents in most analyses.
We conducted subgroup analyses to examine whether there were differences between the effect sizes found across the 6 specific age groups (Table 2). The effect size difference between the age groups was highly significant (Q5 = 29.57; P < .001). The effect size for children was the lowest (g = 0.35), and the effect size for adolescents (g = 0.55) was also relatively low. We found no indication that the effect sizes found for older adults (g = 0.66) and older old adults (g = 0.97) differed considerably from those for middle-aged adults (g = 0.77). The effect sizes across the different age groups are graphically presented in Figure 1.
In pairwise comparisons, the difference between children and adolescents was not significant (Q1 = 2.10; P = .14), but the differences between children and all other age categories were significant (Table 2). Effect sizes in adolescents differed significantly from young adults (Q1 = 10.35; P = .001) and middle-aged adults (Q1 = 3.96; P = .05), but there was no significant difference with older adults (Q1 = 0.85; P = .36) or older old adults (Q1 = 2.61; P = .11). Young adults differed from middle-aged adults (Q1 = 3.93; P = .05) and older adults (Q1 = 6.99; P = .01), but not from the older old adults (Q1 = 0.00; P = .99). Middle-aged adults did not differ significantly from older adults (Q1 = 1.31; P = .25) and older old adults (Q1 = 0.65; P = .42), and older adults did not differ significantly from older old adults (Q1 = 1.46; P = .23).
When we collapsed the age groups into 3 main age categories (youths, early-middle adults, and late adults), the difference between the 3 groups was still significant, with the smallest effect sizes found for youths (g = 0.48) and no clear difference between early-middle adults and late adults. Pairwise comparisons for the 3 main age categories indicated a significant difference between youths and early-middle adults (Q1 = 12.21; P < .001) and between youths and late adults (Q1 = 4.43; P = .04) but not between early-middle adults and late adults (Q1 = 0.96; P = .33).
We examined the association between the effect sizes and the mean age of the study population in a series of meta-regression analyses. First, we conducted bivariate meta-regression analyses, with the effect size as the dependent variable and the mean age as the only predictor. The mean age of the population was not found to be a significant predictor (coefficient [SE], 0.0001 [0.002]; P = .97). The bubble plot is given in Figure 2.
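A bivariate meta-regression of this kind can be sketched as a weighted least-squares fit of effect size on mean age. The sketch below assumes that the between-study variance (tau2) is supplied from a separate estimation step and uses the simple model-based standard error for the slope; it does not reproduce the R routines used in the study, and the data are invented.

```python
import math


def meta_regression_slope(ages, effects, variances, tau2=0.0):
    """Weighted regression of effect size on mean age.

    Weights are 1 / (within-study variance + tau2).
    Returns (intercept, slope, model-based SE of the slope).
    """
    w = [1.0 / (v + tau2) for v in variances]
    sum_w = sum(w)
    age_bar = sum(wi * a for wi, a in zip(w, ages)) / sum_w
    g_bar = sum(wi * g for wi, g in zip(w, effects)) / sum_w
    sxx = sum(wi * (a - age_bar) ** 2 for wi, a in zip(w, ages))
    sxy = sum(
        wi * (a - age_bar) * (g - g_bar) for wi, a, g in zip(w, ages, effects)
    )
    slope = sxy / sxx
    intercept = g_bar - slope * age_bar
    return intercept, slope, math.sqrt(1.0 / sxx)


# Invented data: effect size rises by 0.01 per year of mean age.
ages = [10.0, 30.0, 50.0, 70.0]
effects = [0.5 + 0.01 * a for a in ages]
intercept, slope, se_slope = meta_regression_slope(
    ages, effects, [0.04] * 4, tau2=0.01
)
```

A linear model of this form can miss a curvilinear pattern, in which effects rise and then fall with age while the fitted slope stays near zero.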
Then we conducted multivariate meta-regression analyses with the mean age as predictor as well as the characteristics of the studies as predictors (Table 3). Mean age was still not significantly associated with the effect size. To avoid overfitting the meta-regression models, we repeated these meta-regression analyses with a (manual) stepwise backward elimination of the least significant predictor until only significant predictors remained in the model. As can be seen in Table 3 (parsimonious models), mean age was still not significantly associated with the effect size.
We also conducted multivariate meta-regression analyses with the 6 specific age categories (middle-aged adults were used as the reference category; Table 3). Studies with children yielded significantly lower effect sizes than studies with middle-aged adults, and that was true in the full model (coefficient [SE], −0.58 [0.23]; P = .01) and the parsimonious model (coefficient [SE], −0.49 [0.21]; P = .02). When we examined the 3 main age categories, studies in youths also resulted in significantly lower effect sizes in both the full (coefficient [SE], −0.26 [0.13]; P = .01) and the parsimonious model (coefficient [SE], −0.38 [0.15]; P = .04).
In the sensitivity analyses (eAppendices 8-12 in the Supplement), the subgroup analyses indicating significant differences between the age groups (3 and 6 categories) were all significant or there was a trend for significance. In most analyses, the results were very similar to the main analyses. The sensitivity analyses using the multilevel meta-analysis model (eAppendix 12 in the Supplement) resulted in comparable outcomes and supported the main findings of this study.
Study dropout was significantly higher in the psychotherapy conditions compared with the control conditions (RR, 1.15; 95% CI, 1.06-1.24; I2 = 40%; 95% CI, 32-48), but there were no significant differences between the 6 specific age categories or between the 3 main age categories.
We conducted separate analyses for longer-term outcomes at 6 to 8 months after baseline, 9 to 12 months after baseline, 13 to 24 months after baseline, and more than 24 months after baseline (eAppendix 13 in the Supplement). In none of these analyses were the effects of therapies for children and adolescents significantly different from zero. Between 9 and 12 months, the differences between the main age categories were significant (with smaller effect sizes for children and adolescents; P = .03) but not for other follow-up periods.
To our knowledge, this is the first meta-analysis of psychotherapy trials for depression in all age groups and the largest meta-analysis of depression psychotherapies ever conducted. We found that the effect sizes for therapies in children and adolescents were significantly smaller than those found in adults. Effect sizes were especially small in children 13 years and younger, but effect sizes for adolescents were also significantly smaller than those for adults. These differences were not supported in all meta-regression analyses in which we adjusted for other characteristics of the studies, which may indicate that the significant associations may be partly explained by other characteristics of the studies.
It is not clear how to explain the differences between children/adolescents and adults. Possibly, therapies are less effective among youths. However, these differences may also be explained by differences in study or intervention characteristics. For example, we found no significant difference between age groups when waiting lists were used as a control condition; this is important because waiting list is arguably the only control condition that is identical across all age groups. Several differences between therapies should also be noted, such as the involvement of parents in some therapies for children and adolescents and specific therapies for specific age groups, such as life review therapy for older adults. Another possibility is that the therapies most often used with children and adolescents are primarily age-adapted versions of therapies originally designed for adults. Those therapies may be a better fit to the needs of adults than those of young people. Another possibility is that young people’s potential for recovery from depression is constrained by parental and family characteristics that youths, unlike adults, have little opportunity to escape or alter.10,11 Whatever the reason, this meta-analysis brings to light potential limitations of currently available psychological treatments for depression in children and adolescents, the need for improvement of treatments for these age groups, and the need for more outcome research.
One remarkable finding was that almost all research on children and adolescents was conducted in the United States, while research in early-middle and older adults was very comparable between North America and Europe. It is difficult to understand what the causes of this difference are.
In this meta-analysis, we also found that the effect sizes of therapies were somewhat larger in young adults (≥18 to 24 years) compared with middle-aged adults (≥24 to 55 years). One possible reason is that many studies of young adults were conducted with college students, who may have more learning capacity and may be easier to recruit. A second reason relates to differences in the characteristics of the studies. Another option is that age may not be associated with the effects of treatment in a linear way, as is assumed by the meta-regression analyses, but that there is a curvilinear association. The data suggest that the effect sizes are small in childhood, become larger in adolescents, and grow further in early adulthood, before decreasing to more modest effect sizes in the rest of the adult populations. We did not test this because of the high heterogeneity and the fact that most studies in young adults were with college students, who may not be representative of the whole age group. However, this is certainly a possibility that should be further explored in future studies.
The results of this study have to be considered with caution because of several important limitations. First, although there was a large number of trials, the quality of many trials was low, and when we limited the main analyses to studies with the highest quality, the differences between the age groups were no longer significant. Second, we did not rate the number of disagreements between raters of characteristics of the studies or calculate agreement between raters. This was related to the large number of included studies and the fact that the ratings were performed over several years while building this database. Third, heterogeneity was very high in most analyses, and subgroup analyses could not explain these high levels. This suggests that the effects differed considerably across studies, and it is not clear what caused these differences. Fourth, the depression measures differed considerably across studies, with no measures that were used across all age groups. This not only contributed to the heterogeneity of the studies but also made it impossible to examine the influence of baseline severity on the effect sizes. Fifth, most studies were conducted with cognitive behavioral therapy, and the number of studies with other therapies was very limited. Sixth, because of the small number of studies, the long-term outcomes cannot be considered definitive.
Despite the limitations of this meta-analysis, we can cautiously conclude that the effect sizes of studies of psychotherapy for depression are smaller in children than in adults, probably also smaller in adolescents, and may be somewhat larger in young adults, and that there are probably no meaningful differences between middle-aged adults, older adults, and older old adults.
Corresponding Author: Pim Cuijpers, PhD, Department of Clinical, Neuro and Developmental Psychology, Amsterdam Public Health Research Institute, Vrije Universiteit Amsterdam, Van der Boechorststraat 7-9, Amsterdam, Noord-Holland 1081 BT, the Netherlands (firstname.lastname@example.org).
Accepted for Publication: January 16, 2020.
Published Online: March 18, 2020. doi:10.1001/jamapsychiatry.2020.0164
Author Contributions: Dr Cuijpers had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: Cuijpers, Eckshtain, Quero Castellano, Weisz.
Acquisition, analysis, or interpretation of data: Cuijpers, Karyotaki, Eckshtain, Ng, Corteselli, Noma, Weisz.
Drafting of the manuscript: Cuijpers, Eckshtain, Weisz.
Critical revision of the manuscript for important intellectual content: Karyotaki, Eckshtain, Ng, Corteselli, Noma, Quero Castellano, Weisz.
Statistical analysis: Cuijpers, Karyotaki, Noma.
Administrative, technical, or material support: Eckshtain, Corteselli, Weisz.
Conflict of Interest Disclosures: Dr Noma reported personal fees from Boehringer Ingelheim, Kyowa Hakko Kirin, and ASKA Pharmaceutical outside of the submitted work. Dr Cuijpers received expense allowances for his membership of the Board of Directions of “Mind.nl,” for being Chair of the Research committee of the Dutch Council for military care and research, and for being Chair of the Mental Health Priority Area of the Wellcome Trust in London, England, in 2018. In addition, he received royalties for books he has authored or coauthored and for occasional workshops and invited addresses. Dr Weisz received payments for consulting with the Child Health and Development Institute and the National Institute of Mental Health, royalties for books he has authored and coauthored, and honoraria for workshops and invited presentations at professional meetings and conferences. No other disclosures were reported.