Flow of patients through the study and numbers becoming depressed. CIDI indicates Composite International Diagnostic Interview; DNA, did not attend; N/D, not depressed; and D, depressed.
Plots of mean predicted probability against observed probability of depression within deciles of predicted risk. A, Model fitted in the European data using unshrunk coefficients. B-D, European model fitted in the Chilean data using shrunk coefficients. Risk scores are based on the average of the 6 European country coefficients (the UK coefficient is 0) (B) and the coefficients for Spain and Slovenia, respectively (C and D). Each point on the graphs represents a decile of risk.
Examples of a range of predicted probabilities of depression at baseline. Mean (SD) Short Form 12 (SF-12) mental and physical subscale scores for Europe were 48.5 (10.6) and 44.2 (11.0), respectively. High scores indicate good health/well-being. Scores in parentheses correspond to eliminating discrimination and work difficulties and correcting SF-12 physical and mental health scores to the European mean (see text).
King M, Walker C, Levy G, Bottomley C, Royston P, Weich S, Bellón-Saameño JÁ, Moreno B, Švab I, Rotar D, Rifel J, Maaroos H, Aluoja A, Kalda R, Neeleman J, Geerlings MI, Xavier M, Carraça I, Gonçalves-Pereira M, Vicente B, Saldivia S, Melipillan R, Torres-Gonzalez F, Nazareth I. Development and Validation of an International Risk Prediction Algorithm for Episodes of Major Depression in General Practice Attendees: The PredictD Study. Arch Gen Psychiatry. 2008;65(12):1368-1376. doi:10.1001/archpsyc.65.12.1368
Strategies for prevention of depression are hindered by lack of evidence about the combined predictive effect of known risk factors.
To develop a risk algorithm for onset of major depression.
Cohort of adult general practice attendees followed up at 6 and 12 months. We measured 39 known risk factors to construct a risk model for onset of major depression using stepwise logistic regression. We corrected the model for overfitting and tested it in an external population.
General practices in 6 European countries and in Chile.
In Europe and Chile, 10 045 attendees were recruited April 2003 to February 2005. The algorithm was developed in 5216 European attendees who were not depressed at recruitment and had follow-up data on depression status. It was tested in 1732 patients in Chile who were not depressed at recruitment.
Main Outcome Measure
DSM-IV major depression.
Sixty-six percent of people approached participated, of whom 89.5% participated again at 6 months and 85.9%, at 12 months. Nine of the 10 factors in the risk algorithm were age, sex, educational level achieved, results of lifetime screen for depression, family history of psychological difficulties, physical health and mental health subscale scores on the Short Form 12, unsupported difficulties in paid or unpaid work, and experiences of discrimination. Country was the tenth factor. The algorithm's average C index across countries was 0.790 (95% confidence interval [CI], 0.767-0.813). Effect size for difference in predicted log odds of depression between European attendees who became depressed and those who did not was 1.28 (95% CI, 1.17-1.40). Application of the algorithm in Chilean attendees resulted in a C index of 0.710 (95% CI, 0.670-0.749).
This first risk algorithm for onset of major depression functions as well as similar risk algorithms for cardiovascular events and may be useful in prevention of depression.
Reducing the prevalence of depression is a public health challenge for the 21st century. Depression occurs in up to a quarter of general practice attendees,1 relapse 10 years from first presentation is frequent,2 and both residual disability and premature mortality are common.3 Low socioeconomic status4,5 and female sex6 are the 2 most consistently identified risk factors. Socioeconomic risk factors include low income and financial strain,4 unemployment,4 work stress,7 social isolation,8 and poor housing.5 Other factors, such as family history of depression, play a part.9 Additional risk factors identified in adult general practice populations are negative life events, poor physical health, poor marital or other interpersonal relationships, a partner or spouse's poor health, and problems with alcohol.10 Poor social support, loneliness, and physical disability appear to be particular risks for older adults.11-13 Estimating overall risk across a range of likely risk factors is essential in efforts to prevent depression. However, effective strategies for prevention are hindered by lack of evidence about the combined effect of known risk factors. Our objectives were to develop a risk algorithm for the onset of major depression in European general practice attendees and test its predictive power in a non-European setting. We modeled our approach on risk indexes for cardiovascular disease,14 which provide a percentage risk estimate over a given period.
We undertook a prospective study to develop a quantitative risk prediction algorithm for the onset of major depression over 12 months in general practice attendees. Given the relapsing and remitting nature of major depression, 12 months was considered a useful period for prediction in this setting. The method, described in detail elsewhere,15 was approved by ethical committees in each country. The study was conducted in 6 European centers: (1) 25 general practices in the Medical Research Council General Practice Research Framework in the United Kingdom; (2) 9 large primary care centers in Andalucía, Spain; (3) 74 general practices nationwide in Slovenia; (4) 23 general practices nationwide in Estonia; (5) 7 large general practice centers near Utrecht, the Netherlands; and (6) 2 large primary care centers in the Lisbon area of Portugal. We assessed the external validity of the risk algorithm in patients attending 78 general practices in Concepción and Talcahuano in the Eighth Region of Chile. General practices covered urban and rural populations with considerable socioeconomic variation.
Consecutive attendees aged 18 to 75 years were recruited in Europe between April 2003 and September 2004 and in Chile between October 2003 and February 2005. Exclusion criteria were an inability to understand one of the main languages involved, psychosis, dementia, and incapacitating physical illness. Recruitment differed slightly in each country because of local service preferences. In the United Kingdom and the Netherlands, researchers spoke to patients waiting to see practice staff. In the remaining European countries, physicians introduced the study before contact with researchers. In Chile, attendees were stratified on age and sex according to figures for the populations served by each health center and participants selected randomly within each stratum. Participants gave informed consent and undertook a research evaluation within 2 weeks.
A DSM-IV diagnosis of major depression in the preceding 6 months was made using the depression section of the Composite International Diagnostic Interview (CIDI).16,17 We selected risk factors to cover all important areas identified in a systematic review of the literature.18 Where possible, we used standardized self-report measures. Questions adapted from standardized questionnaires or developed for the study were evaluated for test-retest reliability in 285 general practice attendees evenly recruited across the European countries before the main study began.15 Each instrument or question not available in the relevant languages was translated from English and back-translated by professional translators.15 The 39 candidate risk factors are numbered, and those subjected to test-retest reliability are italicized.
(1) Age, (2) sex, (3) occupation, (4) educational level, (5) marital status, (6) employment status, (7) ethnicity, (8) owner-occupier accommodation, (9) living alone or with others, (10) born in country of residence or abroad, (11) satisfaction with living conditions, and (12) long-standing physical illness.
(13) Lifetime depression was based on affirmative answers to both of the first 2 questions of the CIDI depression section.19
Stress in paid and unpaid work in the preceding 6 months using questions from the job content instrument.20 Participants were categorized as feeling in control in (14) paid or (15) unpaid work; (16) experiencing difficulties without support in paid or unpaid work; and (17) experiencing distress without feeling respect for their paid or unpaid work.
(18) Financial strain using a question used in UK government social surveys.4
Self-rated (19) physical and (20) mental health were assessed by the Short Form 12.21 The weights used to calculate scores are from version 1.
(21) Alcohol use in the preceding 6 months using the Alcohol Use Disorders Identification Test.22 (22) We asked whether participants had ever had an alcohol problem or received treatment for one.
(23) Whether participants had ever used recreational drugs using adapted sections of the CIDI.
Questions on the quality of (24) sexual and (25) emotional relationships with partners or spouses.23
(26) Presence of serious physical, psychological, or substance misuse problems, or any serious disability, in people who were in close relationship to participants.
(27) Difficulties in getting on with people and maintaining close relationships.24
Childhood experiences of (28) physical and/or emotional and (29) sexual abuse.25
(30) Holding religious and/or spiritual beliefs.26
(31) History of serious psychological problems or (32) suicide in first-degree relatives.27
(33) Anxiety and (34) panic symptoms in the previous 6 months using relevant sections of the Patient Health Questionnaire (PHQ).28
(35) Satisfaction with the neighborhood and (36) perceived safety inside/outside of the home using questions from the Health Surveys for England.29
(37) Major life events in the preceding 6 months using the List of Threatening Life Experiences Questionnaire.30
(38) Experiences of discrimination in the preceding 6 months on grounds of sex, age, ethnicity, appearance, disability, or sexual orientation using questions from a European study.31
(39) Adequacy of social support from family and friends.32
All participants were reevaluated for DSM-IV major depression, our main outcome, after 6 and 12 months using the depression section of the CIDI.
All analyses and data imputation were performed using Stata release 9.33 We included only patients without major depression at baseline. Participants with missing depression diagnoses at any point were excluded as this outcome was central to our risk estimation.
Missing data in candidate risk factors were imputed using the method of chained equations, implemented in the Stata command ice.34 We imputed 10 data sets35 and obtained combined estimates.36
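As an illustration of how estimates from several imputed data sets can be combined, the following minimal sketch pools one coefficient using Rubin's rules. This assumes that the combination followed Rubin's rules, which the text does not spell out; the function name `pool_rubin` and all numeric values are hypothetical and are not taken from the study data.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets using Rubin's rules."""
    m = len(estimates)
    pooled_estimate = mean(estimates)      # average of the m point estimates
    within = mean(variances)               # average within-imputation variance
    between = variance(estimates)          # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, sqrt(total)


# Hypothetical log odds ratios and squared standard errors from 10 imputed data sets
est = [0.50, 0.52, 0.48, 0.51, 0.49, 0.50, 0.53, 0.47, 0.50, 0.50]
var = [0.04] * 10

pooled, se = pool_rubin(est, var)
print(f"pooled estimate = {pooled:.3f}, pooled SE = {se:.4f}")
```

The pooled standard error is slightly larger than the within-imputation standard error because it also carries the uncertainty due to the missing values.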
We built a risk model using the 39 risk factors described earlier and country of residence of each participant. We developed this model in the imputed data using stepwise logistic regression with robust standard errors to adjust for general practice clustering. We used a conservative threshold for inclusion of P < .01 to produce a stable model and minimize the degree of overfitting. We retained age and sex in all regression models because of their well-known associations with onset of depression.37,38 We also retained country because of an a priori assumption of clustering within country. Multivariable fractional polynomial analysis was used to assess possible nonlinear effects of continuous predictors. The resulting risk score provides a predicted probability of depression over 12 months.
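The step from a fitted logistic model to a predicted probability can be sketched as follows. The coefficient names and values here are hypothetical placeholders chosen for illustration; they are not the fitted predictD coefficients, and only 3 of the model's factors are shown.

```python
from math import exp


def predicted_probability(intercept, coefficients, values):
    """Convert a logistic linear predictor (log odds) into a probability."""
    linear_predictor = intercept + sum(
        coefficients[name] * values[name] for name in coefficients
    )
    return 1 / (1 + exp(-linear_predictor))


# Hypothetical coefficients (log odds ratios) and one hypothetical attendee
coef = {"female": 0.3, "lifetime_depression": 0.9, "sf12_mental": -0.06}
patient = {"female": 1, "lifetime_depression": 1, "sf12_mental": 40.0}

p = predicted_probability(-1.0, coef, patient)
print(f"predicted 12-month probability = {p:.3f}")
```

A linear predictor of 0 corresponds to a probability of 0.5; negative values give probabilities below 0.5.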
We calculated the C index39 to estimate the discriminative power of the final model in each European country and all European countries combined. We used a calculation proposed by Copas40 to adjust for overfitting of our prediction model. This involves computing a shrinkage factor that is applied to the model coefficients to provide more accurate predictions when the risk algorithm is applied in new settings. To deal with the overfitting that arises through variable selection, we computed the shrinkage factor based on the initial model including all 39 variables. We assessed the goodness of fit of the final risk model by grouping individuals into deciles of risk and comparing the observed probability of major depression within these groups with the average risk. We calculated effect sizes using the Hedges g41 for the difference in log odds of predicted probability between patients who were later observed to be depressed and those who were not. Finally, we report the threshold values of risk score, and the associated sensitivity, for a range of specificity that would be practical (minimizing false positives) when using the instrument in a clinical setting.
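The two discrimination measures can be illustrated with a minimal sketch. The C index is computed here as the proportion of case-noncase pairs in which the case has the higher predicted probability (ties count as one-half), and the Hedges g as a standardized mean difference in log odds with the usual small-sample correction. The function names and the 6 toy observations are hypothetical, and this pairwise approach is a simple stand-in rather than the routine used in the study.

```python
from itertools import product
from math import log, sqrt
from statistics import mean, variance


def c_index(probs, outcomes):
    """Proportion of case/noncase pairs ranked correctly; ties count one-half."""
    cases = [p for p, y in zip(probs, outcomes) if y == 1]
    noncases = [p for p, y in zip(probs, outcomes) if y == 0]
    score = 0.0
    for p_case, p_non in product(cases, noncases):
        if p_case > p_non:
            score += 1.0
        elif p_case == p_non:
            score += 0.5
    return score / (len(cases) * len(noncases))


def hedges_g(a, b):
    """Standardized mean difference with small-sample bias correction."""
    na, nb = len(a), len(b)
    pooled_sd = sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))
    d = (mean(a) - mean(b)) / pooled_sd
    correction = 1 - 3 / (4 * (na + nb) - 9)
    return d * correction


def logit(p):
    """Log odds of a probability."""
    return log(p / (1 - p))


# Hypothetical predicted probabilities and observed outcomes (1 = became depressed)
probs = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60]
outcomes = [0, 0, 1, 0, 1, 1]

c = c_index(probs, outcomes)
g = hedges_g(
    [logit(p) for p, y in zip(probs, outcomes) if y == 1],
    [logit(p) for p, y in zip(probs, outcomes) if y == 0],
)
print(f"C index = {c:.3f}, Hedges g = {g:.2f}")
```

A C index of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of those who did and did not become depressed.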
We used the C index, Hedges g, and a comparison of predicted vs observed probability of depression to evaluate the performance of the predictD model in the Chilean data.
In the 7 countries, 10 045 people took part (Figure 1). Response to recruitment was high in Portugal (76%), Estonia (80%), Slovenia (80%), and Chile (97%) but lower in the United Kingdom (44%) and the Netherlands (45%). Ethical considerations prevented the collection of data on nonresponders at baseline. Across all countries, the response to follow-up was 89.5% at 6 months and 85.9% at 12 months. Women predominated in each country and prevalence of major depression at baseline was 13.9% in women and 8.5% in men. Seven thousand two hundred nine European participants had full CIDI data at recruitment to allow a depression diagnosis, of whom 6190 were not depressed at recruitment (Table 1). Of these, 5216 (84.3%) had full CIDI data for a depression diagnosis at 6 and 12 months' follow-up, and of these, 3972 (76.2%) also had full data on all 39 risk factors. Cumulative 12 months' incidence of DSM-IV major depression in the European population was 7.7% (United Kingdom, 8.8%; Spain, 15.1%; Slovenia, 4.2%; Portugal, 8.5%; the Netherlands, 5.4%; and Estonia, 5.9%). Missing information was less than 3% for 38 of the 39 risk factors; however, 12.6% of participants had missing data on their emotional relationship with a spouse or partner (risk factor 25 in the “Major Depression and Known Risk Factors” subsection).
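The percentages in the participant flow can be reproduced from the reported counts. The short check below uses only the numbers given in this paragraph; the variable names are illustrative.

```python
# Counts reported for the European sample
not_depressed_at_baseline = 6190
with_followup_outcome = 5216
complete_on_all_risk_factors = 3972

pct_followup = round(100 * with_followup_outcome / not_depressed_at_baseline, 1)
pct_complete = round(100 * complete_on_all_risk_factors / with_followup_outcome, 1)
pct_any_missing = round(100 - 100 * complete_on_all_risk_factors / with_followup_outcome)

print(pct_followup, pct_complete, pct_any_missing)
```

The last figure is the share of the development sample with at least 1 missing risk factor, which is the group for whom imputation was needed.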
In our reliability study prior to recruiting the cohort, all risk factors tested (except discrimination on skin color) produced κ coefficients of 0.59 to 1.00 and percentage of agreement of 67% to 100%. The κ coefficient for agreement on discrimination due to skin color was low because of the small number of nonwhite participants.15
The risk algorithm was developed on the 5216 European attendees who were not depressed at recruitment and who had data on our main outcome, DSM-IV major depression at 6 and 12 months. Nonlinear transformations of continuous variables did not significantly improve the model fit. Seven variables were retained at P < .01 and these were included with country, age, and sex in the regression model (Table 2). Five variables in the final model concerned past events or patient characteristics (sex, age, education, results of lifetime depression screen, family history of psychological difficulties); 4, current status (Short Form 12 physical health subscale score, Short Form 12 mental health subscale score, unsupported difficulties in paid and/or unpaid work, and discrimination) (Table 2); and 1 concerned country. Examination of the risk model derived in each of the 10 imputed data sets revealed that it was stable in terms of the variables selected. Besides country, age, and sex, 5 variables (results of lifetime depression screen, family history of psychological difficulties, Short Form 12 physical health subscale score, Short Form 12 mental health subscale score, and unsupported difficulties in paid and/or unpaid work) were consistently selected in each of the imputed data sets. Discrimination was selected in 7 data sets and education, in 4 data sets. Three other variables that did not reach the full model were also selected in a number of imputed data sets. These were PHQ panic syndrome (6 sets), childhood sexual abuse (1 set), and PHQ anxiety syndrome (1 set).
We compared a model with interactions between sex and the remaining risk factors to the model with no interactions. A Wald test provided no evidence to suggest that including interaction terms improves the model fit (χ² = 19.06, 16 df; P = .27). There was also no evidence for including interactions with age (χ² = 20.19, 16 df; P = .21).
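The reported P values can be checked against the test statistics, reading them as chi-square values on 16 degrees of freedom. The sketch below relies on the closed-form upper-tail probability of the chi-square distribution, which exists when the degrees of freedom are even; the function name is hypothetical.

```python
from math import exp, factorial


def chi2_upper_tail_even_df(x, df):
    """Upper-tail probability of a chi-square statistic; valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even number of degrees of freedom")
    half = x / 2
    return exp(-half) * sum(half**i / factorial(i) for i in range(df // 2))


p_sex = chi2_upper_tail_even_df(19.06, 16)   # interactions with sex
p_age = chi2_upper_tail_even_df(20.19, 16)   # interactions with age
print(f"P (sex) = {p_sex:.2f}, P (age) = {p_age:.2f}")
```

Both values round to the P values given in the text, so the statistics and degrees of freedom are mutually consistent.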
The average C index across countries for predicted probability of depression at 6 or 12 months in all 6 European countries was 0.790 (Table 3). The model was most predictive in the Netherlands (0.852) and least predictive in Portugal (0.747). The effect size for the difference in log odds of predicted probability between attendees in Europe who subsequently became depressed and those who did not was 1.28 (95% confidence interval [CI], 1.17-1.40) (Table 4). Again, the model discriminated best in the Netherlands (1.55) and least well in Portugal (0.99). To examine the fit of the model, we divided the European sample into deciles of predicted probability of depression on the predictD score. Within each decile, we plotted mean predicted probability vs observed probability of depression (Figure 2A). Figure 2A shows that the incidence of major depression in the highest decile of risk score in Europe was more than 30% in contrast to the overall incidence of 7.7%. Examples of the kinds of participants scoring at increasing levels of predicted probability of depression on the predictD score algorithm are shown in Figure 3. To demonstrate the potential impact of mutable factors on risk, scores in the last 3 examples in Figure 3 were recalculated after mutable risk factors were reduced or eliminated. Estimates of sensitivity and specificity of the risk score in predicting major depression over 12 months are shown in Table 5. Questions in the risk algorithm can be tested at http://www.techflora.com/ucl and a risk score obtained. (The questions, responses, coding, and algorithm for the predictD risk tool are available on request.)
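The decile comparison of predicted and observed probability can be sketched as follows. The helper name `calibration_by_decile` and the 20 toy observations are hypothetical; with real data each decile would contain several hundred attendees.

```python
from statistics import mean


def calibration_by_decile(probs, outcomes, groups=10):
    """Return (mean predicted, observed proportion) for each risk group, lowest first."""
    pairs = sorted(zip(probs, outcomes))
    n = len(pairs)
    rows = []
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if chunk:
            rows.append((mean(p for p, _ in chunk), mean(y for _, y in chunk)))
    return rows


# Hypothetical predicted probabilities (0.01 to 0.20) and outcomes (1 = became depressed)
probs = [0.01 * k for k in range(1, 21)]
outcomes = [0] * 16 + [0, 1, 0, 1]

rows = calibration_by_decile(probs, outcomes)
for predicted, observed in rows:
    print(f"predicted {predicted:.3f}  observed {observed:.2f}")
```

Points lying close to the line of equality in such a plot indicate that predicted risks match observed incidence within each decile.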
Cumulative 12-month incidence of major depression in Chilean general practice attendees was 11.6%. There were no missing data in Chile on any of the 10 risk factors of the final European model. The model was validated using data on 1732 attendees who were not depressed at recruitment (Figure 1). The Copas shrinkage factor for the European model was 0.866, suggesting a degree of overfitting. We evaluated the prediction algorithm's external validity in the Chilean data using the shrunk regression coefficients derived in the European data and comparing predicted with observed probability. Because country is included in the model, it was necessary to base risk scores in Chile on an assumed country effect. Using the coefficient for Spain gave better concordance between predicted and observed probability of major depression in Chile than using the coefficient for Slovenia (Figure 2C and D), which reflects the prevalence of depression in Chile being more similar to that in Spain than to that in Slovenia. The C index for the risk algorithm in Chile was 0.710 (95% CI, 0.670-0.749). This lower degree of discrimination can also be seen in the estimates of specificity and sensitivity in Chile (Table 5).
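Applying a shrinkage factor and choosing an assumed country effect can be illustrated with a minimal sketch. The shrinkage factor of 0.866 is the reported value; every coefficient below is a hypothetical placeholder (the UK coefficient is 0 because the United Kingdom is the reference country). How the intercept and the country coefficient were handled under shrinkage is not stated here, so the sketch leaves both unshrunk as a simplifying assumption.

```python
from statistics import mean

SHRINKAGE = 0.866  # Copas shrinkage factor reported for the European model


def shrink(coefficients, factor=SHRINKAGE):
    """Scale each risk-factor coefficient toward zero by the shrinkage factor."""
    return {name: factor * beta for name, beta in coefficients.items()}


def assumed_country_effect(country_coefficients, country=None):
    """Return one country's coefficient, or the average across countries if none is named."""
    if country is None:
        return mean(country_coefficients.values())
    return country_coefficients[country]


# Hypothetical values for illustration only
risk_coef = {"lifetime_depression": 1.0, "sf12_mental": -0.05}
country_coef = {"UK": 0.0, "Spain": 0.6, "Slovenia": -0.6,
                "Estonia": -0.3, "Netherlands": -0.3, "Portugal": 0.0}

shrunk = shrink(risk_coef)
average_effect = assumed_country_effect(country_coef)
spain_effect = assumed_country_effect(country_coef, "Spain")
print(shrunk, average_effect, spain_effect)
```

Choosing a country with a higher coefficient raises every predicted probability in the new setting, which is why the choice matters for agreement between predicted and observed risk.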
We have developed a risk score from recognized risk factors for major depression over 12 months in 5216 general practice attendees in Europe and validated its use in 1732 attendees in Chile. To our knowledge, this is the first risk algorithm to be developed simultaneously in a number of cultures in one continent for prediction of new episodes of major depression in a general medical setting and validated in another continent. This is arguably the most rigorous test that can be applied to a prediction tool. We emphasize that our study was not about recognition of current depression, nor was it about a search for new risk factors; these are well known. Nor was it about developing a prognostic tool for outcome of depression, which has been achieved recently.42 Our aim was to determine the key factors in a valid clinical prediction algorithm. Five risk factors are immutable (age, sex, educational level achieved, results of lifetime screen for depression, and family history of depression) and 4 are mutable factors relating to current status (Short Form 12 physical health and mental health subscale scores, unsupported difficulties in paid and/or unpaid work, and experiences of discrimination). The C index provides a standardized way of comparing the discriminative power of tests that use different measurement units in different settings.43 The predictD risk score compares favorably with a risk index for cardiovascular events developed in 12 European cohorts44 that reported C indexes between 0.71 and 0.82.
Our calculation of a shrinkage factor provides a measure of overfitting in the European data and allows for its adjustment in predicting risk of depression in new settings. External validation and shrinkage for overfitting are often not undertaken.45,46 When the algorithm is applied in a country outside of the 6 participating European countries, we recommend that either the average country coefficient be used (Figure 2) or the coefficient for the European country that most closely matches the annual incidence of depression (if known) in the new setting.
Despite the advantages of a cross-national study and an external population in which to validate the risk algorithm, there are limitations to our study. Lower recruitment rates occurred in the United Kingdom and the Netherlands, possibly because the study was not so obviously endorsed by physicians. However, response to follow-up in all countries was high. There were differences in the geographical distribution of general practices in each country, which reflected the varying networks available to the centers. Follow-up was relatively short but in keeping with what would be acceptable for prediction of depression in general practice. People from nonwhite ethnic minorities were relatively underrepresented. Although our risk factors are based on self-report, we used standardized instruments, and nonstandardized questions were tested for reliability. Our data imputation retained power and reduced bias. Although 24% of European participants had missing data on at least 1 risk factor, as we reported, missing data were less than 3.0% on 38 of the 39 factors. Finally, we stress that our study did not aim to provide insights into pathways to depression. Rather, we aimed to develop a predictive tool for the detection of DSM-IV major depressive disorders prior to onset. Such an instrument could then be used for prevention of depression in a manner similar to an existing instrument used in cardiovascular prevention in family practice settings.14 Some of our risk factors in the predictD algorithm may be mediators on the pathway to depression. For example, childhood experiences of emotional abuse may make depression at an early age more likely, but once it has occurred, this will show up most parsimoniously in the algorithm as lifetime history.
Our study does not address how the risk algorithm for depression might best be implemented in general practice. However, the questions making up the algorithm are brief and easy to complete, and thus it has potential as a clinical tool for prediction of future episodes of depression in this setting (http://www.techflora.com/ucl). Our results expressed by the C index and effect sizes demonstrate a clear difference in risk between participants who became depressed and those who did not do so. In suggesting useful thresholds of sensitivity and specificity (Table 5), we have erred on the side of maximizing specificity at the cost of reduced sensitivity to minimize the workload for family physicians engaging with false positives. We would recommend setting specificity at 80% to 85% (risk score, ≥10.6%) to contain the workload of the physician, albeit at the cost of missing a proportion of future major depressive episodes.
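The trade-off between sensitivity and specificity at a chosen risk-score threshold can be sketched as follows. The threshold of 10.6% is the value mentioned in the text; the 8 toy observations and the function name are hypothetical and do not reproduce Table 5.

```python
def sensitivity_specificity(probs, outcomes, threshold):
    """Sensitivity and specificity when attendees at or above the threshold are flagged."""
    tp = sum(1 for p, y in zip(probs, outcomes) if y == 1 and p >= threshold)
    fn = sum(1 for p, y in zip(probs, outcomes) if y == 1 and p < threshold)
    tn = sum(1 for p, y in zip(probs, outcomes) if y == 0 and p < threshold)
    fp = sum(1 for p, y in zip(probs, outcomes) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical predicted probabilities and outcomes (1 = became depressed)
probs = [0.02, 0.05, 0.08, 0.12, 0.15, 0.30, 0.04, 0.20]
outcomes = [0, 0, 1, 0, 1, 1, 0, 0]

sens, spec = sensitivity_specificity(probs, outcomes, threshold=0.106)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Raising the threshold flags fewer attendees, which increases specificity and reduces physician workload but lowers sensitivity.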
Patients identified as being at risk on screening can be flagged on practice computers to alert physicians when they consult. Recognition of those at risk may be helpful when it leads to watchful waiting or active support, such as restarting treatment in patients with a history of depression. Advising patients on the nature of depression or on brief cognitive behavior strategies they might undertake to reduce their risk could also be envisaged. The application of such strategies to the prevention of depression in primary care would benefit from further evaluation. Four of the 10 factors were open to intervention/change and the impact of such change is shown in Figure 3. Efforts to reduce the incidence of depression might usefully address these factors through a combination of physical, psychological, and medical interventions. However, this implies that the risk model has a causal interpretation, something that our study cannot demonstrate. It also does not mean that when immutable factors predominate in any particular individual there can be no recourse to prevention. The introduction of brief cognitive behavior skills might be a preventive strategy regardless of the risk factors implicated. The same is true for starting or restarting antidepressant medication use.
This risk algorithm for major depression compares favorably with risk algorithms for prediction of cardiovascular events and may be useful in prevention of depression in general medical settings.
Correspondence: Michael King, MD, PhD, Department of Mental Health Sciences, University College London Medical School, Royal Free Campus, Rowland Hill Street, London NW3 2PF, England (firstname.lastname@example.org).
Submitted for Publication: December 14, 2007; final revision received April 1, 2008; accepted May 12, 2008.
Author Contributions: Dr King had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Financial Disclosure: None reported.
Funding/Support: The research in Europe was funded by a grant from the European Commission, reference PREDICT-QL4-CT2002-00683. Funding in Chile was provided by project FONDEF DO2I-1140. Partial support in Europe was from the Estonian Scientific Foundation (grant 5696), the Slovenian Ministry for Research (grant 4369-1027), the Spanish Ministry of Health (grant field-initiated studies program references PI041980, PI041771, and PI042450), the Spanish Network of Primary Care Research (redIAPP) (ISCIII-RETIC RD06/0018), and SAMSERAP group. The UK National Health Service Research and Development office provided service support costs in the United Kingdom.
Disclaimer: The funders had no direct role in the design or conduct of the study, interpretation of the data, or review of the manuscript.
Additional Contributions: The European Office at University College London provided administrative assistance at the coordinating centre and Kevin McCarthy, project scientific officer, European Commission, Brussels, Belgium, provided helpful support and guidance. We thank all patients and general practice staff who took part; the UK Medical Research Council General Practice Research Framework (MRC GPRF); Louise Letley, MSc, from the MRC GPRF; the general practitioners of the Utrecht General Practitioners' Network; and the Camden and Islington Mental Health and Social Care Trust.