Summary correlations between duration of untreated psychosis (DUP) and outcomes by follow-up point. CI indicates confidence interval. The squares are roughly proportioned to the amount of data available.
Odds of no remission in the long vs short duration of untreated psychosis (DUP) groups. An odds ratio greater than 1 indicates that individuals in the long DUP group were more likely not to be in remission at the follow-up point. CI indicates confidence interval; PANSS, Positive and Negative Syndrome Scale; GAF, Global Assessment of Functioning; WHO, World Health Organization; and SAPS, Scale for the Assessment of Positive Symptoms. Squares indicate the size of the contribution of each study to the summary odds ratio (diamonds).
Marshall M, Lewis S, Lockwood A, Drake R, Jones P, Croudace T. Association Between Duration of Untreated Psychosis and Outcome in Cohorts of First-Episode Patients: A Systematic Review. Arch Gen Psychiatry. 2005;62(9):975-983. doi:10.1001/archpsyc.62.9.975
Copyright 2005 American Medical Association. All Rights Reserved. Applicable FARS/DFARS Restrictions Apply to Government Use.
Duration of untreated psychosis (DUP) is the time from manifestation of the first psychotic symptom to initiation of adequate treatment. It has been postulated that a longer DUP leads to a poorer prognosis. If so, outcome might be improved through earlier detection and treatment.
To establish whether DUP is associated with prognosis and to determine whether any association is explained by confounding with premorbid adjustment.
The CINAHL (Cumulative Index to Nursing and Allied Health), EMBASE, MEDLINE, and PsychLIT databases were searched from their inception dates to May 2004.
Eligible studies reported the relationship between DUP and outcome in prospective cohorts recruited during their first episode of psychosis. Twenty-six eligible studies involving 4490 participants were identified from 11 458 abstracts, each screened by 2 reviewers.
Data were extracted independently and were checked by double entry. Sensitivity analyses were conducted excluding studies that had follow-up rates of less than 80%, included affective psychoses, or did not use a standardized assessment of DUP.
Independent meta-analyses were conducted of correlational data and of data derived from comparisons of long and short DUP groups. Most data were correlational, and these showed a significant association between DUP and several outcomes at 6 and 12 months (including total symptoms, depression/anxiety, negative symptoms, overall functioning, positive symptoms, and social functioning). Long vs short DUP data showed an association between longer DUP and worse outcome at 6 months in terms of total symptoms, overall functioning, positive symptoms, and quality of life. Patients with a long DUP were significantly less likely to achieve remission. The observed association between DUP and outcome was not explained by premorbid adjustment.
There is convincing evidence of a modest association between DUP and outcome, which supports the case for clinical trials that examine the effect of reducing DUP.
Duration of untreated psychosis (DUP) is defined as the time from manifestation of the first psychotic symptom to initiation of adequate antipsychotic drug treatment. It is to be distinguished from duration of untreated illness, which has the same end point but begins with the emergence of the first symptom.1 It has been postulated that untreated psychosis has a toxic effect through some unknown neurologic or psychological mechanism so that patients with a longer DUP have a poorer prognosis.2 If this proposition is correct, then reducing DUP through earlier detection and treatment should improve outcome.3 The postulated benefit of reducing DUP has been 1 of several arguments used to justify the establishment of early intervention services in the United States, Canada, Australia, and several European countries.4 For example, in England, under the National Health Service Plan, 50 early intervention teams have been established at a cost of £70 million.5
However, despite the rush to establish early intervention services, the existence of an association between DUP and outcome, whether causal or not, remains to be convincingly demonstrated. For example, a recent nonsystematic review6 concluded that the evidence for an association was conflicting, another1 that the evidence was convincing only for positive symptoms, and a third7 that the evidence showed that DUP had a considerable effect across a range of outcomes. These differing conclusions reflect the complex nature of the available evidence, which includes retrospective studies from before and after the introduction of neuroleptic agents; several small, placebo-controlled trials of neuroleptic drug treatment or neuroleptic drug withdrawal; and follow-up studies.8
Follow-up studies of first-episode cohorts are, at present, the most satisfactory source of evidence for or against the postulated link between DUP and outcome, for 3 reasons. First, follow-up studies of first-episode cohorts are amenable to meta-analysis because they are fairly numerous and of similar design. Second, first-episode follow-up studies are likely to provide the best estimates of DUP because they collect this information at first presentation. Third, first-episode studies are not biased toward patients who experience multiple hospital admissions.9 The main aim of this review, therefore, is to apply systematic techniques of data ascertainment, quality assessment, data extraction, and synthesis to first-episode cohort studies to establish whether they showed evidence of an association between DUP and outcome.
It has been suggested that premorbid adjustment is the most likely confounder of any association between DUP and outcome on the grounds that people with poor premorbid adjustment are not only less likely to seek psychiatric care but also more likely to have a poor prognosis.10 The secondary aim of this review, therefore, is to determine the extent to which premorbid adjustment explains any identified association between DUP and outcome.
This review aimed to identify all prospective cohort studies available for review by May 2004 that had examined the relationship between DUP and outcome in patients with their first episode of psychosis. Unlike randomized controlled trials, first-episode cohort studies are not well indexed; therefore, a search strategy was generated empirically by examining the indexing of potentially eligible studies in the personal databases of the reviewers. This search strategy (available on the study Web site at http://www.lantern-centre.org.uk/dup), which aimed at sensitivity rather than specificity, was then applied to the following databases: CINAHL (Cumulative Index to Nursing and Allied Health) (January 1982 to May 2004), the Cochrane Schizophrenia Group Register (Issue 1, 2004), EMBASE (January 1980 to May 2004), MEDLINE (January 1966 to May 2004), and PsychLIT (January 1967 to May 2004). The sensitivity of the search was examined by scrutinizing the reference lists of the relevant studies and reviews detected by the search. The authors of included studies were contacted when further information was needed.
Only studies of first-episode cohorts were eligible. Studies were excluded if they were restricted to patients younger than 16 years or older than 60 years, had a follow-up rate of less than 50%, only correlated DUP with measures of brain structure or cognitive functioning, or reported duration of untreated illness rather than DUP. Each abstract was screened by 2 of 3 reviewers (M.M., A.L., and T.C.), and copies of potentially relevant articles were requested. The reference lists of all requested articles were scrutinized to ensure that no relevant studies had been overlooked. Requested articles were reviewed independently (by M.M. and A.L.), and a table of included studies was constructed. Any disagreements were resolved by discussion with a third reviewer (S.L.).
Direct measures of psychopathologic characteristics were selected as the primary outcome variables because of their presumed proximity to the core disease process in schizophrenia. These measures included positive symptoms, negative symptoms, symptoms of depression/anxiety, all symptoms (defined as the combined score for negative, positive, and neurotic symptoms), overall functioning (as defined by the composite level of functioning scores on the Global Assessment of Functioning scale,11 the Global Assessment Scale,12 or similar scales), and number achieving remission. Secondary outcome variables were time to remission, relapse (time to relapse and number relapsing), quality of life, and social functioning. These measures were considered secondary because they were more distant from the core disease process and hence were more likely to be affected by social or health service factors. “Disorganized” symptoms were also reported as a secondary outcome when they were reported separately from negative and positive symptoms. It was not possible to impose a common definition of remission; therefore, the definitions used in the original studies are reported.
Data on primary or secondary outcomes were excluded if they were collected using unpublished scales.13 Data were extracted independently by 2 raters (M.M. and A.L.) and were cross-checked by double entry into a customized database (Access; Microsoft Corp, Redmond, Wash). Any disagreements were resolved by discussion.
There are no widely agreed-on quality criteria for follow-up studies in general or for studies of DUP in particular. However, there were clear justifications for 4 quality criteria, which were recorded for each included study: restriction of participants to those with schizophrenia-like disorders on the basis of standardized diagnostic criteria because patients with affective psychoses are known to have a shorter DUP and a better prognosis1; outcome should be assessed blind to DUP status because such knowledge might bias raters’ assessments of outcome; achieving a follow-up rate of at least 80% (studies with follow-up rates <50% were not eligible); and use of a standardized method to determine DUP. The sensitivity analyses excluded studies that did not meet these criteria.
All data were entered into a computer program (Comprehensive Meta-Analysis; BioStat, Inc, Englewood, NJ), which was used to perform the computations.14 From the included studies we identified 4 types of data relating to DUP and outcome: correlations, mean differences between long and short DUP groups, number of events in long and short DUP groups, and time to events in long and short DUP groups. Correlational data were preferred to data based on mean differences when a study provided both types of data on the same outcome at the same time because there is no universal agreement on the cutoff point between “long” and “short” DUP and because dichotomizing a continuous variable reduces the resolution of the data.
Correlational data were synthesized into a single correlation coefficient (r) with 95% confidence intervals (CIs), calculated using the Fisher Z transformation.15 Correlation coefficients were combined irrespective of the method of calculation used in the original study (parametric, nonparametric, or parametric methods on log-transformed data). Because the distribution of DUP is invariably positively skewed, nonparametric or transformed data methods were the most appropriate methods,15 so a sensitivity analysis was performed that excluded parametric correlations based on untransformed data.
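The pooling step described above can be illustrated with a minimal sketch. It assumes fixed-effect, inverse-variance weighting of Fisher z values, with each study supplying a correlation and a sample size; the review itself used the Comprehensive Meta-Analysis program, so the function name `pool_correlations` and its input format are hypothetical and are not the authors' implementation.

```python
import math


def pool_correlations(studies):
    """Pool correlations with the Fisher z transformation (fixed effect).

    `studies` is a list of (r, n) pairs, where r is a study's correlation
    and n its sample size (n must be greater than 3).
    Returns (pooled_r, ci_low, ci_high) for a 95% confidence interval.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for r, n in studies:
        z = math.atanh(r)   # Fisher z = 0.5 * ln((1 + r) / (1 - r))
        weight = n - 3      # inverse of var(z) = 1 / (n - 3)
        weighted_sum += weight * z
        total_weight += weight

    z_bar = weighted_sum / total_weight
    se = 1.0 / math.sqrt(total_weight)
    z_low, z_high = z_bar - 1.96 * se, z_bar + 1.96 * se

    # Back-transform from the z scale to the correlation scale.
    return math.tanh(z_bar), math.tanh(z_low), math.tanh(z_high)
```

Working on the z scale keeps the sampling variance approximately independent of the true correlation, which is why the interval is computed there and only then converted back to r.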
Comparisons between the long and short DUP groups based on continuous data were synthesized by calculating standardized mean differences (Cohen d) with 95% CIs. There is no agreed-on cutoff point between long and short DUP, so a range of cutoff points was adopted across the included studies. Consequently, on the relevant forest plots, studies are given in descending order from the largest to the smallest cutoff point to permit a visual assessment of any trends related to choice of cutoff point.
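As an illustration of the standardized mean difference, the sketch below computes Cohen d for one long vs short DUP comparison from group means, standard deviations, and sample sizes. The large-sample variance formula used for the 95% CI is a common approximation and is an assumption here, as are the function and parameter names; the source does not state which variance estimator its software applied.

```python
import math


def cohen_d(mean_long, sd_long, n_long, mean_short, sd_short, n_short):
    """Standardized mean difference (Cohen d) with an approximate 95% CI.

    A positive d means the long DUP group scored higher than the short
    DUP group on the outcome scale.
    """
    # Pooled within-group variance.
    pooled_var = (
        (n_long - 1) * sd_long ** 2 + (n_short - 1) * sd_short ** 2
    ) / (n_long + n_short - 2)
    d = (mean_long - mean_short) / math.sqrt(pooled_var)

    # Large-sample approximation to the variance of d (assumed here).
    n_total = n_long + n_short
    var_d = n_total / (n_long * n_short) + d ** 2 / (2 * n_total)
    se = math.sqrt(var_d)
    return d, d - 1.96 * se, d + 1.96 * se
```

Whether a positive d corresponds to a worse outcome depends on the direction of the scale, so scores would need to be oriented consistently before pooling.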
Data on the number of events in the long and short DUP groups were synthesized by calculating odds ratios with 95% CIs using the fixed-effects method. Occasionally, the relationship between DUP and outcome was presented in terms of time to events (remission or relapse); these data are presented in the text because insufficient details were available to permit meta-analysis.
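A fixed-effects summary odds ratio can be sketched as follows. The text does not say which fixed-effects estimator was used, so this example assumes inverse-variance weighting of log odds ratios, and it adds 0.5 to every cell of a table that contains a zero (also an assumption). The function name and the table layout are hypothetical.

```python
import math


def pooled_odds_ratio(tables):
    """Fixed-effect (inverse-variance) summary odds ratio with a 95% CI.

    Each table is a tuple of counts:
    (not_remitted_long, remitted_long, not_remitted_short, remitted_short).
    An odds ratio above 1 means the long DUP group was more likely
    not to be in remission.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for a, b, c, d in tables:
        if 0 in (a, b, c, d):
            # Continuity correction for empty cells (assumed convention).
            a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
        log_or = math.log((a * d) / (b * c))
        var = 1 / a + 1 / b + 1 / c + 1 / d   # variance of the log odds ratio
        weighted_sum += log_or / var
        total_weight += 1 / var

    pooled_log = weighted_sum / total_weight
    se = math.sqrt(1 / total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - 1.96 * se),
        math.exp(pooled_log + 1.96 * se),
    )
```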
Correlational data are given in a forest plot in which the summary correlation coefficient for each outcome and the associated 95% CI are plotted against a horizontal axis ranging from –1 to 1 (Figure 1). These correlational data were adjusted so that in all cases a result falling to the right of the line of no effect (arising vertically from a correlation of 0) indicated an association between longer DUP and poorer outcome. Data from long vs short DUP group comparisons were plotted in a similar manner except that the horizontal axis represented the standardized mean difference (available on the study Web site at http://www.lantern-centre.org.uk/dup). Categorical data on remission were plotted on an axis representing the log of the odds ratio. These data are presented in summary form (by follow-up point) and by individual study (Figure 2). In this case, the line of no effect was represented by an odds ratio of 1, and results falling to the right of that line indicated that patients with longer DUP were less likely to achieve remission. All comparisons were subject to a test of heterogeneity to determine whether there was greater variation between the results of the studies contributing to that comparison than would be expected by chance. When heterogeneity was significant, the data were reanalyzed using a random-effects model. The presence of heterogeneity suggests systematic differences between studies that might be related to either the types of participants or the methods used.
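The heterogeneity test and the random-effects reanalysis can be illustrated with a short sketch. It assumes Cochran's Q as the heterogeneity statistic and the DerSimonian-Laird estimator for the between-study variance; the article names neither, so both are stand-ins rather than a description of the authors' software. Inputs are per-study effect sizes on an additive scale (for example, Fisher z or log odds ratio) with their variances, and the function name is hypothetical.

```python
import math


def random_effects(effects, variances):
    """Cochran's Q and a DerSimonian-Laird random-effects summary.

    `effects` and `variances` are equal-length lists of per-study
    estimates and their sampling variances.
    """
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Q is referred to a chi-square distribution with k - 1 degrees of freedom.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # Method-of-moments estimate of between-study variance, floored at 0.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return {"q": q, "df": df, "tau2": tau2, "pooled": pooled, "se": se}
```

When the estimated between-study variance is 0, the random-effects weights equal the fixed-effect weights and the two summaries coincide.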
To assess the effect of premorbid adjustment as a confounding variable, we identified all included studies that had examined the effect of premorbid adjustment on a statistically significant association between DUP and 1 of the included outcome variables. We then assessed the quality of these analyses according to 2 criteria. The first criterion was adjustment for multiple testing because the commonest scale for measuring premorbid adjustment provides 4 different summary scores (for childhood, early adolescence, late adolescence, and adulthood).16 The second criterion was that steps were taken to ensure that premorbid adjustment was assessed before onset of the psychotic phase of the disorder. These data are summarized in tabular form.
Of 11 458 abstracts (including poster presentations) identified by the search strategy, 619 referred to studies that were thought to potentially satisfy the inclusion criteria. After obtaining the full articles for these abstracts, 35 eligible studies were identified, which were described in a total of 172 publications (a study flow diagram is available on the study Web site at http://www.lantern-centre.org.uk/dup). However, 9 of these studies17-25 did not provide quantitative data on the primary or secondary outcomes, so the final sample consisted of 26 studies26-50 involving 4490 participants (Table) (the full reference list is available on request).
In the 26 studies providing data, the mean age of participants at presentation was 27.8 years, with women composing 39% of the sample. The mean DUP was 124 weeks, although this value decreased to 103 weeks after the exclusion of an extreme outlier.50 Twenty studies were restricted to participants with schizophrenia or schizophrenia-like disorders, 2 reported data separately for schizophrenia and all other psychoses, and 4 reported data for all psychoses only. Twelve studies reported the use of a systematic method to assess DUP.
Figure 1 displays summary correlations between DUP and primary or secondary outcomes at first presentation and at 6-, 12-, and 24-month follow-up. These data show a distinct temporal pattern in which correlations between DUP and outcome were small or nonsignificant at first presentation but became statistically significant for most outcomes by 6- and 12-month follow-up. Thus, at baseline, the only statistically significant correlations between DUP and outcome were for 1 primary outcome (depression/anxiety) and 1 secondary outcome (quality of life). However, by 6 months there were statistically significant correlations between DUP and all 5 primary outcomes and 1 secondary outcome (social functioning), and by 12 months there were statistically significant correlations between DUP and all outcomes for which data were available. In all cases, a longer DUP was associated with a worse outcome. By 24 months, the quantity of data was substantially reduced, being derived from only 2 studies39,47 and 232 patients. Nonetheless, there were still statistically significant correlations between longer DUP and worse outcome for overall functioning, positive symptoms, and quality of life but not for negative symptoms or social functioning.
Data based on comparisons between the long and short DUP groups are available on the study Web site at http://www.lantern-centre.org.uk/dup. These data, although based on smaller numbers of patients, showed a pattern similar to the correlational data. At first presentation there were statistically significant differences between the long and short DUP groups only on negative symptoms and quality of life. However, by 6 months there were statistically significant differences between the long and short DUP groups on all symptoms, overall functioning, positive symptoms, and quality of life (the long DUP group was worse in all cases) but not on depression/anxiety (for which data were limited to 19 patients) and negative symptoms. No data were available from long vs short DUP group comparisons at 12-month follow-up, but data were available at 24 months from 1 study46 and at 15 years from another study28 for 4 primary outcomes (depression/anxiety, negative symptoms, overall functioning, and positive symptoms). There were no statistically significant differences between the long and short DUP groups on any outcome at 24 months; however, at 15 years the long DUP group was significantly worse on depression/anxiety, overall functioning, and positive symptoms but not on negative symptoms.
A temporal pattern was also seen in the degree of heterogeneity between study estimates of effect size. Thus, at first presentation, statistically significant heterogeneity was present between studies that contributed correlational data on DUP and negative symptoms and on DUP and positive symptoms, but there was no significant heterogeneity at any other follow-up point. The presence of heterogeneity means that estimates of the strength of the correlation showed greater variation between studies than would be expected by chance, which implies that there were systematic differences in the study methods at first presentation. The implications of this finding are discussed in the “Comment” section.
Seven studies29,33,37,40,41,46,50 provided data on the number of patients in remission in the long and short DUP groups at 6-, 12-, 24-, and 269-month follow-up. Participants with long DUP were statistically significantly more likely not to achieve remission at all follow-up points (Figure 2). Tests of heterogeneity were not statistically significant despite variation in the definition of remission. Two studies35,49 provided data on length of DUP among participants in remission vs participants not in remission. These data showed that DUP was significantly longer in patients not in remission (n = 270; standardized difference, 0.517; 95% CI, 0.121-0.915; P = .01, heterogeneity not significant).35,49 Two studies32,43 provided data on time to remission, and both showed that it was longer among participants with long DUP. One study43 also showed that the likelihood of remission is reduced in patients with a DUP greater than 1 year, although risk of relapse is not increased.
Sixteen multiple regression analyses were identified (from 9 studies) that had examined the effect of adjusting for premorbid adjustment in the presence of a statistically significant association between DUP and 1 or more of the primary or secondary outcome variables (available on the study Web site at http://www.lantern-centre.org.uk/dup). In 4 of 16 analyses, a statistically significant association between DUP and outcome ceased to be significant after controlling for the effects of premorbid adjustment; however, 3 of these analyses were suboptimal according to the quality criteria given in the “Methods” section (2 failed to control for multiple testing, and all 3 failed to ensure that premorbid adjustment was assessed before onset of the disorder). In the remaining 12 analyses, the association between DUP and the outcome variable remained statistically significant despite, in most cases, not controlling for multiple testing. The relationship between DUP and positive symptoms seemed particularly robust, with 4 analyses failing to show any effect of premorbid adjustment.
Full details of the sensitivity analyses on the primary outcome variables at baseline and 6 and 12 months are available on the study Web site at http://www.lantern-centre.org.uk/dup. The sensitivity analyses excluded studies that included individuals with affective psychoses in their cohort, used the Pearson method without log transformation of the data (correlational data only), did not use a standardized method for assessing DUP, or had a follow-up rate of less than 80%. The sensitivity analyses did not substantially alter the findings for any of the main outcome variables. No sensitivity analysis was conducted excluding nonblinded studies because only 2 studies were blinded. The first blinded study39 reported statistically significant correlations between DUP and positive symptoms and quality of life at presentation and at 12- and 24-month follow-up but found no correlation with negative symptoms. The second blinded study43 found a statistically significant association between DUP and level of functioning, but a subsequent study, using hazard ratios, found that DUP was not a significant predictor of first relapse.
This systematic review demonstrates convincing evidence of a modest association between DUP and a broad range of outcomes and shows that the association is not obvious at first presentation (for outcomes other than quality of life) but rather emerges after treatment. The clearest evidence for the association was seen in the correlational data at 6- and 12-month follow-up, where only 2 of 15 comparisons did not reach statistical significance (with both negative comparisons being based on very small data sets). Evidence for the association was also seen in 4 of 6 comparisons at 6 months based on differences between the long and short DUP groups despite the smaller amount of data available for this type of analysis and the lower degree of resolution that it provides. The associations seen in these data were consistent with the review’s other finding that patients with longer DUP were less likely to achieve remission. Beyond 12 months, fewer data were available, and the evidence for a continuing association was less clear-cut, although there were suggestions that for some outcomes the association may endure for as long as 15 years after presentation. Whereas, on the basis of correlational data, the association between DUP and outcome was highly consistent, it was not particularly strong, accounting for at best 13% of the variance (in outcome for all symptoms at 6 months). On the other hand, long DUP seemed to account for approximately 1 in 3 to 1 in 4 of those who did not achieve remission (eg, number needed to treat at 6 months was 3.59; 95% CI, 2.55-6.42).
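The two effect-size statements above can be unpacked with simple arithmetic, shown here as a worked illustration rather than as figures reported in the article. It assumes that "variance accounted for" means the squared correlation and that the number needed to treat (NNT) is the reciprocal of the absolute risk difference (ARD) in nonremission between the long and short DUP groups.

```latex
% Variance explained of 13% corresponds to a correlation of about 0.36.
r^2 = 0.13 \quad\Longrightarrow\quad r = \sqrt{0.13} \approx 0.36

% NNT is the reciprocal of the absolute risk difference (ARD).
\mathrm{NNT} = \frac{1}{\mathrm{ARD}}
\quad\Longrightarrow\quad
\mathrm{ARD} = \frac{1}{3.59} \approx 0.28

% The CI limits for the NNT (2.55 to 6.42) invert to ARD limits.
\frac{1}{6.42} \approx 0.16 \;\le\; \mathrm{ARD} \;\le\; \frac{1}{2.55} \approx 0.39
```

An ARD of roughly 0.28 lies between 1 in 4 (0.25) and 1 in 3 (0.33), which is consistent with the "1 in 3 to 1 in 4" wording in the text.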
Despite the consistency of the association demonstrated by this meta-analysis, some of the included studies concluded that there was no association. In particular, 3 important US studies43,44,46 are sometimes cited as evidence of no association between DUP and outcome. Yet the results of all 3 studies are broadly consistent with the findings of this review. In the Iowa prospective study,44 a small sample size meant that although the correlations obtained for positive symptoms and for overall functioning were not significant within the study, their 95% CIs overlapped the estimate of the pooled correlation obtained by this review, and they might have also done so for negative symptoms if disorganized symptoms had not been reported separately. In the Hillside study,43 an initial report found a significant association between DUP and level of remission, but a subsequent study, using hazard ratios, found that DUP was not a significant predictor of first relapse. However, because of the large number of analyses conducted on the data set, this second study used 99% CIs to determine significance and as a result was probably somewhat underpowered. In the Suffolk County study,46 there were no significant differences between the long and short DUP groups at 24-month follow-up, but the study did find that fewer patients with long DUP were in remission at 24 months. Although this effect was not statistically significant within the study, the effect size is similar to that found at 24 months by the only other study37 that examined this outcome at the same follow-up point, and the cumulative results from the 2 studies were statistically significant (Figure 2).
Despite the consistency of its findings, this review has 2 key methodological limitations. The first is that only 2 studies used researchers who rated outcome blind to DUP status. Although both blinded studies found evidence of an association between outcome and DUP, we cannot exclude the possibility that in other studies the ratings of outcome may have been biased by raters’ previous knowledge of participants’ DUP. Any future follow-up studies of first-episode cohorts should ensure that they use raters who are blind to DUP status. The second limitation is that there were insufficient data to permit a formal analysis of publication bias for any individual outcome. However, the consistency of results across outcomes and methods of analysis and the inclusion of several large studies suggest that this is an unlikely explanation for the findings.
Two incidental findings of the review are of interest and might be related. First, in long vs short DUP group comparisons (see the study Web site at http://www.lantern-centre.org.uk/dup), there was no obvious relationship between the effect size and the cutoff point chosen to define the long and short DUP groups. Second, heterogeneity in effect size between studies was frequent at first presentation but absent at follow-up. These observations are compatible with a recently advanced hypothesis that the long-term harm caused by psychosis occurs principally in the first few months or even weeks after onset.27 This hypothesis would explain the first observation because only the choice of a cutoff point very close to the onset of psychosis would have any noticeable effect on the size of the difference in outcome between the long and short DUP groups. The hypothesis would also explain the second observation because it implies that people with short DUP, who tend to respond quickly to treatment, would also make the predominant contribution to any observed correlation between DUP and outcome. Hence, studies that performed their “baseline” assessment before treatment began would find no relationship between DUP and outcome, whereas those that delayed assessment until a few days after treatment commenced would find a substantial difference. The result would be substantial heterogeneity between studies at baseline, which would disappear at subsequent follow-up points, as observed in this review.
The presence of an association between DUP and outcome does not prove that untreated psychosis causes poor outcome. The association might be because outcome and DUP are correlated with a third variable. However, we found little evidence to support the hypothesis that this third variable is premorbid adjustment.
It is not possible to be certain how far, if at all, reducing DUP will improve outcome. However, it seems likely that efforts to directly manipulate DUP will substantially increase our understanding of the disease process in schizophrenia, even if they do not open up new therapeutic avenues. Research from Scandinavia has already demonstrated that a systematic program of early detection can shorten the DUP and lead to more patients receiving help at a less severe stage of their illness.51 The next challenge for early intervention services worldwide is to perform the large-scale clinical trials that will establish beyond a doubt whether shortening DUP improves prognosis.
Correspondence: Max Marshall, MD, The Lantern Centre, Vicarage Lane, Off Watling Street Road, Fulwood, Preston PR2 8DY, England (email@example.com).
Submitted for Publication: March 16, 2004; final revision received January 24, 2005; accepted January 31, 2005.
Funding/Support: This research was supported by a grant from the UK Department of Health Policy Research Program, London.
Acknowledgment: We thank Mark Fenton, MA, and Clive Adams, MD, of the Cochrane Schizophrenia Group for their help with the search strategy.