Study design and randomization groups. Group 1, responders after the 12-week trial who were immediately switched to placebo; group 2, responders after the 12-week trial who continued to receive fluoxetine hydrochloride therapy for an additional 14 weeks and then were switched to placebo; group 3, responders after the 12-week trial who continued to receive fluoxetine therapy for an additional 38 weeks and then were switched to placebo; and group 4, responders after the 12-week trial who continued to receive fluoxetine therapy for an additional 50 weeks.
Kaplan-Meier survival curves comparing survival from weeks 12 to 26 among patients with a "true drug" short-term response pattern; the patients were treated with fluoxetine hydrochloride vs placebo (log-rank test score, 22.37; P<.001). Vertical bars indicate points that individuals drop out before reaching relapse criteria.
Kaplan-Meier survival curves comparing survival from weeks 26 to 50 among patients with a "true drug" short-term response pattern; the patients were treated with fluoxetine hydrochloride vs placebo (log-rank test score, 8.23; df=1; P<.005). Vertical bars indicate points that individuals drop out before reaching relapse criteria.
Kaplan-Meier survival curves comparing survival from weeks 50 to 62 among patients with a "true drug" short-term response pattern; the patients were treated with fluoxetine hydrochloride vs placebo (log-rank test score, 0.71; df=1; P<.41). Vertical bars indicate points that individuals drop out before reaching relapse criteria.
Kaplan-Meier survival curves comparing survival from weeks 12 to 26 among patients with a "placebo" short-term response pattern; the patients were treated with fluoxetine hydrochloride vs placebo (log-rank test score, 1.71; df=1; P<.20). Vertical bars indicate points that individuals drop out before reaching relapse criteria.
Kaplan-Meier survival curves comparing survival from weeks 26 to 50 among patients with a "placebo" short-term response pattern; the patients were treated with fluoxetine hydrochloride vs placebo. The Kaplan-Meier log-rank test was invalidated by an excess of dropouts in the placebo-treated group. Vertical bars indicate points that individuals drop out before reaching relapse criteria.
Kaplan-Meier survival curves comparing survival from weeks 50 to 62 among patients with a "placebo" short-term response pattern; the patients were treated with fluoxetine hydrochloride vs placebo (log-rank test score, 0.19; df=1; P<.67). Vertical bars indicate points that individuals drop out before reaching relapse criteria.
Kaplan-Meier survival curves of patients treated with fluoxetine hydrochloride from weeks 12 to 26 comparing survival of those with a "true drug" vs a "placebo" short-term response pattern (log-rank test score, 6.75; df=1; P<.01). Vertical bars indicate points that individuals drop out before reaching relapse criteria.
Kaplan-Meier survival curves of patients treated with fluoxetine hydrochloride from weeks 26 to 50 comparing survival of those with a "true drug" vs a "placebo" short-term response pattern (log-rank test score, 2.74; df=1; P<.10). Vertical bars indicate points that individuals drop out before reaching relapse criteria.
Kaplan-Meier survival curves of patients treated with fluoxetine hydrochloride from weeks 50 to 62 comparing survival of those with a "true drug" vs a "placebo" short-term response pattern (log-rank test score, 0.86; df=1; P<.36). Vertical bars indicate points that individuals drop out before reaching relapse criteria.
Stewart JW, Quitkin FM, McGrath PJ, Amsterdam J, Fava M, Fawcett J, Reimherr F, Rosenbaum J, Beasley C, Roback P. Use of Pattern Analysis to Predict Differential Relapse of Remitted Patients With Major Depression During 1 Year of Treatment With Fluoxetine or Placebo. Arch Gen Psychiatry. 1998;55(4):334-343. doi:10.1001/archpsyc.55.4.334
Copyright 1998 American Medical Association. All Rights Reserved. Applicable FARS/DFARS Restrictions Apply to Government Use.
Delayed and persistent ("true drug") improvement characterizes the response to antidepressant medication. Early or nonpersistent ("placebo") benefit is typical of a placebo response. The prediction was that patients with a true drug response would sustain their benefit best if they continued to receive the drug and that patients with a placebo response would have an equivalent prognosis whether they continued to receive the drug or were switched to placebo.
Patients with major depression who met the study's response criteria (a modified Hamilton Depression Rating Scale score ≤7 and failure to meet major depression criteria after each of the last 3 weeks following 12 to 14 weeks of treatment with fluoxetine hydrochloride, 20 mg/d) were enrolled in a 50-week randomized placebo substitution trial during which the return of depressive symptoms defined relapse. The timing and persistency of response during initial treatment defined true drug or placebo response patterns.
Patients with a true drug response pattern relapsed significantly more frequently if they were switched to placebo than if they continued to receive fluoxetine during weeks 12 to 26 (P<.001) and weeks 26 to 50 (P<.005), but not during weeks 50 to 62 (P<.41). Patients with a placebo response pattern had an equivalent outcome whether maintained on fluoxetine therapy or placebo (P<.20 for weeks 12-26, test invalid for weeks 26-50, and P<.67 for weeks 50-62). Among patients who continued to receive fluoxetine, those with a placebo response pattern relapsed more often than those with a true drug response pattern during weeks 12 to 26 (P<.01); the differences for weeks 26 to 50 (P<.10) and weeks 50 to 62 (P<.36) were not significant.
These findings confirm that pattern analysis validly differentiates true drug from nonspecific initial responses and extend its use to the continuation and maintenance phases of treatment for depression. Investigations into the mechanisms of antidepressant activity might best be limited to those that can account for delayed efficacy. Fluoxetine's efficacy during the continuation and maintenance phases of treatment may be limited to patients with a true drug pattern of initial response.
Diverse timing1,2 and a plethora of consequences1-15 undoubtedly contribute to the difficulty in specifying the critical mechanisms of antidepressant effects.16 The success of investigations into these mechanisms would be improved if studies were restricted to biological processes having timing similar to that of efficacy. Identifying how antidepressants work might offer clues about the pathophysiological features of depressive illness and allow prediction of treatment outcome.
Treatment response is inherently heterogeneous, rendering problematic its use to validate proposed mechanisms. Typically, two thirds of patients with depression improve while taking active medication and one third remit while taking placebo.17 Subtracting the placebo response, about one third benefit from the specific effects of antidepressant medications and one third improve because of nonspecific effects. That is, of those who improve while taking medication, about half do so for reasons other than direct pharmacological effects. Studies usually compare all patients who improve while taking a drug (ie, a mixed group of those requiring medication plus nonspecific responders) with a control group or with nonresponders. Methods that differentiated a "true drug" response from a nonspecific response would aid research into the biological features of depressive disorders and decisions about whether to continue the administration of an apparently effective medication.
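The arithmetic behind these proportions can be written out explicitly. The sketch below uses only the approximate fractions quoted above (two thirds and one third), not study data, and the variable names are illustrative.

```python
from fractions import Fraction

# Approximate response rates quoted in the text (not study data).
drug_response = Fraction(2, 3)     # improve while taking active medication
placebo_response = Fraction(1, 3)  # remit while taking placebo

# Improvement attributable to specific pharmacological effects.
specific = drug_response - placebo_response

# Share of drug-treated improvers whose benefit is nonspecific.
nonspecific_share = placebo_response / drug_response

print(specific)           # 1/3
print(nonspecific_share)  # 1/2
```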
Pattern analysis18 makes this distinction. Several studies18-21 validate a delayed and persistent short-term response as characterizing patients who improve while receiving antidepressant medications, whereas early or nonpersistent responses are typical of patients responding to placebo. A demonstration that short-term response patterns alter subsequent prognosis would further validate pattern analysis. This article examines the predictive validity of pattern analysis in a 14-month study of fluoxetine.
To test the hypothesis that the timing and persistence of a short-term benefit will predict long-term outcome, we used data from a multisite study designed to determine the optimal duration of fluoxetine treatment. Five sites enrolled patients with major depression into an open, 12-week trial of fluoxetine; responders then received continued therapy for up to 1 year, during which three quarters were switched to placebo at 1 of 3 points (Figure 1). Each site followed an identical protocol.
All study subjects were outpatients aged between 18 and 65 years, were physically healthy, met the DSM-III-R22 criteria for major depression, and had a modified 17-item Hamilton Depression Rating Scale23 (mHAMD) score of 16 or more. The mHAMD substituted hypersomnia and hyperphagia for insomnia, anorexia, and weight loss items in patients with reverse vegetative symptoms.
Patients were excluded for unstable medical illnesses, pregnancy or lactation, serious suicidal impulses, history of psychosis, organic mental disorder, mania or antisocial personality disorder, substance abuse disorder in the past year, hypothyroidism, previous fluoxetine treatment for longer than 1 week in the current episode or longer than 3 months in a previous episode, or lack of a response to 8 weeks of treatment with fluoxetine hydrochloride, 20 mg/d.
Signed, informed consent was obtained from each patient and approved by each site's institutional review board. Patients were told that the Eli Lilly Corp, Indianapolis, Ind, sponsored the study to determine the long-term efficacy of fluoxetine and that if their condition initially improved, placebo might be substituted for fluoxetine at some time during the ensuing 11 months. If depressive symptoms returned during the double-blind phase, they were instructed to call immediately for an evaluation. They were told that treatment would continue after the end of the study, including the reinstitution of fluoxetine therapy if appropriate.
Patients whose symptoms persisted during a drug-free observation week were treated openly with fluoxetine hydrochloride, 20 mg/d, for 12 weeks. The mHAMD and Clinical Global Impression24 Global Severity scores were obtained at each visit (weekly for weeks 1-4 and 11-12 and biweekly for weeks 6-10). Remission was defined as an mHAMD score of 7 or less and failure to meet the DSM-III-R criteria for major depression after each of the last 3 weeks of the 12-week trial. For most patients, this was weeks 10 through 12. If the mHAMD score equaled 8 to 10 at week 10 or 11, they continued to receive fluoxetine 1 or 2 extra weeks, and the ratings for weeks 11 through 13 or 12 through 14 determined whether the remission criteria were met.
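As an illustration, the remission rule can be expressed as a check over the last 3 ratings. This is a minimal sketch, not the study's software: the function name and the list-based inputs are assumptions, and the 1- or 2-week extension is handled simply by passing whichever ratings ended the trial.

```python
def met_remission(mhamd_scores, meets_mdd_flags):
    """Return True if each of the last 3 ratings shows an mHAMD score
    of 7 or less and no major depression by DSM-III-R criteria.

    mhamd_scores: mHAMD scores in visit order (hypothetical input format).
    meets_mdd_flags: True where major depression criteria were met.
    """
    if len(mhamd_scores) != len(meets_mdd_flags):
        raise ValueError("inputs must have the same length")
    if len(mhamd_scores) < 3:
        raise ValueError("at least 3 ratings are required")
    last_three = zip(mhamd_scores[-3:], meets_mdd_flags[-3:])
    return all(score <= 7 and not mdd for score, mdd in last_three)


# Ratings at weeks 8, 10, 11, and 12 for two hypothetical patients.
print(met_remission([12, 7, 6, 5], [True, False, False, False]))  # True
print(met_remission([12, 8, 6, 5], [True, False, False, False]))  # False
```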
Prior studies18-21 derived response patterns from the Clinical Global Impression Global Improvement score, which was not obtained in this trial. We substituted the Clinical Global Impression Global Severity score. A score of 1 ("no psychopathology") or 2 ("minimal psychopathology") identified an improved week; other scores (ie, 3 ["mild psychopathology"] to 7 ["extreme psychopathology"]) defined unimproved weeks.
Improvement was early if week 1 or 2 was rated improved. Late improvement occurred if the first improved week was after week 2. Persistent response occurred if an improved week was never followed by an unimproved week. The pattern was nonpersistent if an unimproved week followed an improved week.
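These rules amount to a small classifier. The sketch below assumes ratings are supplied as a mapping from week number to Clinical Global Impression Global Severity score and that weeks without a visit are simply absent; the function names and input format are illustrative, not taken from the study. A delayed and persistent pattern is labeled "true drug," and an early or nonpersistent pattern is labeled "placebo."

```python
def is_improved(cgi_severity):
    """A week counts as improved when the severity score is 1 or 2."""
    return cgi_severity in (1, 2)


def classify_pattern(weekly_cgi):
    """Classify a responder's short-term response pattern.

    weekly_cgi: dict mapping week number -> CGI severity score (1 to 7).
    Assumption: only rated weeks are present in the mapping.
    """
    weeks = sorted(weekly_cgi)
    improved = [is_improved(weekly_cgi[w]) for w in weeks]
    if not any(improved):
        raise ValueError("no improved week; the pattern is undefined")

    first = improved.index(True)
    early = weeks[first] <= 2            # improved at week 1 or 2
    persistent = all(improved[first:])   # never followed by an unimproved week

    return "true drug" if (not early and persistent) else "placebo"


late_persistent = {1: 4, 2: 4, 3: 3, 4: 2, 6: 2, 8: 1, 10: 1, 11: 1, 12: 1}
early_onset = {1: 4, 2: 2, 3: 2, 4: 2, 6: 2, 8: 1, 10: 1, 11: 1, 12: 1}
nonpersistent = {1: 4, 2: 4, 3: 2, 4: 3, 6: 2, 8: 2, 10: 1, 11: 1, 12: 1}

print(classify_pattern(late_persistent))  # true drug
print(classify_pattern(early_onset))      # placebo
print(classify_pattern(nonpersistent))    # placebo
```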
Responders were randomly assigned within site by a computer-generated code to 1 of 4 groups: 1, immediate placebo substitution; 2, continuation of fluoxetine therapy for 14 weeks followed by placebo substitution; 3, continuation of fluoxetine therapy for 38 weeks followed by placebo substitution; or 4, continuation of fluoxetine therapy for 50 weeks.
Opaque envelopes contained each patient's treatment code, to be opened only in an emergency. Otherwise, the treatment code was known only to Eli Lilly Corp staff not involved with clinical decisions. Placebo substitution was immediate, without tapering.
After randomization, treatment was double-blind and patients were seen weekly for 2 weeks, then biweekly for 16 weeks, and then monthly. Extra visits were scheduled if depressive symptoms returned. Relapse was defined as either (1) an mHAMD score of 14 or more for 3 consecutive weeks or (2) having met the DSM-III-R criteria for major depression for 2 consecutive weeks.
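The two relapse criteria can be sketched as a search for runs of consecutive ratings. The code assumes one rating per week in chronological order, which is a simplification of the actual visit schedule; the function names are hypothetical.

```python
def longest_run(flags):
    """Length of the longest run of consecutive True values."""
    best = run = 0
    for flag in flags:
        run = run + 1 if flag else 0
        best = max(best, run)
    return best


def has_relapsed(mhamd_scores, meets_mdd_flags):
    """Apply the relapse definition to consecutive weekly ratings.

    Criterion 1: mHAMD score of 14 or more for 3 consecutive weeks.
    Criterion 2: major depression criteria met for 2 consecutive weeks.
    """
    high_scores = [score >= 14 for score in mhamd_scores]
    return longest_run(high_scores) >= 3 or longest_run(meets_mdd_flags) >= 2


print(has_relapsed([6, 15, 16, 14], [False, False, False, False]))  # True
print(has_relapsed([6, 15, 9, 14], [False, True, False, True]))     # False
print(has_relapsed([6, 10, 12, 12], [False, True, True, False]))    # True
```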
Three hypotheses were studied:
Patients with a true drug response pattern will relapse more often while taking placebo than while taking continued medication.
The relapse rates of patients with a placebo pattern of response will not depend on treatment.
While continuing to receive fluoxetine, patients with placebo response patterns will relapse more frequently than patients with true drug response patterns.
First, randomization groups were compared for distribution of pattern type. Next, each pair of groups to be compared in the analyses was tested for site, demographic, and illness differences. Sites were compared for different rates of relapse and pattern type. Continuous variables were tested by the t test, and categorical variables were tested by the χ2 test with Yates correction in 2×2 contingency tables. Because 117 comparisons were performed (13 variables times 3 hypotheses each tested at 3 points), we considered P<.001 to identify nonrandom differences.
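The comparison count can be checked directly. The Bonferroni bound below is shown only for reference and is an assumption of this sketch; the text states the P<.001 threshold without saying how it was derived.

```python
variables = 13    # site, demographic, and illness variables
hypotheses = 3
time_points = 3   # weeks 12, 26, and 50

comparisons = variables * hypotheses * time_points
bonferroni = 0.05 / comparisons  # reference value only, not from the source

print(comparisons)           # 117
print(round(bonferroni, 5))  # 0.00043
```

The P<.001 threshold used here is therefore somewhat more lenient than a strict Bonferroni correction would be.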
The primary analysis for each hypothesis tested whether relapse during weeks 12 to 26 differed between comparison groups. The log-rank test assessed Kaplan-Meier survival curves for differences in time to relapse between groups.
Secondary analyses included similar Kaplan-Meier analyses for weeks 26 to 50 and 50 to 62.
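For readers unfamiliar with these methods, the following is a minimal standard-library sketch of a Kaplan-Meier estimate and a 2-group log-rank test with 1 df. It is not the software used in the study; the function names and the toy data are hypothetical, and dropouts are represented as censored observations (event flag False).

```python
import math


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct relapse time.

    times: follow-up time per patient; events: True for relapse,
    False for a censored observation (eg, dropout).
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        relapses = leaving = 0
        while i < len(data) and data[i][0] == t:
            relapses += int(data[i][1])
            leaving += 1
            i += 1
        if relapses:
            survival *= 1 - relapses / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def chi2_1df_pvalue(chi2):
    """Upper-tail probability of a chi-square statistic with 1 df."""
    return math.erfc(math.sqrt(chi2 / 2))


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square, P value)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if e and x == t)
        d_b = sum(1 for x, e in zip(times_b, events_b) if e and x == t)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("variance is zero; the test is undefined")
    chi2 = observed_minus_expected ** 2 / variance
    return chi2, chi2_1df_pvalue(chi2)


# Toy data (weeks to relapse or censoring), not study data.
curve = kaplan_meier([2, 4, 4, 6, 8], [True, True, False, True, False])
chi2, p = logrank_test([1, 2, 3], [True] * 3, [4, 5, 6], [True] * 3)
print(curve)
print(round(chi2, 2), p < 0.05)
```

As a consistency check on the reported figures, `chi2_1df_pvalue(8.23)` is about .004, which agrees with the P<.005 given for the weeks 26 to 50 comparison in Figure 3.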
Hypothesis 2, predicting no difference between treatments in patients with a placebo response pattern, asserts the null hypothesis, which cannot be directly demonstrated. Therefore, in addition to conventional significance testing, we also provide a power analysis to suggest the smallest relapse difference this sample size could be expected to detect. Only if there is the power to detect a difference too small to be clinically meaningful would it be reasonable to accept the hypothesis of equivalence.
All analyses were performed for the period immediately following each placebo substitution point (ie, 12, 26, and 50 weeks) until the next substitution point or end of the study. Analyses testing hypothesis 1 compared patients with the true drug pattern randomized to group 1 (placebo) (n=63) with those randomized to groups 2 through 4 (continued fluoxetine therapy) (n=191) for weeks 12 to 26 (primary analysis); those in group 2 (placebo) (n=41) with those in groups 3 and 4 (continued fluoxetine therapy) (n=69) for weeks 26 to 50; and those in group 3 (placebo) (n=26) with those in group 4 (continued fluoxetine therapy) (n=19) for weeks 50 to 62 (secondary analyses). For hypothesis 2, analyses compared patients with the placebo pattern randomized to group 1 (placebo) (n=32) with those randomized to groups 2 through 4 (continued fluoxetine therapy) (n=106) for weeks 12 to 26 (primary analysis); those in group 2 (placebo) (n=11) with those in group 3 or 4 (continued fluoxetine therapy) (n=37) for weeks 26 to 50; and those in group 3 (placebo) (n=8) with those in group 4 (continued fluoxetine therapy) (n=9) for weeks 50 to 62 (secondary analyses). Analyses for hypothesis 3 compared fluoxetine-treated patients with a true drug pattern with fluoxetine-treated patients with a placebo pattern. This was groups 2 through 4 (n=191 and n=106) for weeks 12 to 26 (primary analysis) and groups 3 and 4 (n=69 and n=37) for weeks 26 to 50 and group 4 (n=19 and n=9) for weeks 50 to 62 (secondary analyses) (the first n values are for patients with a true drug pattern of initial response, and the second n values are for patients with a placebo pattern of initial response).
Patients who did not complete the study period were censored after they left the study. All tests of hypotheses were 2 tailed and used an α=.05.
This report examines the use of short-term response patterns in predicting outcome. The relationship of outcome to fluoxetine blood levels has been reported,25 and other aspects of the study, including short-term treatment outcome, general outcome during continuation and maintenance therapy, and side effects, will be reported elsewhere. Of 839 patients who began to receive medication, 428 met the response criteria. Of these 428 patients, 33 refused treatment or were otherwise determined to be inappropriate candidates for treatment (eg, they had an intercurrent illness or they had moved) and 3 were missing a week 2 rating. Thus, patterns were available for 392 randomized patients who constitute the sample of this report. Table 1 provides the characteristics of these patients.
The distribution of patterns among the 4 randomization groups did not differ (Table 2). The 9 pairs of comparison groups did not differ in the distribution of response type or demographic or illness variables (Table 3). The response patterns differed by site (χ2=9.73, df=4, P<.05).
Relapse rates also differed among sites (Table 4). The effect of site on outcome was, therefore, assessed separately for each period, assuming a proportional hazard for time to relapse. The hazard for relapse was modeled as a function of pattern, treatment, and site, including all 2- and 3-factor interactions. The Cox proportional hazards analysis did not show a significant site-by-pattern-by-treatment interaction for any period (for the 12-week analysis, Wald test score=5.02, df=1, and P<.29; for the 26-week analysis, Wald test score=0.04, df=1, and P<.95; and for the 50-week analysis, Wald test score=0, df=1, and P≤.99). We also tested whether patients with placebo patterns had different outcomes depending on whether they had an early initial response or a late and nonpersistent initial response. The Kaplan-Meier survival analysis was not significant for any of the periods (log-rank test score=0.09, df=1, P<.77 for weeks 12-26; log-rank test score=0.14, df=1, P<.71 for weeks 26-50; and log-rank test score=1.21, df=1, P<.28 for weeks 50-62). Because sex, site, and placebo pattern subtype were not significant in the analyses described, subsequent analyses did not adjust for these factors.
Finally, we investigated whether patients dropped out of the study equally among comparison groups. Patients with a placebo response pattern who were receiving placebo during weeks 26 to 50 dropped out significantly more often than patients with the placebo response pattern who continued to receive fluoxetine, invalidating an assumption of the Kaplan-Meier analysis. The other 8 comparison groups did not differ in their likelihood of dropping out during the risk period (Table 5).
Table 6 and Figures 2 through 10 summarize the survival analyses.
Among patients with a true drug short-term response pattern, the results of the primary weeks 12 to 26 Kaplan-Meier survival analysis were significant (Figure 2). The weeks 26 to 50 Kaplan-Meier survival analysis (Figure 3) also demonstrated a significantly higher relapse in placebo-treated patients. The results of the weeks 50 to 62 Kaplan-Meier survival analysis were not significant (Figure 4), although the power was poor because of the small sample size.
Among patients with a placebo short-term response pattern, none of the primary or secondary Kaplan-Meier analyses demonstrated significantly greater relapse with the administration of placebo than with the administration of continued fluoxetine therapy (Figure 5, Figure 6, and Figure 7). However, the sample size was adequate to detect a ratio of relapse with the administration of continued fluoxetine therapy vs placebo of 2.7 in the weeks 12 to 26 analysis and 3.7 in the weeks 26 to 50 analysis. That is, the weeks 12 to 26 analysis had an 80% chance of finding relapse rates of 20% vs 54% to be statistically significant. The weeks 26 to 50 analysis would have required the rates to be at least as different as 20% and 74% to have had an 80% chance of a statistically significant finding. In addition, the high dropout rate in the placebo group available for the weeks 26 to 50 analysis invalidated the use of the Kaplan-Meier analysis.
The primary (weeks 12-26) analysis demonstrated significantly less relapse with the continued administration of fluoxetine therapy in patients with a true drug response pattern compared with those with a placebo response pattern (Figure 8). The Kaplan-Meier survival analyses for weeks 26 to 50 and 50 to 62 were not significant at P=.05 (Figure 9 and Figure 10).
Pattern analysis differentiates true drug response from placebo response during 6- to 12-week initial treatment trials.18-21 The results of this study demonstrate the predictive validity of pattern analysis for the continuation (the period from the end of short-term treatment until 6 months) and maintenance (beyond 6 months) phases of treatment. Hypotheses were largely confirmed because the continued administration of medication was demonstrated to be effective in patients with a true drug short-term treatment response pattern and not in patients with other short-term response patterns. Patients with a placebo short-term response pattern had a similar prognosis whether they continued to receive fluoxetine therapy or not, although the sample size allowed at best modest power to detect differences. In contrast, following the primary switch point and 1 of the 2 secondary switch points, patients who had a true drug short-term response were significantly more likely to relapse while receiving placebo than while continuing to receive fluoxetine (P<.001 and P<.005, respectively); the difference after the final switch point was not significant (P<.41). After the primary 12-week switch point, fluoxetine less effectively prevented relapse in patients who had a placebo short-term response than in those with a true drug short-term response (P<.01). Thus, all 3 primary analyses, as well as 1 secondary analysis, confirmed that pattern analysis differentiates nonspecific benefit from improvement attributable to specific drug effects.
Because studies validating pattern analysis have included tricyclic antidepressants,18,19 monoamine oxidase inhibitors,18,19 and selective serotonin reuptake inhibitors,20,21 this technique may apply generally to antidepressant medications. Pattern analysis might be applicable to other treatments characterized by the delayed onset of persistent effects, such as lithium carbonate, antipsychotic medications, anticonvulsants, and electroconvulsive therapy. Protein synthesis represents a candidate mechanism for delayed action of treatments because its initiation takes days to weeks, while immediate effects on reuptake or receptors would be candidates if they initiated a cascade of other effects over time.
These data suggest that patients with a placebo pattern of short-term response present the clinician with a dilemma. Regardless of whether fluoxetine therapy is continued, a seemingly high likelihood of relapse can be expected. If fluoxetine therapy was discontinued in a patient with a placebo response pattern, the recurrence of depressive symptoms would lead to the assumption that the medication had been responsible for the initial improvement. These results suggest that this conclusion might be inaccurate, but it likely would be reached anyway. The discontinuation of an apparently effective agent is, therefore, problematic. Continuation would also be questionable. Although these results call into question the prophylactic efficacy of fluoxetine in patients with a placebo pattern of short-term response, as no treatment has demonstrated efficacy in patients whose short-term response could be considered to be nonspecific, we have no recommendations to make for the patient with a placebo pattern response.
Loss of apparent benefit has clinical and heuristic implications. Some dramatically improved patients later relapse despite the continuation of treatment. Others relapse when the drug is withdrawn, then do not improve when the medication is reinstituted. Pattern analysis suggests that some relapse while continuing to receive medication represents loss of a placebo effect rather than loss of a true drug effect. In this study, for example, of patients relapsing while continuing to receive fluoxetine therapy, almost half (48%) had a nonspecific response pattern. Similarly, some failure to improve on second exposure to the same medication might be due to loss of a placebo effect, rather than acclimatization to pharmacological effects. Attempts to understand the treatment course of individual patients must consider the possibility that initial response was not due to direct medication effects.
Unequal assignment among randomization groups could produce spurious results. For example, if men and women relapsed at different rates and were differently distributed among groups, ignoring sex effects in the analyses could yield results that seem to implicate other factors. Indeed, patterns and relapse were differentially distributed among treatment sites. However, the Cox proportional hazards analysis did not reveal significant site-by-pattern-by-treatment interaction effects on time to relapse (P<.29 for weeks 12-26, P<.95 for weeks 26-50, and P<.99 for weeks 50-62). Other demographic and illness variables were not differentially distributed among response type, site, or comparison groups.
Carryover and withdrawal effects are a concern in any discontinuation trial. Fluoxetine is not associated with withdrawal symptoms. However, the relatively long half-life of fluoxetine and its active metabolite, norfluoxetine, might delay relapse in patients switched to placebo. The resulting underestimate of relapse in placebo-treated patients could result in failure to show between-group differences. This is one reason for caution in accepting failure to demonstrate differences in addressing hypothesis 2.
An additional problem with hypothesis 2 is that in predicting no difference, it asserts the null hypothesis. What can be demonstrated is that if a difference exists it is probably smaller than x. If x is small enough, it might be judged inconsequential. The patients with placebo response patterns in this study had an 80% chance of demonstrating between-treatment differences in the weeks 12 to 26 analysis on the order of 20% vs 54%. A relapse difference of 25% vs 50% might have been missed. The detectable differences in the weeks 26 to 50 and 50 to 62 analyses were even larger. Hypothesis 2 might best be considered not disconfirmed.
These results must be replicated before they should influence clinical practice. Although pattern analysis has been validated for short-term treatment with 3 classes of antidepressant medications, to our knowledge, this study is the first to assess its use beyond short-term treatment. The predictive validity of pattern analysis must be replicated for fluoxetine and extended to other selective serotonin reuptake inhibitor and non–selective serotonin reuptake inhibitor antidepressant medications.
In summary, after the 3- and 6-month switch points, pattern analysis prospectively identified one patient group that showed continued medication effects (ie, different relapse rates while receiving drug vs placebo) and another group that did not. Patients who were more likely to relapse while continuing to receive fluoxetine therapy were also identified. These and previous results suggest that delayed-persistent short-term responses may identify with 80% likelihood patients who had a true drug response, while early or nonpersistent initial responses select patients with a high proportion of nonspecific improvement. In attempting to identify the mechanism of action of antidepressants, limiting inquiries to mechanisms that are delayed and persistent and testing candidate mechanisms only in patients identified as having a true drug response will likely maximize the chances of demonstrating these mechanisms.
Accepted for publication July 30, 1997.
This study was supported by the Eli Lilly Corp, Indianapolis, Ind.
Reprints: Jonathan W. Stewart, MD, 1051 Riverside Dr, Unit 35, New York, NY 10032.