Model-based Brief Psychiatric Rating Scale (BPRS) change score mean and standard error estimates for the 3 different design types (placebo-controlled, low dose–controlled, or active-controlled) across dose ranges on baseline to end point improvement among 32 controlled trials of atypical antipsychotic medications for patients with acutely exacerbated schizophrenia. Analysis controls for effects of drug and baseline BPRS score. Significance values reflect individual contrasts within dose range.
Relationship between effects of design and dosing practice on improvement. Mean and standard error estimates reflect interaction between dosing practice (flexible vs fixed) and study design on baseline to end point improvement on the Brief Psychiatric Rating Scale (BPRS) among 21 placebo-controlled or active-controlled trials of atypical antipsychotic medications for patients with acutely exacerbated schizophrenia. Analysis controls for effects of drug and baseline BPRS score.
Woods SW, Gueorguieva RV, Baker CB, Makuch RW. Control Group Bias in Randomized Atypical Antipsychotic Medication Trials for Schizophrenia. Arch Gen Psychiatry. 2005;62(9):961-970. doi:10.1001/archpsyc.62.9.961
It has been suggested that the need for concurrent placebo control groups in new schizophrenia studies might be minimized by making comparisons with external placebo. This strategy requires an assumption of constancy, that the novel medication will perform the same way in a study with only active controls as it would have in a placebo-controlled trial.
To test this constancy assumption in active- and low dose–controlled trials vs placebo-controlled trials of atypical antipsychotic medications.
A comprehensive search of bibliographic databases, conference proceedings, and Food and Drug Administration databases through November 24, 2003.
English-language, randomized, double-blind clinical trials of newer atypical antipsychotic medications in otherwise unselected acutely ill adults with schizophrenia or schizoaffective disorder that reported baseline and intent-to-treat end point change values for the Brief Psychiatric Rating Scale.
Study number, sample size, reported dosage for each arm, trial duration, percentage of men, average age, trial arm treatment completion rate, baseline and within-arm end point change in the Brief Psychiatric Rating Scale, method of scoring the Brief Psychiatric Rating Scale, and date of publication were extracted from each study independently by 2 of us (S.W.W. and C.B.B.).
There were 32 studies comprising 66 risperidone, olanzapine, quetiapine, and ziprasidone arms including 7264 patients. Random-effects analysis revealed that in atypical antipsychotic medication arms, the degree of improvement in active-controlled trials was nearly double that seen with the same drugs and dosages in placebo-controlled studies. An effect of design was also observed in low dose–controlled studies vs placebo-controlled studies in ineffective and intermediate antipsychotic medication dose ranges.
The observed control group bias indicates that the constancy assumption does not hold in recent antipsychotic medication trials. These results suggest that caution is indicated when considering active- or low dose–controlled studies requiring comparisons with external placebo as alternatives to placebo-controlled trials for establishing efficacy of new medications for schizophrenia.
Does the degree of improvement with antipsychotic medication in a clinical trial differ depending on the control group that is chosen for the trial? This question arises as part of the ongoing ethical debate on the use of placebo controls in studies intending to establish efficacy of novel medications for schizophrenia.1- 8
Two randomized, double-blind alternatives to placebo controls, active controls and low-dose controls, have been increasingly discussed in recent years. In the active-controlled design, a therapeutic dose of a standard medication replaces placebo as the control group with which novel medication is compared. In turn, active-controlled studies can use 1 of 3 statistical tests—superiority, noninferiority, or equivalence7; superiority to an active control constitutes a very stringent test and is rarely used.6 A noninferiority trial hypothesizes that the new treatment is not worse than the standard minus a specified margin. Equivalence trials are similar to noninferiority trials but use a 2-sided test. We will refer to the latter 2 trial types together hereafter as “active-controlled” studies. Low dose–controlled studies using doses of the experimental medication believed too low to have an important effect provide another alternative to placebo. Low-dose controls, like placebo controls, have also been associated with ethical concerns.9,10 Discussion of active controls and low-dose controls as alternatives to placebo has involved 2 primary ethical themes.
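The difference between the noninferiority and equivalence criteria described above can be sketched in a few lines of code. This is an illustration only, not a procedure taken from any of the trials discussed: the function names, the sign convention (positive differences favor the new treatment), and the example interval are all assumptions, and the choice of confidence level for a 1-sided or 2-sided test is left to the caller.

```python
def is_noninferior(ci_lower: float, margin: float) -> bool:
    """One-sided criterion: the new treatment is not worse than the
    standard by more than `margin`.

    `ci_lower` is the lower confidence bound of the difference in
    improvement (new minus standard).
    """
    if margin <= 0:
        raise ValueError("margin must be positive")
    return ci_lower > -margin


def is_equivalent(ci_lower: float, ci_upper: float, margin: float) -> bool:
    """Two-sided criterion: the whole interval lies inside (-margin, +margin)."""
    if margin <= 0:
        raise ValueError("margin must be positive")
    return ci_lower > -margin and ci_upper < margin


# Hypothetical interval for the difference in improvement (new minus standard).
lower, upper, margin = -2.0, 5.5, 4.0
print(is_noninferior(lower, margin))        # True
print(is_equivalent(lower, upper, margin))  # False
```

In this hypothetical case the new treatment meets the noninferiority criterion but not the equivalence criterion, because the upper bound exceeds the margin. Neither result says anything about superiority to placebo, which is the gap the external placebo comparisons are meant to fill.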
The first ethical theme is what degree of risk, if any, a competent subject may reasonably be invited to accept to help research benefit future patients. Randomization to placebo or low doses requires inviting subjects to accept a risk of being assigned to a treatment condition that is known to be or believed to be less effective, on average, than established options. Differing ethical positions have been taken by professional and regulatory authorities on this issue.8,11- 15 Empirical data on risk during placebo phases are beginning to contribute to this discussion. Three reports6,16,17 have found no evidence of increased risk for suicide during placebo periods among recent clinical trials for schizophrenia. There are still few empirical data on the possibility of other lasting harm after placebo periods, especially in acutely ill patients. It is unfortunate that such data are not routinely collected from patients after trial participation.18 In initially stable patients, some data are available mostly from small studies of exacerbation after medication substitution by placebo.19,20
The second major ethical theme is whether future patients can benefit from knowledge obtained when an active control is substituted for placebo. A major criticism of active-controlled trials21,22 has been that they lack “assay sensitivity,” meaning that one cannot determine strictly from evidence internal to the study whether equivalent or noninferior treatments were both effective or both ineffective. To conclude that an equivalent or noninferior novel treatment in an active-controlled trial is effective, one must assume that the standard treatment would have been superior to placebo if placebo had been included.21 A recent meta-analysis has shown that therapeutic doses of atypical antipsychotic medications were nearly always superior to placebo for schizophrenia, even when accounting for unpublished trials.23 It is very likely, then, that the next time a new sample of patients from the same population is randomized to atypical antipsychotic medication or placebo, the atypical medication will prove superior to placebo. These results have been cited as increasing the scientific credibility of active controls in efficacy studies for schizophrenia.7
The present article addresses another issue that should also be considered in evaluating the scientific validity of studies using control group alternatives to placebo for schizophrenia. An active-controlled trial cannot be assumed to sample from the same patient population as the previous placebo-controlled trials. Thus, it is still possible that both the standard treatment and the novel treatment could be ineffective in the patient group actually studied. Moreover, when an equivalence or noninferiority margin is defined on clinical grounds, a novel treatment could fall within the margin but still not have been superior to placebo if placebo had been included. Because of these possibilities, some have suggested that an active-controlled trial should not only demonstrate equivalence or noninferiority to an appropriate standard but should also demonstrate that the novel treatment would have been superior to placebo if a placebo group had been included.24- 30 One proposed approach is to make statistical comparisons with appropriate external placebo data.24- 30 Recently, a second, related approach has been developed that seeks to define the equivalence or noninferiority margin using external data from trials that compared the standard with placebo.30- 33
Apart from uncertainties in determining how to appropriately select external placebo data,25,26,34- 36 which will not be addressed further here, these methods face another difficulty that is the subject of the present article. Both approaches for comparing a novel medication arm in an active- or low dose–controlled trial with external placebo require the assumption that a novel treatment would perform the same way in the active- or low dose–controlled trial as it would have in a placebo-controlled trial. This assumption has been termed the “constancy assumption.”30,32,37 When the constancy assumption is violated, analyses that depend on external placebo are difficult to interpret.30,32,37
The focus of the present article, therefore, is to determine whether the constancy assumption holds for studies of antipsychotic medications for schizophrenia, that is, whether antipsychotic medications for schizophrenia perform the same way in active- or low dose–controlled trials as they do in placebo-controlled studies. The present article assembles data from controlled trials of 4 atypical antipsychotic medications and compares improvement from baseline across study design while controlling for drug, dose, and baseline severity. Additional analyses also control for treatment arm completion, age, sex, duration of trial, dosing practice (flexible vs fixed), and year of publication.
We first made a comprehensive search for English-language, randomized, double-blind clinical trials of risperidone, olanzapine, quetiapine, ziprasidone, or aripiprazole in acutely ill adults with schizophrenia or schizoaffective disorder. Clozapine was not included because there have been no placebo-controlled trials. We searched through November 24, 2003, in MEDLINE, BIOSIS, PsycINFO, and Current Contents, citations in identified reports, published meta-analyses and reviews,38- 49 6 Cochrane reviews,50- 55 and Freedom of Information Act material available from the Food and Drug Administration.56 We also searched, by hand, abstracts from the 1997 to 2003 Annual Meetings of the American Psychiatric Association, the 1999, 2001, and 2003 International Congresses on Schizophrenia Research, and the 2000 and 2002 Biennial Winter Workshops on Schizophrenia. When data were missing from available reports, we queried the manufacturer or first author directly.
To be included, studies must have reported doses achieved as well as baseline and intent-to-treat end point change values for the Brief Psychiatric Rating Scale (BPRS).57 The BPRS change score rather than the response rate was used as the index of improvement because response rate definitions varied across studies.23 Some studies reported the Positive and Negative Syndrome Scale58 scores instead of the BPRS scores, but the BPRS/Positive and Negative Syndrome Scale score ratio on improvement in studies reporting both did not appear consistent enough to use Positive and Negative Syndrome Scale data to provide an estimated BPRS change score. Studies that specifically focused on treatment-resistant, previously responsive, early phase, anxious, or stable patients with schizophrenia, long-term extension, or relapse prevention were excluded because the samples differed from the unselected acutely ill samples of the other studies. The standard errors for the end point BPRS change scores for the combined Canadian59 and US60 subsamples of risperidone study 204 were calculated exactly from published data. The standard error of the change mean was not reported for risperidone study BEL-761 and for olanzapine study HGDD62 but could be estimated from the standard errors of the baseline mean and end point mean values, conservatively treating the baseline and end point values as independent. Aripiprazole could not be included in the analysis, despite there being 5 placebo-controlled studies,63- 65 because no low dose–controlled studies were available and, in both of 2 available active-controlled studies, patients were selected for documented histories of previous response.66
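The estimate of a missing change-score standard error from the baseline and end point standard errors can be sketched as follows. This is an assumed reconstruction of the arithmetic rather than the authors' code. It uses the general identity for the variance of a difference and sets the correlation to 0 for the independence case; the function name and the input values are hypothetical.

```python
import math


def se_of_change(se_baseline: float, se_endpoint: float,
                 correlation: float = 0.0) -> float:
    """Standard error of (end point mean - baseline mean).

    Var(change) = SE_b^2 + SE_e^2 - 2 * r * SE_b * SE_e.
    With r = 0 (independence) the cross term vanishes.
    """
    if not -1.0 <= correlation <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    variance = (se_baseline ** 2 + se_endpoint ** 2
                - 2.0 * correlation * se_baseline * se_endpoint)
    return math.sqrt(max(variance, 0.0))


# Hypothetical standard errors, not values from any study in the analysis.
independent = se_of_change(1.2, 1.5)
correlated = se_of_change(1.2, 1.5, correlation=0.5)
print(independent > correlated)  # True
```

Baseline and end point scores come from the same patients and would usually be positively correlated, so the independence assumption yields a larger standard error than the true one. That is the sense in which the estimate is conservative: the affected arms receive less weight than they otherwise would.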
One of us (S.W.W.) initially extracted all of the data, and each value was independently confirmed by another of us (C.B.B.). Occasional differences were resolved by consensus. Extracted data from each study included the following: study number, sample size, reported dosage for each arm, trial duration, percentage of men, average age, trial arm treatment completion rate, baseline and end point change in the BPRS score, the method of scoring the BPRS, and date of publication. We report the BPRS scores using 1- to 7-item scaling. When studies did not report the BPRS scaling method, we assumed 1- to 7-item scaling as in the original instrument. In flexible dose studies, when the doses achieved were reported in multiple formats, the end point mean dose achieved was used. If sex or age was not available for each arm in the trial, values for the entire trial were substituted. In the cases of duplicate or overlapping publications, the earliest date of publication was selected. Poster presentations were considered unpublished.
Each study was categorized into 1 of 3 design types: (1) placebo-controlled design, (2) low dose–controlled design, or (3) active-controlled design. Each risperidone, olanzapine, quetiapine, or ziprasidone arm within each study was categorized within 1 of 3 dose ranges: (1) ineffective doses like those used as low-dose controls, (2) effective doses that consistently separated from placebo,23,67 and (3) intermediate doses.
The dependent measure for improvement within atypical antipsychotic medication arms was the BPRS change score mean for each study arm. Only data for risperidone, olanzapine, quetiapine, and ziprasidone arms were included. The principal model contained study, baseline mean, drug, dose, and design variables as predictors of the BPRS change score mean. A random-effects model with study as a random effect68- 70 was used to account for correlations between treatment arms within studies. Other variables were considered as fixed predictors. Because of 2 empty cells in the dose-by-design matrix (Table 2; no ineffective or intermediate dose arms exist in active-controlled trials), it was necessary to create a combined categorical dose-design variable instead of considering main effects of dose, design, and dose-by-design interaction. The combined variable had 7 levels corresponding to the nonzero cells in Table 2. Analysis of the combined variable is hereafter referred to as “the omnibus analysis.” Post hoc contrasts then permitted examination of design effects within each level of dosage. The analysis was performed using SAS PROC MIXED98 with the standard errors of the change score means held fixed at their reported values and the between-study variance estimated together with the fixed effects. The data were also analyzed with a fixed-effects model for comparison purposes. To test other possible modifiers or mediators of the observed design effects, 6 additional covariates (mean age, percentage of men, duration of trial, flexible vs fixed dosing, treatment arm completion rate, and date of publication) were added to the model, 1 at a time. Sample size limitations prevented simultaneous evaluation of these variables or investigation of interactions among them. Lastly, additional random-effects models used baseline BPRS score mean, mean age, percentage of men, duration of trial, and treatment arm completion rate as the dependent measures.
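The construction of the combined dose-design variable can be illustrated with a short sketch. It is not the authors' SAS code. It only enumerates the 3 dose ranges by 3 designs, removes the 2 cells reported as empty, and groups the remaining levels by dose range to show which design contrasts are available; all identifiers are assumptions chosen for the example.

```python
from itertools import product

DOSE_RANGES = ("ineffective", "intermediate", "effective")
DESIGNS = ("placebo-controlled", "low dose-controlled", "active-controlled")

# The 2 cells the text reports as empty: no ineffective or intermediate
# dose arms exist in active-controlled trials.
EMPTY_CELLS = {
    ("ineffective", "active-controlled"),
    ("intermediate", "active-controlled"),
}


def dose_design_levels():
    """Return the populated (dose range, design) cells of the matrix."""
    return [cell for cell in product(DOSE_RANGES, DESIGNS)
            if cell not in EMPTY_CELLS]


def designs_by_dose(levels):
    """Group designs by dose range; contrasts are formed within each group."""
    grouped = {dose: [] for dose in DOSE_RANGES}
    for dose, design in levels:
        grouped[dose].append(design)
    return grouped


levels = dose_design_levels()
print(len(levels))  # 7
for dose, designs in designs_by_dose(levels).items():
    print(dose, designs)
```

Treating the 7 populated cells as levels of a single factor avoids estimating interaction terms for cells with no data, while still allowing post hoc contrasts between designs within a dose range. Only the effective dose range supports all 3 pairwise design contrasts.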
Table 1 shows the methodological detail for each study included. There were 32 studies56,59- 62,71- 97 comprising 66 risperidone, olanzapine, quetiapine, and ziprasidone arms including 7264 patients. Trial durations ranged from 4 to 30 weeks. Each of the 4 medications has at least 1 study representing each of the 3 design types. Each of the 3 dose ranges is also represented within each drug.
The omnibus test for the BPRS change score mean was significant (F6,27 = 15.9, P<.001). Design effect contrasts within each dose range are provided in Table 3. The observed BPRS change score mean at effective doses was significantly higher for active-controlled designs than for either placebo- or low dose–controlled designs (Figure 1). A significant effect of study design was also observed for low dose–controlled designs as compared with placebo-controlled designs at intermediate doses and ineffective doses but not at effective doses.
The potential confounding effect of the drug was controlled in the principal model. The interaction between drug and design was not significant.
The potential confounding effect of dose was also controlled in the principal model. Dosage had a significant effect on the BPRS change score mean (F2,27 = 37.1, P<.001) when averaging over design types. Improvement on each dose within each design is shown in Figure 1.
In analyses where the baseline BPRS score was the dependent measure, the only significant predictor was drug (F3,31 = 5.35, P = .004). Quetiapine was associated with a significantly higher BPRS score baseline in the model (mean, 59.4) than risperidone (mean, 52.6), olanzapine (mean, 52.8), and ziprasidone (mean, 52.2). The overall effect of design on the baseline BPRS score was not significant (F2,27 = 1.14, P = .34). The baseline BPRS score mean had a significant effect on the BPRS change score mean (F1,27 = 21.40, P<.001). Higher baselines were independently associated with greater improvement scores. The potential confounding effect of baseline on BPRS change score was controlled in the principal model.
All of the significant effects in the random-effects model were more highly significant in the fixed-effects model. However, according to Akaike’s Information Criterion and Schwarz’s Bayesian Criterion,98 the fixed-effects model did not provide as good a fit to the data as the random-effects model. The fixed-effects model is also not intuitively appropriate because it assumes that the change score means for different arms of the same study are uncorrelated.
In an omnibus test where completion rate was the dependent measure, completion varied significantly with design (F6,29 = 3.59, P = .009). Atypical antipsychotic medication trial arm completion rates controlling for drug and dose range were highest for active-controlled studies (84%), lower for low dose–controlled studies (66%), and lower yet for placebo-controlled studies (58%). When added to the principal model, the treatment arm completion rate was independently and significantly associated with the mean BPRS change score (F1,26 = 20.84, P<.001). The magnitude of the design effect on the mean BPRS change score was somewhat reduced for all of the significant contrasts when completion was included, but the 95% confidence intervals (CIs) generally continued to exclude 0 (Table 3).
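For readers unfamiliar with the two fit criteria, the following sketch shows their textbook definitions in the smaller-is-better form. The log-likelihoods and parameter counts are hypothetical placeholders and are NOT values from the present analysis. Statistical packages also differ in how they count parameters and observations, so the output of this sketch would not necessarily match SAS output.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike's Information Criterion: 2k - 2 ln L (smaller is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Schwarz's Bayesian Criterion: k ln(n) - 2 ln L (smaller is better)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical fits: the random-effects model spends 1 extra parameter
# (the between-study variance) and attains a higher log-likelihood.
n_arms = 66
fixed_fit = {"loglik": -180.0, "k": 10}
random_fit = {"loglik": -172.0, "k": 11}

for name, fit in (("fixed", fixed_fit), ("random", random_fit)):
    print(name,
          aic(fit["loglik"], fit["k"]),
          round(bic(fit["loglik"], fit["k"], n_arms), 1))
```

Both criteria penalize the extra variance parameter, so a random-effects model is preferred only when its gain in likelihood outweighs that penalty.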
Trial duration was not significantly associated with design and had no significant influence on improvement when added to the principal model. Inclusion of trial duration as a covariate had little effect on the design contrasts.
Sex was independently associated with design (F2,34 = 7.57, P = .002). There was a higher proportion of men in placebo-controlled studies (78%) than in low dose–controlled (73%) or active-controlled (64%) studies. However, when added to the principal model, sex had no significant influence on improvement. Inclusion of sex had little effect on the design contrasts.
Age was not significantly associated with design but was independently and significantly associated with the mean BPRS change score (F1,26 = 10.44, P = .003) when added to the principal model. Older average age was associated with lower average change scores. Adding age to the model did not affect the significance of any of the design contrasts.
Flexible dosing procedures were used in 21 of the 66 arms, and fixed dosing procedures were used in the remaining 45 arms (Table 1). Because of a strong association between design and dosing practice (P<.001, Fisher exact test), it was possible to investigate the independent effects of dosing and design only in effective dose arms and placebo-controlled designs vs active-controlled designs (Figure 2). Estimating the interactive effect between dosing and design in this model, the design effect was significant when dosing was fixed (mean difference, 10.26; 95% CI, 6.12-14.41). When dosing was flexible, the design effect was in the same direction but was not significant (mean difference, 3.81; 95% CI, −1.40 to 9.02). Dosing effects were significant among active-controlled trials (mean difference between fixed vs flexible dosing at effective dose, 4.36; 95% CI, 0.30-8.42). Dosing effects were not significant among placebo-controlled trials (mean difference, −2.10; 95% CI, −7.62 to 3.43).
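The reported intervals can be read back into approximate standard errors with a small amount of arithmetic, sketched below. The sketch assumes a symmetric interval and a normal multiplier of 1.96. The model-based intervals would have used a t distribution, so the recovered standard error is only approximate. The function names are hypothetical.

```python
def ci_summary(lower: float, upper: float, z: float = 1.96):
    """Return (midpoint, approximate standard error) of a symmetric 95% CI."""
    if upper < lower:
        raise ValueError("upper bound must not be below lower bound")
    midpoint = (lower + upper) / 2.0
    half_width = (upper - lower) / 2.0
    return midpoint, half_width / z


def excludes_zero(lower: float, upper: float) -> bool:
    """A contrast is significant at the CI's level when 0 lies outside it."""
    return lower > 0.0 or upper < 0.0


# Design effect under fixed dosing, as reported: 95% CI, 6.12-14.41.
midpoint, approx_se = ci_summary(6.12, 14.41)
print(midpoint, approx_se)
print(excludes_zero(6.12, 14.41))   # True
# Dosing effect among placebo-controlled trials: 95% CI, -7.62 to 3.43.
print(excludes_zero(-7.62, 3.43))   # False
```

The midpoint of a symmetric interval should agree with the reported mean difference up to rounding, which gives a quick consistency check on each contrast.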
Twenty-four of the 32 studies had been published. Years of publication were 1993 to 1999 for placebo-controlled studies, 1995 to 1998 for low dose–controlled studies, and 1993 to 2003 for active-controlled studies. Ziprasidone trials had to be excluded from this analysis because there were no published active-controlled studies. When the principal analysis was applied to the remaining 21 published studies, the design contrasts were similar to those shown in Table 3. When year of publication was added to the model, little change occurred.
The main finding of the current analyses is that in active-controlled trials, the degree of improvement from baseline in atypical antipsychotic medication arms was nearly double the improvement seen with the same drugs and doses in placebo-controlled studies (Figure 1; Table 3). By contrast, in low dose–controlled studies, no such effect of design was observed as compared with placebo-controlled studies at effective doses. In the intermediate and ineffective dose ranges, however, an effect of design in low dose–controlled studies vs placebo-controlled studies was again apparent. The significant design effects were independent from effects of drug, baseline severity, sex, age, trial duration, and year of publication. Similar design effects were observed on antipsychotic medication arm completion rate, as discussed later. The design effect on improvement was very strong among fixed dose studies and in the same direction, although weaker, among the smaller number of flexible dose studies.
We are not aware of previous demonstrations of a design effect on improvement with antipsychotic medications. For depression, 1 previous meta-analysis of randomized, double-blind trials99 had similar findings in outpatient studies. The response rate in active arms was 58.1% among studies without a placebo cell and 50.6% among studies with a placebo cell. It is not clear from the previous report, however, whether this difference was significant. No design effect was observed among inpatient studies for depression.
As noted earlier, most potential mechanisms that we were able to investigate did not appear to account for the observed design effect, including influences of baseline severity, sex, age, trial duration, and dosing practices.
A factor that did appear to explain some part of the design effect on improvement was trial arm completion. Since the design contrasts on improvement remained significant when controlling for completion rate, however, this mechanism does not fully account for the design effect. The pattern of results suggests that completion partly mediates100 the effect of design on improvement. Partial mediation would be suggested if part of the effect of design on improvement could be explained by design first influencing completion, which, in turn, affected improvement. Design could potentially influence completion rates via prescribing behavior. Prescribers may be more concerned about the possibility of patient nonresponse or worsening when conducting placebo-controlled trials and, consequently, could be more reluctant to wait for improvement and more likely to remove patients from the trial early. Our finding of a design effect on trial completion is consistent with a previous study that also reported a higher active arm dropout rate in placebo-controlled studies than in active-controlled antipsychotic medication studies.101 Completion rates could then influence improvement if the last observation carried forward method of imputing data missing after dropout is used. Patients who drop out of the study have less time to improve and leave a less improved last value to carry forward. The last observation carried forward method was consistently used in these antipsychotic medication studies.
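The way last observation carried forward can link completion to measured improvement is illustrated by the following sketch. The weekly scores are hypothetical and the helper names are assumptions. Improvement is expressed as baseline minus end point, so larger values mean greater improvement.

```python
from typing import Optional, Sequence


def locf_endpoint(scores: Sequence[Optional[float]]) -> float:
    """Return the last non-missing score; scores[0] is the baseline visit."""
    observed = [s for s in scores if s is not None]
    if not observed:
        raise ValueError("no observed scores")
    return observed[-1]


def locf_change(scores: Sequence[Optional[float]]) -> float:
    """Improvement from baseline to the carried-forward end point."""
    if not scores or scores[0] is None:
        raise ValueError("baseline score is required")
    return scores[0] - locf_endpoint(scores)


# Hypothetical weekly BPRS totals for 2 patients on the same trajectory.
completer = [55.0, 50.0, 46.0, 43.0, 41.0]
dropout = [55.0, 50.0, 46.0, None, None]  # withdrawn after week 2

print(locf_change(completer))  # 14.0
print(locf_change(dropout))    # 9.0
```

Both hypothetical patients follow the same course while observed, yet the patient withdrawn early contributes a smaller change score. An arm with more early withdrawals would therefore show a lower mean improvement even if the drug effect were identical.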
Mechanisms that we were unable to investigate that could potentially account for the design effect on improvement include selection bias and expectancy bias. Selection biases between randomized trials and usual care have been described for schizophrenia,102- 104 and selection biases may well exist among randomized trials between placebo-controlled designs and other designs. Selection bias could produce the effects observed here if patients with relatively good prognoses tended to enter active-controlled trials while patients with relatively poor prognoses tended to enter placebo-controlled trials. In support of this speculation, fewer patients are willing to participate in placebo-controlled trials than in active-controlled trials.105,106 Patients whose previous treatment response has been relatively poor could be more likely to view the risks of placebo as less important than the possible benefits of a new agent. Such prognostic differences across samples could exist even though we excluded trials that specifically sought treatment-resistant patients. Although we did not observe a baseline severity difference among trial designs, which is similar to findings from another meta-analysis,107 treatment responsiveness or prognosis is not necessarily correlated with baseline severity of untreated illness.
Expectancy bias could produce the effects observed in the present report if patients, raters, or both have a higher expectation of improvement in low dose–controlled trials vs placebo-controlled trials and still higher expectations in active-controlled trials. This hierarchy of expectations appears rational given the probabilities of receiving treatment known or believed to be effective that are associated with each design; however, whether patient or rater expectations are actually different across trial designs should be tested empirically in future studies. Expectancy effects have been described for other treatments.108,109
Our sex findings replicate a previous report of a high proportion of men in placebo-controlled studies, and our finding of no difference in age across trial designs is also consistent with this study.
Dosing practice independently affected improvement among active-controlled trials in this analysis (Figure 2). Among active-controlled trials but not placebo-controlled trials, fixed dosing was associated with greater improvement than flexible dosing. These results do not conflict with analyses of active arms in placebo-controlled antidepressant studies.110
Year of publication influenced response rate in both active arms and placebo arms in placebo-controlled antidepressant medication trials published between 1981 and 2000.111 Some evidence from schizophrenia studies suggests a reduction in response rates in conventional antipsychotic medication arms between the 1960s and early 1990s,14 and other evidence suggests a temporal increase in the dropout rates in more recent studies.101 We did not see a temporal effect in our analysis of BPRS change score among atypical antipsychotic medication arms in trials published between 1993 and 2003.
Caveats apply to the interpretation of the data. First, unlike many meta-analyses, this analysis focused on change within treatment arm from baseline to end point rather than on effect size estimates of differences between the investigational agent and the comparator. Focusing on change within treatment arm was both desirable and necessary for the current analysis. This focus was desirable because the question we are addressing is whether a change within the antipsychotic medication arm from 1 study can be compared with change within placebo from an external source. Focusing on change within treatment was necessary because it would be trivial to question whether the effect size between atypical antipsychotic medication and placebo was larger than the effect size between atypical antipsychotic medication and active comparator. Second, the placebo-controlled studies were generally conducted in North America whereas active- and low dose–controlled studies were predominantly conducted in Europe. Third, our results apply only to recent antipsychotic medication studies for schizophrenia. Analyses of possible control group bias should be conducted for other illnesses for which placebo controls are debated, such as depression, bipolar disorder, angina pectoris, asthma, hypertension, rheumatoid arthritis, peptic ulcer disease, type 2 diabetes mellitus, Parkinson disease, and Alzheimer disease.6,35,112- 132
By focusing on the design effect, we do not mean to minimize other difficulties with the use of historical controls. Even though cohort effects were shown not to account for the design effect demonstrated in the present analysis, unless historical placebo samples are carefully selected, shifts in patient characteristics over time101,111 could strongly confound comparisons of novel medications with historical placebo.32
The observed design effects on improvement could be termed “control group bias.” These results naturally lead to the question of which design yields the true improvement and which design yields the biased improvement. Unfortunately, the present study cannot provide an answer to this question. Selection effects and prescribing behavior considerations could be interpreted to suggest that placebo-controlled trials are biased toward smaller improvements within antipsychotic medication arms. Rater expectation considerations could suggest either that placebo-controlled trials are biased toward smaller improvements within antipsychotic medication arms or that active-controlled and low dose–controlled trials are biased toward larger improvements.
Our results have important implications for clinical trial design of future active-controlled equivalence or noninferiority studies. Improvement with a future novel treatment should be influenced by the design in the same manner as shown for the atypical antipsychotic medications in the current analysis. If the novel treatment in such studies is to be directly compared with external placebo, the design effect in the novel treatment arm will inflate the effect size of the comparison and will confound interpretation. The ability to interpret results could theoretically be restored by a valid method to correct either the novel treatment data or the external placebo data for the effect of the control group bias, but substantial statistical investigation will be required to develop and test such methods before they could be confidently applied. Inspection of the placebo- and low dose–controlled means in Figure 1 illustrates the difficulties involved. The difference in improvement between ineffective and intermediate doses in placebo-controlled vs low dose–controlled studies could potentially be modeled using an additive transformation or a ratio transformation, but neither of these transformations predicted the small difference in improvement between intermediate and effective doses in placebo-controlled vs low dose–controlled studies. Stated differently, any attempt to use effective dose data from placebo- and low dose–controlled studies and intermediate dose data from low dose–controlled studies to predict improvement at intermediate doses in placebo-controlled studies would overestimate that improvement.
These considerations also affect the interpretation of noninferiority or equivalence analyses when the margins are defined using external placebo-controlled data. For example, simple application of the noninferiority margin referencing the external placebo-controlled data to the novel treatment data in a noninferiority study assumes that it is valid to additively transform both the external placebo data and the novel treatment data by the difference in the standard treatment performance between current and external trials. As discussed earlier, however, an additive transformation cannot account for the current findings. Thus, the current data suggest that caution should be used when defining noninferiority margins based on external placebo-controlled trials of a standard antipsychotic medication.
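The additive assumption implicit in a fixed noninferiority margin can be written out as a short derivation. The notation is an assumption introduced here for illustration and does not appear in the original analysis: $S_E$ and $P_E$ are the mean improvements with standard treatment and placebo in the external placebo-controlled data, $S_C$ and $N_C$ are the mean improvements with standard and novel treatment in the current noninferiority study, and the margin is a fraction $f$ of the external treatment effect. Confidence intervals are ignored for simplicity.

```latex
\begin{align*}
M &= f\,(S_E - P_E), \qquad 0 < f < 1
  && \text{(margin from external data)} \\
S_C - N_C &< M
  && \text{(noninferiority criterion)} \\
\delta &= S_C - S_E
  && \text{(shift in standard treatment performance)} \\
S_C - N_C < f\,(S_E - P_E)
  \;&\Longleftrightarrow\;
  N_C - (P_E + \delta) > (1 - f)\,(S_E - P_E)
\end{align*}
```

In this form, the criterion is equivalent to requiring that the novel treatment exceed an additively shifted placebo value, $P_E + \delta$, by at least the retained fraction of the external effect. The criterion is therefore valid only if an additive shift of $\delta$ correctly describes how placebo would have performed in the current study.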
The observed control group bias indicates that the constancy assumption does not hold in recent antipsychotic medication trials. These results suggest that active-controlled equivalence or noninferiority studies including comparisons with external placebo cannot be confidently substituted for placebo-controlled trials in studies seeking to establish efficacy of new medications for schizophrenia.
Although we are not aware of suggestions that data from low dose–controlled trials be compared with external placebo, low dose–controlled trials can have difficulty demonstrating that doses believed to be effective are superior to the internal low-dose controls. Given this difficulty, some investigators may consider attempting comparisons with external placebo. Since the present results suggest that improvement at higher doses is similar in placebo- and low dose–controlled trials, comparisons with external placebo at these doses may be affected relatively little by design effects. In the lower dose ranges, however, the implications of the current data for low dose–controlled trials are similar to those for active-controlled trials.
Our results have implications for the interpretation of dose-response relationships across trial designs. These implications are particularly important at marginally effective doses where differences in the magnitude of the improvement estimate could lead to differences in categorization of particular doses as effective or ineffective for clinical practice. In this analysis, doses that were not effective or were marginally effective in placebo-controlled trials produced significantly and substantially larger improvements in low dose–controlled studies. When placebo-controlled studies are used as the basis for separating effective doses from ineffective doses, the present data suggest a caveat that the design may underestimate improvement at marginal doses. Where low dose–controlled studies are relied upon, a similar caveat that the design may overestimate improvement at marginal doses is suggested.
Correspondence: Dr Woods, Treatment Research Program, Department of Psychiatry, Yale University School of Medicine, 34 Park St, New Haven, CT 06519 (firstname.lastname@example.org).
Submitted for Publication: June 7, 2004; final revision received November 22, 2004; accepted December 16, 2004.
Financial Disclosure: Dr Woods has received grants from Eli Lilly and Co, Indianapolis, Ind, Janssen Pharmaceutica Products, Titusville, NJ, and Bristol-Myers Squibb Co, New York, NY.
Funding/Support: This study was supported by grants MH54446 and MH57292 from the US Public Health Service, Washington, DC.
Previous Presentations: This study was presented at the Winter Workshop on Schizophrenia; February 28, 2002; Davos, Switzerland; and at the Eastern North America Region Meeting of the International Biometric Society; March 31, 2003; Tampa, Fla.
Acknowledgment: The assistance of the following individuals in obtaining unpublished data is gratefully acknowledged: Karen M. Corr, PharmD, and Lewis E. Warrington, MD (Pfizer Inc, New York, NY; studies 104, 114, 115, R-0548, and 302), Martin Dossenbach, MD, and Bruce Kinon, MD (Eli Lilly and Co; studies HGCQ and HGDT and study HGCJ, respectively), Donald C. Goff, MD (Harvard Medical School, Boston, Mass; ziprasidone study 101), Ali Saffet Gonul, MD (Erciyes University School of Medicine, Kayseri, Turkey; study KPA), and Emma K. Westhead (AstraZeneca Pharmaceuticals, Wilmington, Del; studies 0007 and 0014).