The standardized mean change was significantly positively correlated with year of publication for placebo arms (Spearman r = 0.52, n = 39, P = .001) and was significantly negatively correlated with year of publication for the effective dose medication arms (Spearman r = −0.26, n = 208, P < .001), but not for the low-dose medication arms (Spearman r = 0.32, n = 25, P = .12) or the intramuscular medication arms (Spearman r = −0.14, n = 24, P = .53).
Rutherford BR, Pott E, Tandler JM, Wall MM, Roose SP, Lieberman JA. Placebo Response in Antipsychotic Clinical Trials: A Meta-analysis. JAMA Psychiatry. 2014;71(12):1409-1421. doi:10.1001/jamapsychiatry.2014.1319
Importance
Because increasing placebo response rates decrease drug-placebo differences and increase the number of failed trials, it is imperative to determine what is causing this trend.
Objective
To investigate the relationship between antipsychotic medication and placebo response by publication year, and to identify associated study design and implementation variables.
Data Sources
MEDLINE, PsycINFO, and PubMed were searched to identify randomized clinical trials of antipsychotic medications published from 1960 to July 2013.
Study Selection
Included were randomized clinical trials lasting 4 to 24 weeks, contrasting antipsychotic medication with placebo or an active comparator, and enrolling patients 18 years of age or older with schizophrenia or schizoaffective disorder.
Data Extraction and Synthesis
Standardized mean change scores were calculated for each treatment arm, plotted against publication year, and tested with Spearman rank correlation coefficients. Hierarchical linear modeling identified factors associated with the standardized mean change across medication and placebo treatment arms.
Main Outcomes and Measures
We hypothesized that the mean change in placebo-treated patients would significantly increase from 1960 to the present, that a greater change would be observed in active comparator vs placebo-controlled trials, and that more protocol visits would increase the symptom change observed.
Results
In the 105 trials examined, the mean change observed in placebo arms increased significantly with year of publication (n = 39, r = 0.52, P = .001), while the mean change in effective dose medication arms decreased significantly (n = 208, r = −0.26, P < .001). Significant interactions were found between assignment to effective dose medication and publication year (t₂₆₀ = −5.55, P < .001), baseline severity (t₂₆₀ = 5.08, P < .001), and study duration (t₂₆₀ = −3.76, P < .001), indicating that the average drug-placebo difference significantly decreased over time, with decreasing baseline severity and with increasing study duration. Medication treatment in comparator studies was associated with significantly more improvement than medication treatment in placebo-controlled trials (t₉₃ = 2.73, P = .008).
Conclusions and Relevance
The average treatment change associated with placebo treatment in antipsychotic trials increased since 1960, while the change associated with medication treatment decreased. Changes in randomized clinical trials leading to inflation of baseline scores, enrollment of less severely ill participants, and higher expectations of patients may all be responsible.
Placebo response rates in trials of antipsychotic medications for acute schizophrenia are increasing, mirroring the increases observed in illnesses such as major depressive disorder.1,2 In 28 atypical antipsychotic trials, Kemp et al3 reported that the average symptom improvement among patients assigned to placebo increased by over 10 points on the Positive and Negative Syndrome Scale between 1991 and 2006. Agid et al4 found that the standardized mean effect size associated with placebo treatment significantly increased over time. Younger age, shorter illness duration, greater baseline illness severity, shorter trial duration, and more study sites were associated with greater placebo response.4
Increasing placebo response rates contribute to decreasing drug-placebo differences and increasing numbers of failed antipsychotic trials,5 both of which increase the cost of drug development, delay the clinical availability of new antipsychotic medications, and even precipitate reductions in pharmaceutical company research for psychiatric disorders.6 Thus, it is imperative to determine what is causing these increased placebo response rates in antipsychotic trials. The findings of Agid et al4 are of considerable value toward this goal, but their interpretation is constrained by their relatively narrow selection of available antipsychotic trials and is limited to examining mean treatment change in participants assigned to placebo. It also would be of great interest to investigate which trial designs and which clinical variables are associated with response to medication, and to determine whether there exist variable × treatment assignment interactions that directly bear on the observed drug-placebo difference.
Consequently, we sought to replicate and build on the results of Agid et al4 by investigating the causes of placebo response in a much larger selection of randomized clinical trials (RCTs) of antipsychotic medications using a statistical method (hierarchical linear modeling) permitting the analysis of all treatment arms (drug and placebo). Rather than examine all possible predictors of treatment change, we selected candidate explanatory variables based on an established model of the causes of placebo response.7 This model, proposed to describe causes of placebo response in antidepressant trials, divides its causes between (1) placebo effects based on patient expectancy of improvement and therapeutic contact with health care professionals, (2) measurement artifacts caused by rater bias and sampling error, and (3) spontaneous improvement and worsening of symptoms that are unrelated to the study procedures.
We were interested in determining to what degree this model can be generalized to different psychiatric disorders, such as schizophrenia, in order to differentiate illness-specific causes of placebo response from those common to the conduct of all RCTs. Patients with schizophrenia have baseline cognitive difficulties that may be exacerbated during periods of acute psychosis, which may impair their capacity to form treatment expectancies based on information provided during the informed consent process.8 Moreover, it is unclear whether contact with health care professionals has a therapeutic effect on patients with schizophrenia comparable to that observed in patients with depression because a cardinal feature of schizophrenia is a disturbance in interpersonal relationships.9 In contrast, a development common to both antidepressant and antipsychotic drug trials has been the advent of larger, multicenter trials in which investigators may have a greater financial incentive to enroll patients.9 This general change in RCT design may result in the inflation of baseline scores and increased placebo response rates across disorders.
In line with the results of prior analyses, we hypothesized that the mean pre-post change observed in placebo-treated patients in RCTs of antipsychotic drugs would significantly increase from 1960 to the present and would be positively associated with sample size. Similar to findings from RCTs of antidepressants, we anticipated that a greater mean change would occur during medication treatment in active comparator vs placebo-controlled trials (owing to the increased expectation of improvement induced by receiving a known active treatment) and in trials with a greater intensity of therapeutic contact (as measured by the prescribed number of protocol visits).
MEDLINE, PsycINFO, and PubMed were searched to identify RCTs published between 1960 and July 2013 that contrasted antipsychotics to placebos or active comparators in adults with schizophrenia or schizoaffective disorder (Figure 1; see “Search Strategy” in the eAppendix in the Supplement). Inclusion criteria stipulated that articles report an RCT of an antipsychotic medication for schizophrenia or schizoaffective disorder that was approved by the US Food and Drug Administration for adults. Further criteria required trials to last between 4 and 24 weeks (inclusive), have a comparison group that received placebo or another antipsychotic medication approved by the US Food and Drug Administration, be written in English, be published in 1960 or later, and have symptom change measured using a standardized outcome measure. Trials were excluded for enrolling treatment-resistant patients or for requiring as inclusion criteria specific symptoms, a specific medical illness, or an Axis I disorder other than schizophrenia or schizoaffective disorder.
Study information such as the year of publication, sample size, and presence of a lead-in period in addition to the clinical and demographic characteristics of participants, details of the treatment conditions, duration of active treatment in each study, and the number of study visits were entered into a database. We started counting the number of visits prescribed in each study with the initiation of treatment (ie, we began with the week 1 visit and did not count evaluation or screening appointments). We distinguished treatment arms using flexible dosing or fixed but effective doses of oral antipsychotic medication from treatment arms using doses of medication believed to be too low to have an important effect (see “Data Extraction” in the eAppendix in the Supplement). We also separately classified medications administered intramuscularly in long-acting injectable formulations in order to test whether the route and frequency of administration influenced the treatment change observed.
Because individual studies used different scales to measure pre-post change, it was necessary to standardize the change scores published for each treatment condition in the studies comprising our sample. Our primary method was to calculate a standardized mean change score by dividing the pre-post mean difference by the number of total points possible on the scale used. For example, the standardized change for a treatment arm in which participants improved by 15 points on the 18-item Brief Psychiatric Rating Scale (BPRS, with each item rated 0-6) was calculated to be 15/108 = 0.139. As discussed in “Search Strategy” in the eAppendix in the Supplement, we also explored alternative methods for calculating standardized mean change.
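The primary standardization described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the BPRS example; the function and parameter names are hypothetical and are not taken from the study's own analysis code.

```python
def standardized_mean_change(pre_post_difference: float,
                             n_items: int,
                             max_item_score: int) -> float:
    """Divide the pre-post mean difference by the total points possible.

    The total points possible is taken here as n_items * max_item_score,
    which matches the BPRS example in the text (18 items rated 0-6).
    """
    total_points = n_items * max_item_score
    if total_points <= 0:
        raise ValueError("the scale must have a positive number of total points")
    return pre_post_difference / total_points


# Worked example from the text: a 15-point improvement on the 18-item BPRS.
bprs_example = standardized_mean_change(15, n_items=18, max_item_score=6)
print(round(bprs_example, 3))  # 0.139
```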
Differences in study characteristics, patient demographics, and clinical features across the different study types were investigated using 2-tailed independent-samples t tests for continuous variables and χ² tests for categorical variables (SPSS version 21 [IBM Corp]). To investigate the trajectory over time of the mean change observed in placebo, effective dose, and low-dose medication groups, we tested Spearman rank correlation coefficients for the relationship between standardized mean change and publication year. Plots of the rank order of standardized mean change are shown, and the Spearman (rather than the Pearson) correlation is used to reduce the influence of large values from any single study.
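To illustrate why a rank-based correlation limits the influence of a single extreme value, the following is a minimal pure-Python sketch of the Spearman coefficient (the Pearson correlation computed on average ranks). The helper names and the data are made up for illustration; the study itself used SPSS, and none of these numbers come from the meta-analysis.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical publication years and standardized mean changes (NOT study data).
years = [1965, 1978, 1991, 1999, 2006, 2012]
changes = [-0.03, 0.01, 0.00, 0.04, 0.03, 0.20]
rho = spearman(years, changes)
print(round(rho, 3))
```

Because only the ordering enters the calculation, replacing the final value 0.20 with any larger number would leave `rho` unchanged, whereas the Pearson coefficient on the raw values would shift.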
To identify factors significantly associated with the standardized mean change observed in the treatment arms within our sample, we used a hierarchical linear modeling (HLM) approach10-12 similar to the one that we successfully implemented in several prior studies, in which the procedures are described in greater detail.13-15 This approach entails first examining the heterogeneity in treatment outcome within the sample using an unconditional model that contains a single level 1 (ie, within each study) equation describing the mean change in each treatment arm as equal to a constant. At level 2 (ie, between studies), this constant can be described as varying around a grand mean with error. The H statistic (H = √[χ²/(df − 1)]) can be used to measure this variability in treatment change, approximating 1 when there is only random variation between studies and progressively exceeding 1 when the results of a set of studies lack homogeneity.16 The I² statistic (I² = [H² − 1]/H²) describes the proportion of total variation in treatment change that is attributable to heterogeneity.17
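The two heterogeneity statistics defined above translate directly into code. The sketch below follows the formulas exactly as they are written in the text, including the (df − 1) denominator; the function names and the input values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import math


def h_statistic(chi_square: float, df: float) -> float:
    """H as written in the text: the square root of chi-square / (df - 1)."""
    if df <= 1:
        raise ValueError("df must exceed 1 for the denominator to be positive")
    return math.sqrt(chi_square / (df - 1))


def i_squared(h: float) -> float:
    """I-squared as written in the text: (H^2 - 1) / H^2."""
    return (h ** 2 - 1) / h ** 2


# Made-up inputs for illustration only (not values from the meta-analysis).
h = h_statistic(chi_square=40.0, df=11)
print(h)             # 2.0
print(i_squared(h))  # 0.75
```

With H equal to 1 (only random variation between studies), `i_squared` returns 0, which matches the interpretation given in the text.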
If there was significant variability in mean change across studies (ie, when the 95% CI for H did not include 1), we attempted to explain this variability by means of our hypothesized within-study and between-study variables. This yielded the final mixed model containing 8 level 1 predictors, 7 level 2 predictors, and 9 cross-level interactions. All of the regression models were estimated using HLM version 6.08 (Scientific Software International).
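For readers who prefer to see the two-level structure written out, the following is a generic sketch of an unconditional and a conditional hierarchical linear model. The article does not print its equations, so all notation here is an assumption: Y is the standardized mean change for treatment arm i in study j, X denotes level 1 (within-study) predictors, and W denotes level 2 (between-study) predictors.

```latex
% Unconditional model: each arm's mean change equals a study-level constant,
% which varies around a grand mean with error.
\begin{align*}
\text{Level 1 (within study):}\quad   & Y_{ij} = \beta_{0j} + e_{ij} \\
\text{Level 2 (between studies):}\quad & \beta_{0j} = \gamma_{00} + u_{0j}
\end{align*}

% Conditional model: level 1 predictors X, level 2 predictors W, and
% cross-level interactions (the \gamma_{pq} terms, which let a level 2
% variable modify a level 1 slope).
\begin{align*}
Y_{ij}     &= \beta_{0j} + \sum_{p} \beta_{pj} X_{pij} + e_{ij} \\
\beta_{0j} &= \gamma_{00} + \sum_{q} \gamma_{0q} W_{qj} + u_{0j} \\
\beta_{pj} &= \gamma_{p0} + \sum_{q} \gamma_{pq} W_{qj}
\end{align*}
```

In this notation, an interaction such as publication year × effective dose would correspond to one of the γ terms attached to a treatment-assignment slope, under the assumption that publication year is a study-level variable.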
A total of 105 studies met the inclusion and exclusion criteria (Table 1).18-119 As shown in Table 2, these included 257 treatment groups (25 groups that received low-dose medications, 208 groups that received effective dose medications, and 24 groups that received intramuscular medications) comprising 21 892 participants and 39 placebo groups comprising 2882 participants. Sixty-six studies used an active comparator design, of which 4 (6.1%) demonstrated significant differences in pre-post change between treatment groups. Fifteen of 39 placebo-controlled studies (38.5%) demonstrated significant differences in mean change between medication groups and placebo groups. See “Clinical Characteristics of Included Patients and Methodological Features of Studies” in the eAppendix in the Supplement for a comparison of study and clinical characteristics between medication groups and placebo groups.
The rank orders of the standardized mean change associated with low-dose medication groups, effective dose medication groups, and placebo groups are plotted against year of publication in Figure 2. The mean change observed in patients receiving placebo increased significantly with year of publication (Spearman r = 0.52, n = 39, P = .001). This correlation equates to an average increase in pre-post treatment change of 1.1 BPRS points (or 2.2 points on the Positive and Negative Syndrome Scale) per decade since 1960 for patients assigned to placebo. In contrast, the mean change in patients receiving effective dose medication decreased significantly over the same time period (Spearman r = −0.26, n = 208, P < .001), equating to a decrease of 2.0 BPRS points or 3.8 Positive and Negative Syndrome Scale points per decade since 1960. The mean change in patients receiving low-dose or intramuscular medication was not significantly associated with publication year (low-dose medication: Spearman r = 0.32, n = 25, P = .12; intramuscular medication: Spearman r = −0.14, n = 24, P = .53). We examine correlations between study variables and publication year in the eAppendix in the Supplement.
In the unconditional model of treatment change, variability was over 3 times greater than expected by chance alone (H = 3.3 [95% CI, 3.0-3.5]), and the proportion of variability in mean change caused by heterogeneity rather than random error was 91.8% (I² = 0.918). Coefficients and accompanying statistical tests for the predictor variables in the final model of standardized mean change are presented in Table 3. The primary findings for level 1 main effects were that effective dose medication (t₂₆₀ = 14.15, P < .001), low-dose medication (t₂₆₀ = 4.65, P < .001), and intramuscularly administered medication (t₂₆₀ = 2.99, P = .004) were each associated with significantly more change than placebo. There were no significant main effects of sample size (t₂₆₀ = −0.40, P = .69) or standardized baseline severity (t₂₆₀ = 1.13, P = .26) after accounting for other variables.
The relative benefit of effective dose medication over placebo decreased over time (publication year × effective dose interaction: t₂₆₀ = −5.55, P < .001), which was not observed for low-dose or intramuscularly administered medication. Moreover, the relative benefit of effective dose medication over placebo decreased with increasing study duration (study duration × effective dose interaction: t₂₆₀ = −3.76, P < .001). Baseline severity score significantly interacted with both low-dose medication (t₂₆₀ = 2.30, P = .02) and effective dose medication (t₂₆₀ = 5.08, P < .001). These interactions indicate that the average benefit of medication over placebo increased as baseline symptom severity increased.
With respect to between-study (level 2) variables, the mean treatment change significantly increased with year of publication (t₉₃ = 2.38, P = .02), although the direction of this effect differed depending on treatment assignment, as already explained. Medication treatment in comparator study designs was associated with significantly more improvement than medication treatment in placebo-controlled trials (t₉₃ = 2.73, P = .008), amounting to an average 3.8 BPRS point difference (controlling for other variables). The use of single-blind lead-ins significantly decreased the average pre-post treatment change observed (t₉₃ = −2.29, P = .02), but there were no significant lead-in × low-dose interactions (t₉₃ = −0.39, P = .70) or lead-in × effective dose interactions (t₉₃ = −1.79, P = .07), suggesting that single-blind lead-ins decrease both medication and placebo response symmetrically. Overall, the final mixed model of standardized mean change significantly improved model fit over the unconditional model (χ²₂₄ = 242.0, P < .001) and explained 65.8% of the residual variability in mean change. Alternative methods of calculating standardized mean change did not change the overall pattern of results (see “Repeated Multilevel Models Using Different Methods of Calculating Standardized Mean Change” in the eAppendix in the Supplement).
In this meta-analysis of 105 trials of acute antipsychotic drugs for schizophrenia, the placebo response was shown to be significantly increasing from 1960 to the present (Spearman r = 0.52, P = .001). Strikingly, whereas the average placebo-treated patient in an RCT of antipsychotic drugs from the 1960s worsened by 3.5 BPRS points, by the 2000s, the average placebo-treated patient improved by 3.2 BPRS points. In contrast, the treatment change associated with effective dose medication significantly decreased over the same time period (r = −0.26, P < .001). The average RCT participant receiving an effective dose of medication in the 1960s improved by 13.8 BPRS points, whereas this difference diminished to 9.7 BPRS points by the 2000s. The consequence of these divergent trends was a significant decrease in drug-placebo differences from 1960 to the present (effective dose medication × publication year interaction: t₂₆₀ = −5.55, P < .001).
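The size of the narrowing can be made concrete by subtracting the decade averages quoted above. The short sketch below does only that subtraction; it is a descriptive illustration, not the model-based interaction estimate, and the variable and function names are hypothetical. Worsening is coded as a negative change.

```python
# Decade averages quoted in the text, in BPRS points of pre-post improvement.
placebo_change = {"1960s": -3.5, "2000s": 3.2}      # -3.5 = worsened by 3.5 points
medication_change = {"1960s": 13.8, "2000s": 9.7}   # effective dose medication


def drug_placebo_difference(decade: str) -> float:
    """Raw difference between medication and placebo change for one decade."""
    return round(medication_change[decade] - placebo_change[decade], 1)


for decade in ("1960s", "2000s"):
    print(decade, drug_placebo_difference(decade))
# 1960s 17.3
# 2000s 6.5
```

On these raw averages, the drug-placebo difference falls from about 17 BPRS points in the 1960s to about 6.5 points in the 2000s.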
Our analyses suggest that a change in the patient population enrolling in RCTs of antipsychotic drugs may contribute to the decreasing drug-placebo differences observed over time. We found that the benefit of effective dose medication over placebo was greater for more severely ill patients (baseline severity × effective dose interaction: t₂₆₀ = 5.08, P < .001), yet the mean baseline severity in these RCTs appears to be decreasing over time (n = 282, r = −0.13, P = .03). A recent meta-analysis of placebo response in schizophrenia studies by Agid et al4 found that the number of study sites significantly increased from a median of 2 sites before 1990 to 38 sites in the 2005-2010 time interval. The same investigators found that the percentage of academic sites in antipsychotic drug trials decreased from 40% of the total before 1997 to 10% from 1997 onward.4 Although we did not directly test the relationship between the type and the number of study sites with placebo response, we did find that the standardized mean change for placebo-treated patients was significantly associated with sample size (Pearson r = 0.48, n = 39, P = .002). Because a larger sample size generally requires a greater number of study sites, this finding appears consistent with the findings of Agid et al.4
Another interesting finding in our meta-analysis was that medication treatment in comparator study designs was associated with significantly more improvement than medication treatment in placebo-controlled trials (t₉₃ = 2.73, P = .008). One possible explanation for these results is that individuals with schizophrenia become aware of their probability of receiving active medication during the informed consent procedure in an RCT, and this information generates expectations of improvement that influence treatment response. Alternatively, the awareness of clinicians and raters of whether an RCT of antipsychotic drugs uses an active comparator or placebo control may influence treatment response. In some cases, this “investigator expectancy” may influence clinical decision making (eg, regarding dosage changes in flexible-dosing designs or whether criteria for early discontinuation are met) or may result in rater bias, thereby contributing to the observed outcome differences between study types.
Contrary to our hypotheses, the intensity of therapeutic contact (ie, the number of scheduled visits) did not significantly influence the pre-post treatment change observed or affect average drug-placebo differences in our sample of trials of antipsychotic drugs. One reason for this may be that much of the treatment provided in these trials occurs within an inpatient setting, in which all patients benefit from a therapeutic milieu and attention from health care providers regardless of how frequent their visits with research staff. Alternatively, it may be the case that the supportive provision of empathy, a coherent narrative to understand one’s illness, and a therapeutic relationship are less effective in the treatment of schizophrenia.
Comparing these findings with what is known about placebo response in other disorders such as major depressive disorder, we find that a picture of placebo response emerges in which a substantial portion is caused by general methodological factors rather than the nature of the illness under study (see “Relevance of Results for Clinical Trial Design” in the eAppendix in the Supplement for further discussion). For example, a common trend in both antidepressant and antipsychotic drug trials has been a shift over time from smaller, academic, single-site trials to larger, commercial, multicenter trials.4,120 These changes in RCT conduct have significant advantages, such as the increased statistical power conferred by larger samples. However, increased measurement error associated with multicenter clinical trials may lead to decreased effect sizes for medication and may partially offset the benefits of a larger sample size.121 In addition, whereas academic investigators may be biased by their interest in a positive research outcome, commercial sites, particularly those operated by contract research organizations, have arguably more powerful financial incentives to enroll patients, which can result in the inflation of baseline scores by raters followed by a rapid decrease in scores once the restrictive entrance criterion has been passed.122 Some data suggest that using centralized raters may help control these potential biases and consequently improve inter-rater reliability, reduce biases toward inflation of baseline scores and observing improvement, and eliminate the effects of repeated assessments by the same clinician.123
A number of limitations should be considered when interpreting the findings of our study. There were relatively few placebo-controlled trials (n = 39) meeting our selection criteria, which limited the data available for analysis in our study. The use of trial-level summary data was another limitation because we were unable to test for associations between patient characteristics and the effects of study type and frequency of visits. In addition, publication bias may have affected which studies were included in these analyses because RCTs failing to demonstrate significant differences between medication and placebo may not have been published. Also, we determined the number of visits based on the designed visit schedule for each study rather than on the actual number of visits that each participant attended. Finally, the time period that we examined (1960-2010) encompassed significant changes in the diagnostic criteria used to identify patients (eg, the publication of DSM-III in 1980),124 and it is possible that the criteria used to define schizophrenia differed across studies and contributed to the results.
In summary, the results of our meta-analysis confirm that placebo response rates have been increasing over time in antipsychotic drug trials, while the change observed in medication-treated patients has been decreasing. Possible causes include decreased baseline symptom severity of study participants and a change in the treatment settings within which studies are conducted. Our meta-analysis suggests the following methodological changes to improve signal detection in RCTs of antipsychotic drugs: recruit more severely ill patients, limit study duration to no longer than 8 to 12 weeks, dispense with single-blind placebo lead-in periods, and maximize the probability of being assigned to placebo as opposed to active medication.
Submitted for Publication: January 3, 2014; final revision received May 27, 2014; accepted June 9, 2014.
Corresponding Author: Bret R. Rutherford, MD, Columbia University College of Physicians and Surgeons, New York State Psychiatric Institute, 1051 Riverside Dr, PO Box 98, New York, NY 10032 (email@example.com).
Published Online: October 8, 2014. doi:10.1001/jamapsychiatry.2014.1319.
Author Contributions: Dr Rutherford had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Rutherford, Wall, Roose, Lieberman.
Acquisition, analysis, or interpretation of data: Rutherford, Pott, Tandler, Wall, Lieberman.
Drafting of the manuscript: All authors.
Critical revision of the manuscript for important intellectual content: Rutherford, Wall, Lieberman.
Statistical analysis: Rutherford, Wall, Lieberman.
Administrative, technical, or material support: Pott, Tandler, Lieberman.
Study supervision: Roose, Lieberman.
Conflict of Interest Disclosures: Dr Lieberman serves on the advisory board of Intra-Cellular Therapies. He receives grant support from Allon, Biomarin, Eli Lilly, F. Hoffman–La Roche, Genentech, GlaxoSmithKline, Merck, Novartis, Pfizer, Psychogenics, Sepracor (Sunovion), and Targacept and holds a patent from Repligen. No other disclosures are reported.
Funding/Support: This work was supported by National Institute of Mental Health grant K23 MH085236 (Dr Rutherford).
Role of the Funder/Sponsor: The National Institute of Mental Health had no role in the design and conduct of the study; collection, management, analysis, or interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Correction: This article was corrected on November 13, 2014, to fix a typographical error in the text.