Performance on logical memory (for stories) for the patients with first-episode schizophrenia (FES) treated with risperidone and olanzapine and the healthy controls (HCs) at baseline, 6 weeks, and 16 weeks. The graph is representative of most tests because the difference between the 2 FES groups is small, the rate of improvement between the FES and HC groups is similar, and the difference between the FES and HC groups persists throughout the study. Error bars indicate SEMs.
Goldberg TE, Goldman RS, Burdick KE, Malhotra AK, Lencz T, Patel RC, Woerner MG, Schooler NR, Kane JM, Robinson DG. Cognitive Improvement After Treatment With Second-Generation Antipsychotic Medications in First-Episode Schizophrenia: Is It a Practice Effect? Arch Gen Psychiatry. 2007;64(10):1115-1122. doi:10.1001/archpsyc.64.10.1115
Cognitive impairment in schizophrenia is frequent, involves multiple domains, and is enduring. Numerous recent clinical trials have suggested that second-generation antipsychotic medications significantly enhance cognition in schizophrenia. However, none of these studies included healthy controls undergoing repeated testing to assess the possibility that improvements might reflect simple practice effects.
To report the results on cognition of a randomized comparison of 2 widely prescribed second-generation antipsychotic medications, olanzapine and risperidone, in patients with first-episode schizophrenia and a healthy control group.
Randomized clinical trial.
Hospital-based research units.
A total of 104 participants with first-episode schizophrenia and 84 healthy controls.
Main Outcome Measures
Cognitive assessment of all study participants occurred at baseline, 6 weeks later, and 16 weeks later. Neurocognitive tests included measures of working memory and attention, speed, motor function, episodic memory, and executive function.
No differential drug effects were observed. Of 16 cognitive measures, 9 demonstrated improvement over time and only 2 demonstrated greater rates of change than those observed in the healthy control group undergoing repeated assessment. The composite effect size for cognitive change was 0.33 in the healthy control group (attributed to practice) and 0.36 in the patients with first-episode schizophrenia. Improvements in cognition in the first-episode schizophrenia group could not be accounted for by medication dose, demographic variables, or intellectual level.
The cognitive improvements observed in the trial were consistent in magnitude with practice effects observed in healthy controls, suggesting that some of the improvements in cognition in the first-episode schizophrenia group may have been due to practice effects (ie, exposure, familiarity, and/or procedural learning). Our results also indicated that differential medication effects on cognition were small. We believe that these findings have important implications for drug discovery and the design of registration trials that attempt to demonstrate cognitive enhancement.
Cognitive impairment in schizophrenia is frequent, involves multiple domains of information processing, and may be a core feature of the disorder.1- 4 Thus, neurocognition has come to be viewed as a key target in clinical trials.5 Patients with first-episode schizophrenia (FES) may be an especially important group for such studies because they have demonstrable plasticity in symptomatic response to antipsychotics, can be tested while drug naive, and do not have long histories of multiple antipsychotic drug treatment that may confound results. Furthermore, potential confounders associated with chronicity (patient role, institutionalization, interactions with aging, or disease processes) are minimized.
Naturalistic studies of patients with FES demonstrate that they have substantial neurocognitive impairments of −1.0 to −2.0 SDs below average across a wide range of domains, including working memory, attention, processing speed, and episodic memory.6- 8 Longitudinal improvements are often considered modest after treatment in patients taking antipsychotic agents who are followed up for several years.9- 12 Several recent meta-analyses13,14 have suggested that second-generation antipsychotics (SGAs) improve cognition. Large industry-sponsored controlled trials that examined risperidone or olanzapine in patients with FES found significant improvement from baseline with the SGAs; effect sizes ranged from 0.35 to 0.54 on composite measures of cognition.15- 18 Critically, these studies did not include control groups, raising the possibility that improvements were due to practice effects because patients were tested on multiple occasions. Additionally, the proportion of non–drug-naive patients at baseline in these trials was rather high, raising the possibility that effects were due to medication withdrawal and/or switching from a drug that has adverse effects on cognition.
Herein, we report the results of a randomized comparison of 2 of the most widely prescribed SGAs in the United States, olanzapine and risperidone (thereby increasing the generalizability of the result), in patients with FES. Our randomized clinical trial addressed the issues raised by prior work: (1) inclusion of a healthy control group to assess practice effects, (2) inclusion of a high percentage of patients who were drug naive at baseline, and (3) federal sponsorship.19,20
Demographic information about the olanzapine, risperidone, and healthy control groups is given in Table 1. Of the 104 patients with FES, 80 were never exposed to medication, 14 had less than 1 week of antipsychotic exposure, and 10 had more than 1 week of antipsychotic exposure. Seventy-six patients had been diagnosed as having schizophrenia, 10 as having schizoaffective disorder, and 18 as having schizophreniform disorder as determined by the Structured Clinical Interview for DSM-IV.21 All patients were actively psychotic when they entered the study. Eighty-four healthy adults recruited from the community by advertisement or word of mouth served as controls (HCs). All HCs underwent the Structured Clinical Interview for DSM-IV and did not have any diagnosable Axis I disorders. Exclusion criteria for both groups included medical conditions known to affect the central nervous system and neurologic conditions or receiving drugs known to affect cognition.
The trial design has been presented in detail elsewhere.22 Patients with FES, schizoaffective disorder, or schizophreniform disorder were assessed at baseline and randomly assigned to treatment with olanzapine (2.5-20 mg/d) (n = 51) or risperidone (1-6 mg/d) (n = 54) for 16 weeks. Psychopathologic and cognitive assessments were performed by masked (blinded) assessors. The mean ± SD modal doses were 12.6 ± 6.0 mg/d for olanzapine and 3.7 ± 1.9 mg/d for risperidone. Patients with FES and the HC group received cognitive assessments at baseline (when most patients with FES were drug free) and after 6 and 16 weeks.
The cognitive tests listed in Table 2 were administered to all study participants. They included measures of processing speed, episodic memory, working memory, executive function, and motor speed and dexterity. In the FES group at baseline, 97 to 101 patients had received all tests with the exception of the Wisconsin Card Sorting Test (WCST) (n = 73), the California Verbal Learning Test (CVLT) (n = 18), and the Continuous Performance Test identical pairs (CPT-IP) and Delayed Match to Sample Test (DMS) (not administered); at 6 weeks, 75 to 79 patients had received all tests with the exception of the CPT-IP and DMS (n = 49); and at 16 weeks, 70 to 72 patients had received all tests with the exception of the CPT-IP and DMS (n = 49). In the HC sample at baseline, 79 to 84 individuals had received all tests with the exception of the CPT-IP and DMS (not administered); at 6 weeks, 59 to 61 individuals had received all tests with the exception of the CPT-IP and DMS (n = 41); and at 16 weeks, 54 to 55 individuals had received all tests with the exception of the CPT-IP and DMS (n = 41).
The Schedule for Affective Disorders and Schizophrenia–Change Version plus psychosis and disorganization items was used to rate severity of hallucinations and delusions (combined into a positive symptom dimension) as well as disorganization in speech (understandability) and bizarre behavior (combined into a disorganized dimension).26 The Schedule for the Assessment of Negative Symptoms (Hillside version) was used to rate negative symptoms.27
Repeated-measures analyses of variance that used mixed models to minimize the effects of missing data (Proc Mixed; SAS28) and in which covariance patterns were unstructured examined longitudinal changes and between-group effects in cognitive measures. Factors in the models were group, time, and group × time interactions. Significance was set at P <.003 after Bonferroni correction for multiple comparisons (16 tests) in the initial set of mixed-model repeated measures.
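The corrected threshold can be reproduced with a short calculation. The following is a minimal sketch, assuming a family-wise alpha of .05 (the text gives only the corrected value and the number of tests); the function name is illustrative and not part of the original analysis.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Assumption: family-wise alpha of .05 spread across the 16 cognitive measures.
threshold = bonferroni_threshold(0.05, 16)
print(f"{threshold:.6f}")  # 0.003125, reported to three decimals as .003
```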
Our approach to the interpretation of statistical results follows:
1. A medication type (olanzapine or risperidone) × time interaction would suggest a differential medication effect.
2. Main effects of treatment week (time) would indicate improvement due to medication or other causes (ie, practice effects).
3. To disambiguate the possibilities in item 2, we would then compare performance of the patient groups with that of an HC group for whom data from multiple assessments were also collected. Group (patients with FES and HCs) × time interactions that favored steeper improvement in the FES group would be viewed as evidence of a drug effect, reflecting cognitive enhancement. A main effect for time in the absence of such an interaction could be viewed as representing practice effects.
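The interpretive scheme above can be sketched as a small decision function. This is an illustrative simplification, not the authors' procedure: the function name, the boolean inputs, and the order in which the checks are applied are assumptions, and in practice the outcomes need not be mutually exclusive.

```python
def interpret(med_by_time: bool, time_effect: bool,
              group_by_time_favors_fes: bool) -> str:
    """Map significance flags for one cognitive measure to a reading of it.

    med_by_time: medication type x time interaction is significant.
    time_effect: main effect of treatment week (time) is significant.
    group_by_time_favors_fes: group (FES vs HC) x time interaction is
        significant, with steeper improvement in the FES group.
    """
    if med_by_time:
        return "differential medication effect"
    if not time_effect:
        return "no change over time"
    if group_by_time_favors_fes:
        return "possible drug effect (cognitive enhancement)"
    return "consistent with a practice effect"


# A time effect with no interaction reads as a practice effect.
print(interpret(med_by_time=False, time_effect=True,
                group_by_time_favors_fes=False))
```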
In a set of secondary analyses, we attempted to determine if cognitive change could be attributed to causes other than drugs or practice using multiple regression that involved a variant of stepwise selection (MAXR in the Proc Reg module of SAS28). Thus, we sought to determine if baseline state variables or baseline to week 16 change measures (positive, negative, or disorganized symptoms) predicted cognitive change (ie, if changes could be due to pseudospecificity and simply reflect antipsychotic effects on symptoms). To ascertain if systematic dose effects on cognitive change scores were present, we also constructed a series of linear regressions within each of the medication groups in which modal antipsychotic dose served as the independent measure and cognitive change scores served as dependent measures. For multiple regressions, we relaxed significance of the equation to P <.05 and set significance for entry of predictors at P <.10.
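The dose analysis described above amounts to a simple linear regression of change score on modal dose, summarized by R². The sketch below is a hedged illustration in plain Python rather than the SAS procedure used in the study; the function name and the dose and change values are hypothetical and are not study data.

```python
from statistics import mean


def r_squared(x: list[float], y: list[float]) -> float:
    """R^2 of a simple linear regression of y on x (squared Pearson r)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    return sxy ** 2 / (sxx * syy)


# Hypothetical modal doses (mg/d) and cognitive change scores for 5 patients.
doses = [2.0, 3.0, 4.0, 5.0, 6.0]
changes = [1.0, -1.0, 2.0, 0.0, 1.0]
print(round(r_squared(doses, changes), 3))  # 0.019
```

An R² this close to zero would mean that dose explains almost none of the variance in change scores.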
In a preliminary analysis, we sought to determine if baseline differences between the medication groups were present. Study participants assigned to olanzapine and risperidone did not differ on any cognitive measure at study entry (Table 3), as indicated by both nonsignificant medication group effects and post hoc t test between-group effects at baseline.
We turned to the possibility that olanzapine and risperidone had differential impact on cognitive improvement, as would be indicated by medication group × treatment week (time) interaction. No such interactions were observed for any variable (Table 3).
We then sought to determine if performance on cognitive measures improved irrespective of medication type (ie, if a treatment week main effect were present). We observed such effects for most variables (Table 3). Only fluency, digit span, CPT-IP, DMS, and the motor tasks did not demonstrate change over time.
Given the aforementioned results, we sought to determine if improvements over time were equivalent to or greater than practice effects in HCs who were also tested on 3 occasions. In these analyses, we collapsed the olanzapine and risperidone groups into a single FES group (given the dearth of differential drug effects). We examined those 9 variables that had previously demonstrated improvement over time. Significant interactions (P <.006 for all) were revealed for 3 measures (Table 4). Rate of improvement was greater in the schizophrenia group than the HC group for memory for visual designs and trail making, suggesting that drug effects were larger than practice effects. For the Mini-Mental State Examination (MMSE), a clear ceiling effect in the HC group artifactually produced an interaction. For the WCST and CVLT trials 1 through 5, improvement was greater in the HC group, but the P values of the interaction were .01 and .03, respectively, which we suggest are at trend levels. For the remaining 5 variables only a main effect of treatment week was found, suggesting that improvement was no greater than changes mediated by practice effect.
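The MMSE ceiling artifact can be illustrated with a toy calculation. In this sketch, which uses hypothetical scores rather than study data, both groups have the same true gain, but scores are capped at the MMSE maximum of 30, so the group that starts near the ceiling shows a smaller observed gain and an apparent group × time interaction.

```python
MMSE_MAX = 30  # the MMSE is scored from 0 to 30


def observed_gain(true_baseline: list[int], true_gain: int,
                  ceiling: int = MMSE_MAX) -> float:
    """Mean observed change when scores cannot exceed the test ceiling."""
    before = [min(score, ceiling) for score in true_baseline]
    after = [min(score + true_gain, ceiling) for score in true_baseline]
    return sum(a - b for a, b in zip(after, before)) / len(before)


# Hypothetical groups with an identical true gain of 2 points.
hc_gain = observed_gain([28, 29, 30], true_gain=2)   # capped by the ceiling
fes_gain = observed_gain([22, 24, 26], true_gain=2)  # room to improve
print(hc_gain, fes_gain)  # 1.0 2.0
```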
The Figure illustrates the respective slope of improvement of the FES groups and the HC group for logical memory. The magnitude of improvement was often similar among the groups, and the 2 FES groups demonstrated nearly identical patterns of change for most measures. All cognitive performances in patients with FES, irrespective of whether or not they demonstrated improvement, were significantly below those of the control group even after premorbid intellectual ability, as measured by the Wide Range Achievement Test 3, served as a covariate in the mixed-model analyses (FES group vs HC group main effects: P <.001 for all entries except 1).
The composite effect size (Cohen d) in the FES group (from baseline through week 16) for the 16 measures initially examined was 0.36. However, for the HC group the Cohen d was similar (0.33). The effect size inspection (Table 5) indicated that improvements in cognition were larger in treatment weeks 0 to 6 than 6 to 16 in both groups. (For the sake of context, we also included effect sizes of key positive, disorganized, and negative symptom dimensions in Table 5.) Performance of the groups on each test at each time point is given in eTable 1 (http://terrygoldberg.net/Documents/msfe_supplement_tables.doc).
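For readers unfamiliar with the metric, a Cohen d for change can be computed as the mean difference divided by a pooled SD. The text does not state which SD was used or how the composite was formed, so the sketch below assumes the pooled SD of the two time points and a simple average across measures; the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohen_d(baseline: list[float], followup: list[float]) -> float:
    """Cohen d for change: mean difference over the pooled SD.

    Assumption: the pooled SD is the root mean of the two sample variances.
    """
    pooled_sd = sqrt((variance(baseline) + variance(followup)) / 2)
    return (mean(followup) - mean(baseline)) / pooled_sd


def composite_d(per_test_d: list[float]) -> float:
    """Assumed composite: the unweighted mean of per-test effect sizes."""
    return mean(per_test_d)


# Hypothetical scores on one test at baseline and at week 16.
d = cohen_d([8.0, 10.0, 12.0], [9.0, 11.0, 13.0])
print(d)  # 0.5
```

Comparing the resulting d in a treated group with the d in an untreated control group tested on the same schedule is what allows a practice effect to be separated from a treatment effect.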
We sought to determine if otherwise equivalent change scores in the FES and HC groups were an artifact of differences in premorbid intellectual function (as measured on the WRAT reading subtest1,2) so that degree of change was masked in the FES group. We performed a series of linear regressions in which change scores for those variables for which a significant time effect was present (Table 4) served as dependent measures and the WRAT reading score and 2 demographic variables, sex and age, on which the HC group differed from the FES group, served as independent predictors. No independent variable entered significantly in any of these equations, suggesting that results were not due to preexisting intellectual or demographic differences that affected the slope.
We next examined the association of clinical symptoms at baseline and clinical symptom improvement (from baseline to 16 weeks) with cognitive changes in the FES group. We restricted our analyses to variables for which we had data at all 3 assessment points. Symptoms proved to be rather weak and inconsistent predictors of cognitive change, irrespective of whether the cognitive measure demonstrated changes of large magnitude or small magnitude at the group level. No R² exceeded 0.08 for any independent measure or set of measures, as can be seen in eTable 2 (available at: http://terrygoldberg.net/Documents/msfe_supplement_tables.doc).
Last, the mean modal dose of SGAs was not predictive of change in the cognitive variables used in the prior multiple regressions (ie, WCST, visual reproduction, trail making, CVLT measures, memory for stories, MMSE, digit symbol, or line orientation). P values were between .11 and .97. All R² values were below 0.09. Samples for all these analyses were generally between 31 and 39 patients.
Before discussing key results, it is important to appreciate that the sample of patients with FES studied herein appears to be representative, both in terms of demonstrating rather widespread cognitive impairments at baseline and in experiencing large reductions in positive and disorganized symptoms with antipsychotic treatment.22 In our study, we observed highly significant improvements in cognitive performance. These improvements occurred in tests of speed of processing that required psychomotor function (trail making or digit symbol), consistently in verbal and visual tests of episodic memory, and in executive functions related to set shifting.
Our results indicate that differential medication effects on cognition were small. Although these results suggest that the choice of initial SGA should not be based on presumptive cognitive advantages for one or the other drug early in schizophrenia, we recognize that many other factors can weigh in the selection of one drug or another (eg, antipsychotic efficacy). These results are consistent with several recent reports of direct comparison of olanzapine and risperidone that did not observe differential effects in long-term samples.29,30 Practice effects were not examined in these studies.
To the best of our knowledge, this is the first controlled clinical trial that includes multiple assessments of an HC group to (1) assess the presence of practice effects and their magnitude in an HC group and (2) determine if drug effects were greater than practice effects in the group of patients with FES. In 2 instances the performance gains of patients with FES exceeded the practice effects in HCs: episodic memory for visual designs and trail-making attention and speed (MMSE was not considered, since a ceiling effect was present in healthy controls). Because the effects could not be attributed to high covariation with positive or negative symptoms, these effects might be considered to represent valid cognitive enhancement. However, most variables did not demonstrate rates of improvement above and beyond practice effects: verbal episodic memory, visual spatial processing, card sorting and set shifting, or digit symbol coding speed. Several other variables did not demonstrate any change (eg, verbal fluency, digit span, CPT-IP, or DMS). It is sobering to note that the composite effect size in the FES group of 0.36 would be considered moderate and could be attributed to treatment; only when it is compared with the effect size in the HC group (0.33) does it become clear that the magnitude of the effect is in keeping with a practice-related phenomenon. Additionally, effect sizes for cognition in the group of patients with FES were larger in the 0- to 6-week period than in the 6- to 16-week period, which is consistent with the pattern of practice effects in the HC group.
Other naturalistic studies10- 12 of cohorts of patients with FES have included HCs who underwent serial cognitive testing in parallel designs. Although these studies necessarily did not control medication regimens (and therefore included first-generation antipsychotics and SGAs) and had variable numbers of drug-naive patients at baseline testing, as well as smaller sample sizes than the present study, cognitive results were strikingly comparable to those reported herein. Broadly, these studies found that patients with FES continued to demonstrate cognitive impairments of −1 to −2 SDs during 2- to 5-year periods compared with HCs. Thus, even when patients with FES made gains, so did HCs (presumably on the basis of practice effects), who thereby maintained their advantage.
Our results were also broadly consistent with a meta-analysis of SGA effects on cognition14; we found that verbal fluency and measures of simple working memory demonstrated smaller effect sizes than tests of either learning or psychomotor speed. However, in our study some of the larger effects in verbal learning and speed of processing (eg, digit symbol coding) appeared to be consistent with practice effect gains, not cognitive enhancement per se.31 The large Clinical Antipsychotic Trials of Intervention Effectiveness study,32 which involved a somewhat complicated switching design, has found smaller effects of SGAs. Thus, our study may have implications for the reinterpretation of several previous trials in which cognition improved because the magnitudes of improvement found in prior trials may not be greater than the practice effect demonstrated by the HC group in our study.
We considered other explanations besides practice effects for the cognitive improvements observed in our patients with FES. Improvements in symptoms could not account for cognitive change, although our use of statistical control methods was admittedly not optimal.5 Similarly, demographic variables and premorbid level of intellectual function did not predict slope. Moreover, the relationship of the magnitude of cognitive change scores and medication dose was close to nil in patients. Although these results further strengthen the case that change was due to practice and not induced by medication, demographic, or state variables, they do not prove it.
On the face of it, practice effects may be advantageous clinically. Many activities in daily life rely on practice or repetition for optimizing performance. However, little evidence indicates that such types of improvement will generalize to other tasks33 because a practice effect is paradigm specific (eg, familiarity with testing instructions and demands) or item specific (eg, words on a list). In the present context, practice effects may not reflect change in the compromised neurobiological function of schizophrenia that would then effect improvement in broad domains of cognition. Furthermore, practice effects will not compensate for baseline differences, since patients who start lower than controls also end lower than controls (who are also practicing) despite improvement (as can be seen in the Figure). Several lines of evidence suggest that patients with schizophrenia are capable of demonstrating a practice effect, including intact consolidation. Thus, patients do not demonstrate markedly accelerated rates of forgetting; they retain what they can encode.34,35 To the extent that some of the possible practice effects observed in patients with FES may have been mediated by procedural learning, this type of learning is thought to be relatively intact in schizophrenia.36 Additionally, practice effects might engage subtly different neural systems than those involved in initial task performance.37
The magnitude of practice effects on verbal memory and speed measures observed in the HC group is similar to that reported in the literature.38- 40 Several tests used in our study demonstrated minimal practice effects and had several commonalities: simple directions, large numbers of trials, and a restricted set of stimuli, resulting in little distinctiveness among individual instances. These tests were associated with minimal improvement in both the FES and HC groups.
Several studies in patients with FES have discerned differences in cognitive improvement between an SGA and a first-generation comparator. Beyond possible sources of bias in design, drug dose, or analysis in these trials,19 we note that (1) the effect sizes in the SGA groups in this study were in keeping with the practice effects discerned in HCs and (2) first-generation drugs might retard a practice effect.41- 43 Thus, we would suggest that these results do not indicate cognitive enhancement beyond that found in practice effects. Whether SGAs permit or attenuate the development of practice effects is an intriguing question but one that cannot be resolved fully by this study, since practice effects in the drug-free state in the FES sample cannot be assessed because of ethical issues. Similarly, it might be argued that if 1 patient group demonstrated larger cognitive changes than another after receiving an adjunctive medication, this would indicate prima facie that cognitive enhancement had occurred. However, under certain circumstances an HC group might still be necessary because of the possibility that the base antipsychotic might suppress a practice effect, whereas the adjunctive exerts a compensatory effect. Because improvement would be no greater than that seen in an HC group, it might be necessary to provide a context or metric for rationally calibrating magnitude of improvement.
Our approach to pseudospecificity was admittedly imperfect, and it is possible that general improvement in symptom status, organization, and general test-taking behavior accounted for some of the change. Such a result would not be inconsistent with our general notion that SGAs may not directly enhance cognition to the degree previously thought. Furthermore, the possibility that haloperidol at relatively high doses suppressed practice effects in some earlier studies43 that compared cognitive change induced by first-generation antipsychotics and SGAs is consistent with our argument that changes we observed may be in part due to practice effects, given the relatively low doses of SGAs used in this study (ie, doses unlikely to interfere with a practice effect).
Minimizing practice effects may not be easy. Use of a different design to reduce practice effects, such as a crossover with counterbalancing or serial testing during a lead-in period, is not without pitfalls.44 Alternate test forms may attenuate but not eliminate practice effects.45,46 It might also be possible to develop tasks based on the criteria we outlined herein (eg, multiple trials, restricted stimulus set, and high interference) that are relatively resistant to practice effects.
This study has several limitations. We could not assess practice effects in drug-free individuals because of concerns about human subjects. Thus, our results should be viewed as inferential. Although we have tried to provide compelling circumstantial evidence in favor of our account, it nevertheless remains circumstantial. Since we cannot say with absolute certainty what proportion of gain is associated with practice effects, it is probably fair to say that practice effects should at least be considered when interpreting cognitive improvement in clinical trials.
Also, the generalizability of our study to patients with more chronic conditions is unclear. We do not think that our results extend to recent and select cognitive rehabilitation programs that use drills, reinforcement, and meta-cognitive instruction, since they have sometimes resulted in generalizable improvements and vocational successes.47,48 Occult ceiling and floor effects may have been present in some of our data. Future studies might use novel item response theory to minimize such problems. Of course, item response theory–based analyses would in no way preclude findings of improvement in true scores due to exposure, test-taking strategy, and efficiency of response, as we believe might have occurred in the present study. Last, the WRAT reading standard scores (a putative measure of premorbid intellectual function) were not well matched between the HC and FES groups. We carefully addressed this in a series of multiple regressions in which reading level, age, and sex served as independent measures and cognitive change scores as dependent measures. We found no evidence that reading level, sex, and age were significant predictors of the magnitude of cognitive gain, irrespective of group. Results from such regressions are generally considered more transparent than analysis of covariance statistics.49
We wish to emphasize that individual patients may also benefit cognitively from SGAs. Determining the clinical and genetic characteristics of these individuals may be important for future research.
We did not have baseline data for several measures (eg, CPT-IP and DMS). This lack of data prevents us from commenting on change from drug-naive to treated status in the FES groups for these tests. However, we were struck by the lack of change from first exposure to second exposure in the HC group, suggesting that the tests themselves may not be prone to practice effects.
Our findings may also have implications for drug discovery and regulatory approval of new antipsychotic medications, including inclusion of an HC control group if cognitive change is being measured. In some circumstances, our findings may have implications for the design of trials that use cognitive enhancing adjunctive agents, such as cases in which it becomes important to calibrate the magnitude of change to some external metric (ie, a practice effect in an HC group). We recognize that our study cannot fully disambiguate contributions to cognitive change due to practice effects, pseudospecificity, or drug-induced cognitive enhancement. Nevertheless, we hope that our findings increase awareness of practice effects as a potential source of cognitive change in clinical trials and that our findings can be used heuristically in the development of study designs and tests that are relatively insensitive to practice-related changes. Such advances may be important for improving methods involved in the assessment of cognitive change in clinical trials.
Correspondence: Terry Goldberg, PhD, Zucker Hillside Hospital, 7559 263rd St, Glen Oaks, NY 11004.
Submitted for Publication: November 1, 2006; final revision received March 2, 2007; accepted March 19, 2007.
Financial Disclosure: Dr Goldman currently works for Pfizer. At the time the study was designed and initiated and the data were collected, he was at Zucker Hillside Hospital. Dr Kane is a consultant for Abbott, BMS, Pfizer, Janssen, and Lilly and lectures for BMS and Janssen. Dr Schooler serves on advisory boards for Janssen and BMS and has received unrestricted educational grants from AstraZeneca, BMS, Eli Lilly, Janssen, and Pfizer. Dr Robinson lectures for Janssen and has received funding for an investigator-initiated grant from Lilly.
Funding/Support: The study was supported by grants MH60004, NIDA K23 DA015541, MH41960 (The Zucker Hillside Center for Intervention Research in Schizophrenia), and RR018535 (Feinstein Institute for Medical Research General Clinical Research Center) from the National Institutes of Health.
Additional Information: The eTables are available at http://terrygoldberg.net/Documents/msfe_supplement_tables.doc.
Additional Contributions: The following individuals helped with various facets of this study: Denise Coscia, MA, Adrianna Franco, MA, Handan Gunduz-Bruce, MD, Faith Gunning-Dixon, PhD, Ali Khadivi, PhD, Beth Lorell, LCSW, Joanne McCormack, LCSW, Alan Mendelowitz, MD, Rachel Miller, LCSW, Barbara Napolitano, MS, Gail Reiter, MA, Serge Sevy, MD, and Jose Soto-Perello, MD.