Figure. Two-year change from baseline in overall scores on the National Eye Institute Visual Function Questionnaire (NEI-VFQ)3 base set of 25 vision-targeted questions for matched pairs of sham and no-treatment controls who were interviewed at 2 years (42 pairs). Box-and-whisker plots summarize the distributions. Each box contains 50% of the values in the distribution; the horizontal line in each box indicates the median and the “x” indicates the mean. Whiskers extend beyond the top and bottom of each box by 1.5 times the interquartile range. Outliers are indicated by points beyond the whiskers. Separate distributions are shown of change scores for sham controls and no-treatment controls and for paired differences in change scores (sham control change score minus matched no-treatment control change score).
Hawkins BS, Bressler NM, Reynolds SM. Patient-Reported Outcomes Among Sham vs No-Treatment Controls From Randomized Trials. Arch Ophthalmol. 2011;129(2):200-205. doi:10.1001/archophthalmol.2010.359
To compare 2-year changes from baseline scores on the National Eye Institute Visual Function Questionnaire (NEI-VFQ) between similar participants assigned to sham and no-treatment control arms in randomized clinical trials of treatment of subfoveal choroidal neovascularization secondary to age-related macular degeneration.
We retrospectively matched sham controls from a randomized trial to no-treatment controls (no sham or placebo) from another trial on 7 baseline prognostic criteria. Two-year changes in overall and subscale scores were compared using data from those who had 2-year interviews and also using the last follow-up observation carried forward to impute missing 2-year interview scores.
A match to a no-treatment control on all 7 criteria was identified for 62 of 238 sham controls. Among the 42 matched pairs of controls interviewed at 2 years, no important difference in 2-year change in NEI-VFQ scores overall or by subscale was observed. Findings were similar for the 56 matched pairs of controls who could be analyzed for 2-year changes in scores using the method of last follow-up observation carried forward.
Findings from this retrospective matched-pairs analysis suggest that sham treatment to mask patient participants in clinical trials may be unnecessary when patient-reported outcomes are of interest and standard instruments are administered by interviewers masked to treatment assignment. This analysis, together with our earlier analysis of visual acuity outcomes, questions the necessity for sham (placebo) controls in randomized clinical trials in ophthalmology when other methods to minimize outcome assessment bias are incorporated into the design.
When no effective standard treatment is available to use as the control in a randomized trial of a new treatment, a sham (placebo or dummy) treatment often is used. Whether the sham treatment is an inactive placebo for ingestion or topical application, an inactive or disabled device, an injected inactive agent, or some other mock procedure without an intended therapeutic benefit, the cost and inconvenience of developing and administering the sham may be considerable. In our view, inadequate attention has been given to research to demonstrate when sham controls are necessary in such situations and when no-treatment controls are satisfactory. We are not aware of any clinical trial in ophthalmology in which patients in the control arm were randomly assigned to a sham arm or to a no-treatment arm.
Our earlier investigation was prompted in part by a Cochrane systematic review by Hróbjartsson and Gøtzsche,1 who “were unable to detect a statistically significant overall effect of placebo [sham] intervention in trials with binary outcomes . . . or in trials with continuous outcomes reported by observers.”1(p8) In that investigation, we compared visual acuity outcomes between matched pairs of sham and no-treatment controls from randomized trials of treatment of choroidal neovascularization (CNV) secondary to age-related macular degeneration (AMD).2 Among pairs of sham and no-treatment controls matched on all 8 prognostic factors identified from the literature, 2-year visual acuity outcomes were similar. When pairs of sham and no-treatment controls were matched on fewer than 8 criteria (ie, 4-7 criteria), sham controls tended to have somewhat better 2-year visual acuity outcomes than no-treatment controls. However, we identified some possible explanations for the latter finding.
Visual acuity is a quasi-objective measurement made by a trained examiner but nevertheless requires cooperation from the patient and patience from the examiner. Measurement of patient-reported outcomes has been of value in many clinical trials in ophthalmology to assess the impact of interventions, often applied to 1 eye, on patient perceptions of visual function when taking account of both eyes. These measurements are considered to be more subjective than visual acuity measurements. Hróbjartsson and Gøtzsche1 found a “statistically significant moderate difference”1(p8) between placebo (sham) and no-treatment groups for “trials with continuous outcomes reported by patients.”1(p8) Our earlier investigation and this finding from their systematic review led us to continue our investigation by undertaking an indirect comparison of patient-reported outcomes derived from interview responses. The purpose of this article is to report comparisons of 2-year changes in scores on the 25-item National Eye Institute Visual Function Questionnaire (NEI-VFQ)3 between matched sham and no-treatment controls from 2 randomized trials4,5 of treatment of subfoveal CNV in AMD.
This investigation was reviewed and approved by one of the institutional review boards of the Johns Hopkins School of Medicine, by the executive committee of the Submacular Surgery Trials (SST),5 and by responsible parties at Genentech Inc, South San Francisco, California, through Genentech's investigator-sponsored trial procedures.
We obtained from Genentech, the industry sponsor of the Minimally Classic/Occult Trial of the Anti-VEGF Antibody Ranibizumab in the Treatment of Neovascular Age-Related Macular Degeneration (MARINA),4 and from the SST coordinating center data for all patient participants in the control arms of 2 completed randomized trials of treatments for subfoveal CNV secondary to AMD. No personal identifying information was requested or provided. In MARINA, a sham control (sham injection without a needle) was used, with the goal of masking the patient participant.4 In the SST group N trial (SST-N),5 a no-treatment (observation) control was used, so that the patient participant was unmasked.
The eligibility criteria for SST-N specified that the subfoveal neovascular lesion had to contain some classic CNV; classic CNV was not required for participation in MARINA (Table 1). Thus, we focused this retrospective matched pairs analysis on controls in the 2 trials who had AMD in the study eye, a neovascular lesion located under the geometric center of the foveal avascular zone of the study eye, and some classic CNV in the subfoveal neovascular lesion. Table 1 summarizes other characteristics and eligibility criteria of the 2 trials.
In both trials, the NEI-VFQ was administered at baseline and during follow-up by an interviewer masked to the treatment arm to which the participant had been assigned.7,8 Interviews with SST participants were administered by telephone at baseline and during the follow-up period by trained personnel located at the SST coordinating center.7 Interviews with MARINA participants were administered by trained personnel at participating clinical centers.8
The NEI-VFQ was 1 of the interview instruments administered in each trial.7,8 Each group of investigators elected to add optional items to the base set of 25 vision-targeted questions of the NEI-VFQ; however, the items added differed between the trials. Thus, we restricted our analysis to the 25-item version of the NEI-VFQ. Scores were calculated using the method provided by the developers of the NEI-VFQ.3 In both trials, the 36-Item Short Form Health Survey (SF-36)9 also was administered as part of the baseline and follow-up interviews. In place of the single-item general health subscale of the NEI-VFQ, we compared 2-year changes in scores on the physical and mental component summary scales of the SF-3610 between sham and no-treatment controls. The SF-36 physical and mental component summary scores were calculated as described elsewhere.10
We attempted to match each eligible sham control to 1 or more no-treatment controls using baseline characteristics reported to be associated with NEI-VFQ scores or changes in scores7,11: overall NEI-VFQ score (within 5 points); best-corrected visual acuity of the better-seeing eye (within 7 letters; ie, 0.14 logMAR units); presence or absence of CNV in the nonstudy eye; SF-36 physical component summary score (within 10 points); SF-36 mental component summary score (within 10 points); age (within 3 years); and sex. Methods for measuring and scoring visual acuity have been published for each trial.4,5
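The matching rule above can be sketched in code. This is a minimal illustration, not the procedure actually run for the study: the record layout and field names are hypothetical, the tolerances are treated as inclusive (an assumption), and the text does not say how a pair was chosen when a sham control had several eligible no-treatment controls, so the sketch only lists candidates.

```python
from dataclasses import dataclass

# Standard logMAR chart step per letter, so 7 letters corresponds to 0.14 logMAR.
LOGMAR_PER_LETTER = 0.02


@dataclass
class Control:
    # Hypothetical field names; the trial databases are not described at this level.
    pid: str
    vfq_overall: float       # overall NEI-VFQ score at baseline
    va_better_eye: int       # best-corrected visual acuity (letters), better-seeing eye
    cnv_nonstudy_eye: bool   # CNV present in the nonstudy eye
    sf36_pcs: float          # SF-36 physical component summary score
    sf36_mcs: float          # SF-36 mental component summary score
    age: float               # years
    sex: str


def is_match(sham: Control, obs: Control) -> bool:
    """True when the pair satisfies all 7 criteria (tolerances assumed inclusive)."""
    return (
        abs(sham.vfq_overall - obs.vfq_overall) <= 5
        and abs(sham.va_better_eye - obs.va_better_eye) <= 7
        and sham.cnv_nonstudy_eye == obs.cnv_nonstudy_eye
        and abs(sham.sf36_pcs - obs.sf36_pcs) <= 10
        and abs(sham.sf36_mcs - obs.sf36_mcs) <= 10
        and abs(sham.age - obs.age) <= 3
        and sham.sex == obs.sex
    )


def candidate_matches(sham: Control, observation_controls: list) -> list:
    """All no-treatment controls that meet every criterion for one sham control."""
    return [c for c in observation_controls if is_match(sham, c)]


# Made-up records for illustration only (not study data).
sham = Control("S1", 70.0, 60, False, 45.0, 52.0, 77.0, "F")
pool = [
    Control("N1", 73.5, 55, False, 50.0, 49.0, 79.0, "F"),  # within every tolerance
    Control("N2", 70.0, 60, False, 45.0, 52.0, 81.0, "F"),  # age differs by 4 years
]
matched_ids = [c.pid for c in candidate_matches(sham, pool)]
```

In this example only the first no-treatment record qualifies, because the second differs in age by more than 3 years.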
We compared distributions and means of 2-year changes in NEI-VFQ overall scores and in subscale scores between sham and no-treatment controls. Findings were analyzed both for participants interviewed at 2 years and for all participants with any follow-up interview by 2 years using the method of the last follow-up observation carried forward (LOCF method) to impute missing 2-year interview data.
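As an illustration of the LOCF method, the sketch below imputes a missing 2-year score with the most recent earlier follow-up score. The interview schedule (months used as dictionary keys) and the function names are assumptions made for the example. Consistent with the analysis described here, a participant with no follow-up interview gets no imputed value, because the baseline score is not carried forward.

```python
from typing import Dict, Optional


def locf_two_year_score(follow_up_scores: Dict[int, float]) -> Optional[float]:
    """Return the 2-year score, or the latest earlier follow-up score if it is missing.

    Keys are months after enrollment (hypothetical schedule); baseline is excluded.
    Returns None when there was no follow-up interview by 24 months.
    """
    eligible = {m: s for m, s in follow_up_scores.items() if 0 < m <= 24}
    if not eligible:
        return None
    return eligible[max(eligible)]


def change_from_baseline(baseline: float,
                         follow_up_scores: Dict[int, float]) -> Optional[float]:
    """Two-year change score (follow-up minus baseline) with LOCF imputation."""
    final = locf_two_year_score(follow_up_scores)
    return None if final is None else final - baseline


# Made-up scores for illustration only (not study data).
observed = change_from_baseline(72.0, {6: 70.0, 12: 65.0, 24: 60.0})  # uses month 24
imputed = change_from_baseline(72.0, {6: 70.0, 12: 65.0})             # carries month 12
excluded = change_from_baseline(72.0, {})                             # no follow-up
```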
Distributions have been displayed using box-and-whisker plots12 and summary statistics. Wilcoxon matched pairs signed rank sum tests13 were used to assess similarity of distributions and paired t tests were used to assess similarity of mean values. Linear regression models were used to evaluate the effect on 2-year changes in NEI-VFQ scores of the observed difference in best-corrected visual acuity of the worse-seeing eye at baseline.
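For readers unfamiliar with the two paired tests, the following sketch computes their test statistics from paired differences in change scores. It is a simplified illustration in Python, not the SAS procedures used for the analysis. The Wilcoxon P value uses the large-sample normal approximation without tie or continuity correction (a simplification), and the data are made up.

```python
import math
from statistics import NormalDist, mean, stdev


def paired_t_statistic(diffs):
    """Paired t statistic and degrees of freedom for H0: mean difference = 0."""
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def wilcoxon_signed_rank(diffs):
    """Sum of positive signed ranks (W+) and an approximate two-sided P value.

    Zero differences are dropped and tied absolute values share average ranks.
    """
    nonzero = [d for d in diffs if d != 0]
    n = len(nonzero)
    order = sorted(range(n), key=lambda i: abs(nonzero[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(nonzero[order[j + 1]]) == abs(nonzero[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, nonzero) if d > 0)
    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mu) / sigma
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return w_plus, p_two_sided


# Made-up 2-year change scores for 6 matched pairs (not study data).
sham_change = [-5.0, 2.0, -10.0, 0.0, 4.0, -3.0]
obs_change = [-6.0, 5.0, -4.0, -8.0, 3.0, -1.0]
diffs = [s - o for s, o in zip(sham_change, obs_change)]  # sham minus no-treatment

t_stat, df = paired_t_statistic(diffs)
w_plus, p_approx = wilcoxon_signed_rank(diffs)
```

With only 6 illustrative pairs the normal approximation is rough; it is better suited to samples of the size analyzed here.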
SAS software, version 8.2 (SAS Institute Inc, Cary, North Carolina) was used for all analyses. We did not adjust probabilities to account for multiple comparisons; probability values of .05 or less were deemed indicators of noteworthy differences between the sham and no-treatment controls.
Of the 238 sham controls, 87 had some classic CNV in the subfoveal lesion in the study eye with AMD4 and thus were eligible for this analysis. Of these 87 eligible sham controls, 62 were matched to no-treatment controls on all 7 criteria. The baseline characteristics corresponding to criteria used to select the 62 matched pairs of sham and no-treatment controls were similar in the 2 groups (Table 2), as expected. Although only the NEI-VFQ overall score was used for matching, NEI-VFQ mean subscale scores of the 2 groups of controls also were similar. The best-corrected visual acuity of the worse-seeing eye, which was not one of the matching criteria, was poorer among no-treatment controls than among sham controls (P < .001, Wilcoxon matched pairs signed rank sum test).
Thirteen of the 62 sham controls (21%) and 9 of the 62 no-treatment controls (15%) were not interviewed at 2 years. This comparison provides no evidence that masking of patient participants and clinical personnel via sham treatment played an important role in completeness of follow-up for patient-reported quality-of-life outcomes.
For analysis of observed 2-year changes in NEI-VFQ scores, data were available for 42 pairs of matched controls. Of the 13 sham controls who were not interviewed at 2 years, 2 had no follow-up interview. Of the 9 no-treatment controls in the group who were not interviewed at 2 years, 4 had no follow-up interview; 2 of these 4 no-treatment controls died within the first month after enrollment. Thus, 56 pairs of matched controls were available for analyses using the LOCF method.
The distributions of 2-year changes in overall NEI-VFQ scores are shown in the Figure for the 42 pairs of sham and no-treatment controls who were interviewed at 2 years. The distribution of the paired differences in 2-year change scores (sham control 2-year change minus matched no-treatment control 2-year change) also is displayed in the Figure. The distributions of 2-year change were similar between sham and no-treatment controls, except for somewhat more variability in change scores among sham controls. The distribution of paired differences in change scores had both a mean and median close to zero. Nearly identical distributions were obtained using the LOCF method (data not shown).
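The quantities summarized by each box-and-whisker plot in the Figure (quartiles, median, mean, whisker limits at 1.5 times the interquartile range, and outliers beyond those limits) can be computed as in the sketch below. The quartile definition is an assumption (the "inclusive" method of Python's statistics module); the definition used by the plotting software is not stated, and the values are made up.

```python
from statistics import mean, median, quantiles


def box_plot_summary(values):
    """Summary statistics for one box-and-whisker plot."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")  # assumed quartile method
    iqr = q3 - q1
    lower_limit = q1 - 1.5 * iqr  # whisker limit below the box
    upper_limit = q3 + 1.5 * iqr  # whisker limit above the box
    return {
        "q1": q1,
        "median": median(values),
        "mean": mean(values),
        "q3": q3,
        "lower_limit": lower_limit,
        "upper_limit": upper_limit,
        "outliers": [v for v in values if v < lower_limit or v > upper_limit],
    }


# Made-up change scores for illustration only (not study data).
summary = box_plot_summary([-20, -6, -4, -2, 0, 1, 3, 5, 40])
```

For these values the box runs from -4 to 3, the whisker limits are -14.5 and 13.5, and the two extreme values (-20 and 40) are flagged as outliers.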
Medians and limits of interquartile ranges (25th and 75th percentiles) of the distributions of change in NEI-VFQ scores from baseline to 2 years are listed in Table 3 for the 42 matched pairs of controls who were interviewed at 2 years and in Table 4 for the 56 matched pairs of controls with data analyzed using the LOCF method. The median paired difference in change scores for the overall (composite) NEI-VFQ was less than 2 units by each method of analysis. The mean paired difference was 0.9 (95% confidence interval [CI], −32.2 to 34.0) for only those interviewed at 2 years and 0.6 (95% CI, −33.0 to 34.1) for all who were interviewed during follow-up (LOCF method).
The largest differences in subscale change scores between the 42 sham and no-treatment controls with 2-year interviews were observed for the driving subscale (P = .06) and mental health subscale (P = .08). Interquartile ranges were wide for both subscales. The comparison for the SF-36 mental component summary scale scores (data not shown) was consistent with the NEI-VFQ mental health subscale, with a difference in 2-year change scores of 3.3 (P = .11). The mean difference in 2-year change in SF-36 physical component summary scale scores was 0.9. When the LOCF method of analysis was used, none of the differences in 2-year change scores for the NEI-VFQ subscales or the SF-36 component summary scales were noteworthy (P > .10 for all). The discrepancy between matched pairs with respect to the baseline best-corrected visual acuity of the worse-seeing eye did not influence differences in change scores between types of controls when evaluated using linear regression models.
In this retrospective comparison of 2-year patient-reported outcomes, measured using the NEI-VFQ and SF-36, in sham vs no-treatment controls who participated in randomized controlled trials of treatments for subfoveal CNV secondary to AMD, the estimated difference in outcome (placebo effect) was small or trivial among matched pairs, regardless of whether 2-year changes in scores were analyzed using data only from controls who completed 2-year interviews or using the LOCF method to impute scores for missed 2-year interviews. Our retrospective analyses among matched pairs of controls of patient-reported outcomes summarized herein and of visual acuity outcomes published earlier2 are consistent with the conclusions of Hróbjartsson and Gøtzsche,1 who found from their review of clinical trials in various medical conditions “no evidence that placebo interventions in general have clinically important effects.”1(p10) The sizes of the overall NEI-VFQ change scores in both groups of controls from both methods of data analysis were relatively small (Table 3 and Table 4), a finding consistent with the corresponding difference in 2-year change in best-corrected visual acuity of the initially better-seeing eye of approximately 1 line.14 Although the observed difference in change scores was small, the 95% confidence interval on that difference was wide, suggesting that the “true” difference in overall NEI-VFQ scores could be as large as 30 to 35 units in either direction.
It is possible that differences between the 2 randomized trials in eligibility criteria that were not involved in matching could have affected NEI-VFQ outcomes. The primary differences in eligibility criteria concerned the study eye: the amount of classic CNV in the subfoveal lesion (in MARINA, absent or <50% if present; in SST-N, present and up to 100%), lesion size restrictions (≤12 disc areas in MARINA, ≤9 disc areas in SST-N), and permission to enroll patients in MARINA, but not in SST-N, whose study eyes had received earlier thermal laser photocoagulation treatment of an extrafoveal or juxtafoveal neovascular lesion. Furthermore, eyes eligible for the MARINA trial had to have evidence of recent disease progression. Although we restricted sham controls to those with some classic CNV, we did not restrict matches to no-treatment controls in whose study eyes classic CNV accounted for less than 50% of the lesion. Thus, classic CNV likely accounted for a larger proportion of the neovascular lesion among no-treatment controls than among sham controls. This difference and differences in size of the subfoveal lesion and preenrollment photocoagulation likely were accounted for by best-corrected visual acuity. Despite these differences between the 2 trials, 2-year changes in NEI-VFQ scores, overall and by subscale, as well as 2-year changes in SF-36 summary scores, were similar. We observed a difference in best-corrected visual acuity of the worse-seeing eye at baseline; however, that difference did not influence 2-year changes in NEI-VFQ scores.
As in our previous investigation of visual acuity outcomes,2 our analysis was retrospective and limited to a single ophthalmologic diagnosis among control arm patient participants from 2 of the randomized controlled trials completed for this condition. As we noted in our earlier report, retrospective analyses cannot replace randomized comparisons of sham and no-treatment controls in clinical trials. Furthermore, only about one-quarter of the sham controls could be matched to a no-treatment control. Among the 62 pairs of matched controls, one or both members of 20 pairs missed the 2-year interview (32%); one member of each of 6 pairs never was interviewed during follow-up (10%). Thus, missing data further limited the sample size for analysis of outcomes. The small sample size also limited the types of analyses that could be used for robust imputation of scores from missed 2-year interviews.
Despite the small number of sham and no-treatment controls who could be matched for this investigation (ie, approximately one-quarter of the number of controls in each of the 2 trials), baseline characteristics and 2-year changes in NEI-VFQ scores among this subgroup were similar to those of the larger group of controls in the 2 trials from which they were identified.7,8 The 2 clinical trials from which data were obtained for this analysis were of high quality. Standard protocols were followed to assess eligibility of patient participants for each trial and to acquire the data used to measure outcomes. In both trials, at baseline and during follow-up, interviewers were masked to random assignment to control or active treatment arm. Findings from pairs of controls interviewed at 2 years did not differ substantially from those when the LOCF method was used to impute missed 2-year interview scores.
Our previous investigation2 suggested that masking of patient participants via a sham intervention may not be necessary when visual acuity outcomes are evaluated following measurements made according to a standard protocol, particularly when the visual acuity examiner is masked to random assignment. The current analysis suggests that masking of patient participants via a sham or placebo intervention may be unnecessary when patient-reported outcomes based on the NEI-VFQ and administered by interviewers masked to treatment assignment are of interest. Much more extensive investigation in larger studies and, ideally, in studies of different designs is necessary to delineate situations in which no-treatment controls may be sufficient. We encourage other researchers to undertake similar analyses using data sets from other pairs of randomized trials in which similar patients are enrolled but the types of controls differ. Because treatments currently are available that usually delay, prevent, or reverse visual acuity loss resulting from subfoveal CNV in AMD, it is unlikely that there will be an opportunity to make a randomized comparison in future trials of treatments for such patients. However, we continue to encourage sponsors and designers of clinical trials of new therapeutic and preventive interventions for ophthalmologic conditions for which no effective treatment is available to consider a design in which patients assigned to the control arm are randomized between sham and no treatment, regardless of whether visual acuity, patient-reported outcomes, or other outcomes are assessed. Evidence from multiple trials with internally randomized control arms would help to delineate the circumstances that require sham controls to minimize bias in estimating treatment effects, along with the risk, cost, deception, and inconvenience they entail, and those in which no-treatment controls are acceptable. 
Until such a body of evidence is available to inform decisions regarding selection of the control by researchers and by regulatory agencies that approve clinical trial protocols and new interventions, we encourage designers of clinical trials to consider all scientific, ethical, logistic, and financial aspects of masking of clinical trial participants before deciding whether a sham control or a no-treatment control, together with masked outcome assessment and other protections against bias, is the better method to achieve trial objectives.
Correspondence: Barbara S. Hawkins, PhD, Wilmer Clinical Trials and Biometry, 550 N Broadway, Room 930, Baltimore, MD 21205-2010.
Submitted for Publication: September 9, 2009; final revision received April 2, 2010; accepted May 20, 2010.
Author Contributions: All authors had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Financial Disclosure: Dr Bressler is principal investigator of grants to the Johns Hopkins University School of Medicine that are sponsored by Genentech Inc. Such grants are negotiated and administered by the Johns Hopkins University School of Medicine through the Office of Research Administration. Under School of Medicine policy, support for the costs of research administered by the institution does not constitute a financial conflict of interest.
Funding/Support: This investigation was supported in part by the Karl P. Hagen Professorship in Ophthalmology (Dr Hawkins), the James P. Gill Professorship in Ophthalmology (Dr Bressler), a grant from Genentech Inc to the Johns Hopkins University School of Medicine, the Retina Division Research Fund of the Wilmer Eye Institute, and an unrestricted grant to the Wilmer Eye Institute from Research to Prevent Blindness, New York, New York.
Role of the Sponsors: MARINA was sponsored by Genentech Inc and is registered at http://www.clinicaltrials.gov (NCT00056836). The SSTs were sponsored by the National Eye Institute, National Institutes of Health, and the US Department of Health and Human Services and are registered at http://www.clinicaltrials.gov (NCT00000150).
Additional Contributions: The MARINA sham control database was provided by Genentech Inc. Permission to use the no-treatment control database was provided by the SST archives committee.