Figure 1. Patient inclusion flowchart. ACS indicates acute coronary syndrome; GRACE, Global Registry of Acute Coronary Events6; LBBB, left bundle-branch block; NSTEMI, non–ST-segment elevation myocardial infarction; STE, ST-segment elevation; RCT, randomized clinical trial.
Figure 2. Selected hospital outcomes and postdischarge to 6-month mortality. Unadjusted P<.001 across the 3 groups for all outcomes. RCT indicates randomized clinical trial. Reported P values adjusted for variables in the GRACE (Global Registry of Acute Coronary Events) risk model10 plus reperfusion and delay from symptom onset to arrival at the hospital.
Figure 3. Crude and adjusted odds ratios (95% confidence intervals) for hospital death comparing eligible patients vs randomized clinical trial (RCT) participants. *Adjusted for GRACE (Global Registry of Acute Coronary Events) risk score,10 use and type of reperfusion (primary percutaneous coronary intervention only, fibrinolytic only, both, or neither), and delay from symptom onset to hospital admission (<2, 2-5.9, and ≥6 hours); P = .38 for reperfusion by trial type interaction in eligible patients vs RCT participants. †Adjusted for the GRACE risk score.
Steg PG, López-Sendón J, Lopez de Sa E, Goodman SG, Gore JM, Anderson FA, Himbert D, Allegrone J, Van de Werf F. External Validity of Clinical Trials in Acute Myocardial Infarction. Arch Intern Med. 2007;167(1):68-73. doi:10.1001/archinte.167.1.68
Copyright 2007 American Medical Association. All Rights Reserved. Applicable FARS/DFARS Restrictions Apply to Government Use.
Patients enrolled in randomized clinical trials (RCTs) may not reflect those seen in real-life practice. Our goal was to compare patients eligible for enrollment but not enrolled in contemporary RCTs of reperfusion therapy with patients who would have been ineligible and also with patients with acute myocardial infarction (AMI) participating in RCTs.
Consecutive patients with AMI (n = 8469) enrolled in the GRACE registry (Global Registry of Acute Coronary Events) were divided into 3 groups: RCT participants (11%; n = 953), eligible nonenrolled patients (55%; n = 4669), and ineligible patients (34%; n = 2847). Our main outcome measures were hospital mortality rates.
Based on baseline characteristics or GRACE risk-score distribution, RCT participants had the lowest a priori risk of death; eligible patients had a higher risk; and ineligible patients had the highest risk. Actual hospital mortality showed a similar gradient (3.6%, 7.1%, and 11.4%, respectively) (P<.001). Multivariable analysis adjusting for baseline risk, use and type of reperfusion therapy, and delay from symptom onset to admission consistently showed a higher mortality rate for eligible nonenrolled patients than for RCT participants (odds ratio, 1.61; 95% confidence interval, 1.06-2.43; and odds ratio, 1.97; 95% confidence interval, 1.24-3.13, respectively).
Patients with AMI participating in RCTs have a lower baseline risk and experience lower mortality than nonenrolled patients, even when they are trial eligible. This difference is not entirely explained by differences in baseline risk, use and type of reperfusion therapy, and/or delays in presentation. Caution is necessary when extending the findings obtained in RCTs to the general population with AMI.
Randomized clinical trials (RCTs) provide the foundation for evidence-based medicine. The treatment of patients with acute myocardial infarction (AMI) has advanced dramatically through a series of large international RCTs evaluating the benefits and risks of new therapeutic strategies, and these advances have led to the codification of treatment through practice guidelines.1,2 The relevance of RCTs to clinical practice may be hampered by doubts regarding their external validity,3 particularly because they tend to recruit highly selected populations that may not be representative of patients encountered in everyday practice.4
The determinants of external validity of RCTs are numerous.3 Among these, both the selection and the characteristics of patients are important. Therefore, enrollment of very large numbers of patients in “megatrials” would hopefully result in less selection bias and increased representativeness,3 as has been the case in some trials.5 However, even large randomized megatrials may select nongeneralizable populations. It remains unclear whether patients enrolled in such trials accurately represent individuals with similar conditions routinely encountered in practice. Few data are available to compare trial participants with trial-eligible or trial-ineligible patients in the same setting over a similar period.
The aim of the present study was to use a multinational, contemporary cohort to compare patients eligible for enrollment but not enrolled in any of 3 large RCTs of coronary reperfusion therapy with patients who would have been ineligible for enrollment and also with patients with AMI who were participating in RCTs.
The Global Registry of Acute Coronary Events (GRACE) is a prospective, multinational study of patients hospitalized with a suspected acute coronary syndrome (ACS); the study design has been described previously.6 Included patients had to be 18 years or older, hospitalized for ACS, and have at least 1 of the following: electrocardiographic changes consistent with ACS, serial increases in cardiac biomarkers of necrosis, and/or documentation of significant coronary disease. A standardized case report form was used to collect information on patient demographic and treatment characteristics and hospital and 6-month outcomes. Electrocardiographic criteria for determining the presence of ST-segment elevation or left bundle-branch block were abstracted from the case report form.
The present analysis involved patients with a confirmed diagnosis of ST-segment elevation AMI. A total of 96 hospitals in 14 countries in Europe, North and South America, Australia, and New Zealand contributed data to this analysis. Patients transferred from other hospitals and those with a discharge diagnosis other than AMI were excluded. From the onset of the registry, patient participation in RCTs was captured prospectively. Patients were divided into 1 of the 3 following groups:
Current participants in an RCT;
Patients not enrolled in an RCT but meeting the eligibility criteria for 1 or more of the following contemporary trials: Assessment of the Safety and Efficacy of a New Thrombolytic Regimen (ASSENT) 37; Global Utilization of Streptokinase and Tissue-Plasminogen Activator for Occluded Coronary Arteries (GUSTO) V8; and Danish Multicenter Randomized Trial on Thrombolytic Therapy vs Acute Coronary Angioplasty in Acute Myocardial Infarction (DANAMI) 29; and
Patients not enrolled in an RCT and determined to be ineligible for participation in any of these 3 trials according to the entry criteria.
Patient eligibility was determined sequentially for each of the 3 trials. If a patient was judged eligible to participate in at least 1 RCT, that patient was deemed eligible for our study. If the patient failed to meet the criteria for any of the 3, he or she was deemed ineligible. The main eligibility criteria for the ASSENT-3, GUSTO V, and DANAMI-2 trials are summarized in the GRACE Appendix, along with the enrollment criteria that were unavailable from the GRACE data set (http://www.outcomes-umassmed.org/GRACE/bibliography.cfm). To adjust for differences in baseline risk between the patient groups, the population was divided according to tertile of risk based on the GRACE model for hospital mortality.10 The model is based on admission variables, with a potential risk score range from 0 to 372.
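The grouping rule and the tertile split described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's actual code: the function names, the per-trial eligibility flags, and the rank-based tertile rule are assumptions, since the article does not state how tertile boundaries or ties were handled.

```python
from typing import Dict, List

# Trial names as given in the text; the flags passed in are hypothetical inputs.
TRIALS = ("ASSENT-3", "GUSTO V", "DANAMI-2")


def classify(in_rct: bool, meets_criteria: Dict[str, bool]) -> str:
    """Return 'participant', 'eligible', or 'ineligible' for one patient."""
    if in_rct:
        return "participant"
    # Eligible if the entry criteria of at least 1 of the 3 trials are met.
    if any(meets_criteria.get(trial, False) for trial in TRIALS):
        return "eligible"
    return "ineligible"


def tertile_labels(scores: List[float]) -> List[int]:
    """Assign each risk score to tertile 1, 2, or 3 by rank (assumed rule)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * 3 // n + 1
    return labels
```

For example, a nonenrolled patient who meets only the GUSTO V criteria is classified as eligible, and a nonenrolled patient who meets none of the 3 sets of criteria is classified as ineligible.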
Data are summarized as medians or percentages, as appropriate. Comparison between the 3 study groups was based on the χ2 test for categorical variables and the Kruskal-Wallis test for continuous variables. Data were further stratified according to tertile of the GRACE risk score to evaluate differences in hospital mortality among groups at similar levels of baseline risk. The GRACE model is a strong predictor of in-hospital mortality and has been validated both internally in the GRACE registry as well as externally in the GUSTO IIb database in large cohorts (C statistics of 0.83 in the derived GRACE database, 0.83 in the confirmation GRACE data set, and 0.79 in the GUSTO IIb database).10 This risk model was subsequently confirmed by independent investigators to provide the best predictive accuracy compared with the Platelet Glycoprotein IIB/IIIA in Unstable Angina: Receptor Suppression Using Integrilin Therapy (PURSUIT) and Thrombolysis in Myocardial Infarction (TIMI) risk score models (although all 3 models provided good performance).11
Multivariable logistic regression analysis was used to evaluate the relationship between RCT groups and hospital death while adjusting for significant baseline predictors of risk according to the GRACE risk model (variables were age, Killip class, systolic blood pressure, ST-segment deviation, cardiac arrest at admission, serum creatinine level, elevated cardiac markers, and heart rate).10 An additional analysis used variables in the GRACE risk as well as the use and type of reperfusion therapy and the delay from onset of symptoms to admission. Odds ratios (ORs) with 95% confidence intervals (CIs) are reported for all pairwise comparisons. All analyses were performed using SAS software (version 9.1; SAS Institute, Cary, NC).
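For readers less familiar with the method, the general form of such a model can be written as follows. This is a generic sketch under stated assumptions, not the authors' exact specification: the choice of RCT participants as the reference group, the indicator coding, and the symbols $\beta_E$, $\beta_I$, $\boldsymbol{\gamma}$, and $\mathbf{x}_i$ are illustrative. Here $\mathbf{x}_i$ stands for the adjustment covariates of patient $i$ (the GRACE risk variables, with reperfusion and delay added in the additional analysis).

```latex
\begin{align}
\operatorname{logit}\,\Pr(\text{death}_i)
  &= \beta_0
   + \beta_E\,\mathbb{1}[\text{eligible}_i]
   + \beta_I\,\mathbb{1}[\text{ineligible}_i]
   + \boldsymbol{\gamma}^{\top}\mathbf{x}_i, \\
\mathrm{OR}_{\text{eligible vs RCT}} &= e^{\beta_E}, \qquad
95\%\ \mathrm{CI} = \exp\!\bigl(\hat{\beta}_E \pm 1.96\,\mathrm{SE}(\hat{\beta}_E)\bigr).
\end{align}
```

Under this coding, an adjusted OR greater than 1 for eligible patients means higher odds of hospital death than in RCT participants with the same covariate values.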
This analysis is based on data from 43 140 participants with suspected ACS enrolled in the GRACE registry between April 1999 and June 2004. After exclusion of patients with a final diagnosis other than ACS, patients transferred from other hospitals, and patients with missing data on RCT participation, a total of 8469 patients with a confirmed diagnosis of ST-segment elevation AMI were admitted directly to GRACE hospitals (Figure 1). Of these, 11.3% were participants in an RCT during the index hospitalization, 55.1% fulfilled the eligibility criteria for enrollment in at least 1 of the 3 RCTs but were not enrolled, and 33.6% of patients did not satisfy the inclusion criteria for any of the trials.
The baseline clinical characteristics of the 3 groups differed markedly (Table 1). Consistently higher-risk characteristics were observed in patients not enrolled in RCTs in comparison with RCT participants. Among patients not enrolled in an RCT, higher-risk characteristics were seen in ineligible vs eligible individuals. These characteristics included older age; more frequent history of myocardial infarction, transient ischemic attack or stroke, peripheral arterial disease, and coronary artery bypass grafting; greater prevalence of risk factors for atherosclerosis (except for hyperlipidemia and smoking); and a greater frequency of heart failure and cardiac arrest at initial presentation. Other important differences existed between groups with respect to the characteristics of the hospitals in which they were treated and the time of presentation (Table 1).
The hospital management of patients differed across the 3 groups: RCT participants were far more likely to undergo angiography and percutaneous coronary intervention than patients not enrolled in an RCT (Table 2). Participants in an RCT were more likely to receive aspirin and β-blockers and to undergo reperfusion therapy than nonenrolled patients (74% vs 57%) (P<.001), with similar use of primary percutaneous coronary intervention (21% vs 23%) (P = .16) and more frequent use of fibrinolytic therapy (43% vs 30%) (P<.001). Among nonenrolled patients, similar relationships were seen between eligible and ineligible patients, with the latter being far less likely to receive reperfusion therapy (69% vs 37%) (P<.001) (Table 2).
The distribution of risk according to tertile of GRACE score for hospital mortality is summarized in Table 3. Participants in an RCT had a lower risk distribution than nonenrolled patients. Likewise, among nonenrolled patients, eligible patients had a lower risk distribution than ineligible patients. The median scores (on a scale from 51.8 to 344.0) for RCT participants, eligible patients, and ineligible patients were 138.1, 140.7, and 151.1, respectively.
Hospital outcomes (mortality, ventricular fibrillation or cardiac arrest, and cardiogenic shock) and 6-month postdischarge mortality (for hospital survivors) were consistently higher among ineligible patients, intermediate in eligible patients, and lowest in RCT participants (Figure 2). Hospital mortality was 3.6% in RCT participants, 7.1% in eligible patients, and 11.4% among ineligible patients (P<.001) (OR, 2.07; 95% CI, 1.44-2.97 for eligible patients vs RCT participants; OR, 1.68; 95% CI, 1.43-1.98 for ineligible vs eligible patients).
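As a rough arithmetic check, the crude ORs can be approximated from the rounded mortality percentages alone. The sketch below is illustrative and is not the study's analysis: the reported ORs were computed from patient-level data, so small differences from these approximations are expected, and the function and variable names are assumptions.

```python
def odds(p: float) -> float:
    """Convert a proportion to odds."""
    return p / (1.0 - p)


def odds_ratio(p_a: float, p_b: float) -> float:
    """Crude odds ratio of group A relative to group B."""
    return odds(p_a) / odds(p_b)


# Hospital mortality as reported in the text (rounded percentages).
mortality = {"rct": 0.036, "eligible": 0.071, "ineligible": 0.114}

or_eligible_vs_rct = odds_ratio(mortality["eligible"], mortality["rct"])
or_ineligible_vs_eligible = odds_ratio(mortality["ineligible"], mortality["eligible"])

print(f"eligible vs RCT: {or_eligible_vs_rct:.2f}")
print(f"ineligible vs eligible: {or_ineligible_vs_eligible:.2f}")
```

The first ratio comes out near 2.05, close to the reported 2.07, and the second near 1.68. A value of 1.68 arises when the odds of death in ineligible patients are divided by those in eligible patients.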
There was a consistent gradient of mortality in each risk tertile, which was lowest in patients participating in RCTs, intermediate in eligible patients, and highest in ineligible patients (Table 3). Given the importance of age and sex in the use and outcomes of reperfusion therapy in AMI, we examined hospital mortality across age and sex categories (Table 3) and found the same gradients of mortality. As expected in each group, older patients and women had higher hospital death rates than younger patients and men, respectively. But in each age or sex category, mortality was lowest in RCT participants, intermediate in eligible patients, and highest in ineligible patients.
Multivariable analysis was performed to adjust for baseline risk (GRACE risk model10), use and type of reperfusion therapy, and delay from symptom onset to admission. Regardless of the adjustment type, the ORs for hospital mortality were consistently higher for eligible patients than for RCT participants (Figure 3). Similar results were found when eligibility was assessed separately for each of the 3 trials analyzed (data not shown).
Important concerns have been raised regarding the external validity of RCTs.3 The representativeness of patients enrolled in RCTs is a particular issue, especially with respect to the inclusion of women and elderly individuals.10 Indeed, these are 2 examples of large subsets of patients in whom therapies proven effective in RCTs involving younger patients or men have not necessarily been found to yield the same benefit.12,13 While it is generally accepted that patients who are ineligible for enrollment have a worse baseline risk and outcomes than patients enrolled in RCTs,3,14,15 few data are available to assess the outcomes of nonenrolled eligible patients. It is usually assumed that, while RCTs enroll a highly selected population of patients, the outcomes of RCTs can be extrapolated to real-life patients who are not enrolled but fulfill the main inclusion and exclusion criteria of RCTs. A retrospective comparison of elderly patients enrolled in the GUSTO trial with patients from the Cooperative Cardiovascular Project (CCP) and the National Registry of Myocardial Infarction (NRMI) found that, despite baseline characteristics denoting a higher risk, eligible patients in the NRMI and CCP experienced a lower mortality than patients in the trial.16 An additional systematic review of comparisons between the population-based CCP survey data and data from the GUSTO I trial found that, despite differences regarding absolute rates of specific processes or outcome measures, data regarding treatment variations and risk factors for outcomes were generalizable to community patients.17 In addition, current trends toward the performance of ever larger megatrials stem not only from the need for larger sample sizes to demonstrate benefit but also from the desire to enroll patients who would be more representative of those treated in practice.
Indeed, the proportion of eligible patients in the present analysis was relatively high, reflecting the fact that the 3 RCTs considered were pragmatic in their enrollment strategy. Yet the present analysis suggests that, even in pragmatic trials, there remain important differences in baseline characteristics, baseline risk, and outcomes between eligible patients and participants. In fact, hospital mortality was doubled in unenrolled eligible patients compared with RCT participants (7.1% vs 3.6%), and this difference persisted after adjustment for baseline risk. It is unknown whether the benefits of therapies demonstrated in RCT participants can be extended safely to populations so markedly different in terms of their risk and outcomes. Approximately 1/3 of this large registry population would have been ineligible for enrollment in the 3 RCTs of reperfusion therapy. These ineligible patients differ in terms of their baseline characteristics from patients enrolled in RCTs and from those eligible for enrollment, and have consistently worse baseline characteristics, risk, and outcomes.
The improved risk-adjusted outcomes in RCT participants compared with eligible patients have several interpretations. They may be due to the beneficial impact of the experimental interventions and therapies tested in the RCTs. They might also result from the closer medical attention and overall better care being provided to RCT participants. In support of the latter hypothesis, we found a gradient in the use of evidence-based therapies (aspirin, β-blockers, and reperfusion therapy) and of revascularization from RCT participants to eligible and ineligible patients. Another potential explanation for the differences in risk-adjusted outcomes between eligible patients and RCT participants is the presence of residual confounding variables. Adjustment for baseline differences can be performed only for known confounders, but the frequency of unknown confounders is an important reason to perform RCTs. Interestingly, there were also differences in the hospital characteristics and in the time of presentation among the 3 groups that may play a role in the risk-adjusted outcomes because these factors are not taken into consideration in the risk calculation schemes.
The strength of the present analysis stems from the analysis of a single, large, multinational cohort in which eligibility and participation in RCTs, baseline characteristics, treatment, and outcomes were assessed prospectively and in a consistent fashion, as opposed to comparisons between trial participants and patients in enrollment logs. Outcome differences persisted after adjustment using the GRACE risk score10 and also after additional adjustment for use and type of reperfusion therapy and delay from symptom onset to admission. These results confirm and extend previous analyses, directly comparing AMI patients receiving fibrinolysis in the context of an RCT with patients receiving fibrinolysis but who were not enrolled.4,18-21 Unlike some previous analyses,20 all participants were treated in the same hospitals, by the same teams, and over the same period, and the study allows assessment of RCT eligibility (as opposed to fibrinolysis eligibility) and comparison of patients who were eligible with ineligible patients. Our results appear at odds with the recent suggestion from a systematic review that participating in RCTs neither harms nor benefits participants compared with receiving similar treatment outside of these trials22; however, our single cohort is much larger than any of the studies previously reviewed, and the value of a systematic review in this context is limited by the fact that representativeness of RCT samples may need to be evaluated on a trial-by-trial basis.3,23
This analysis should not be viewed as detracting from the value of RCTs, as opposed to less rigorous observational studies, but rather as providing an example of the lingering problem with the generalizability of RCTs. The example of the recent mega-RCTs in AMI is demonstrative: because eligible patients differ markedly from trial participants and because 1/3 of patients are ineligible, with completely different baseline characteristics and outcomes, the outcomes of RCTs should always be extrapolated with caution to real-life patients. This is consistent with the observation that mortality is markedly higher in unselected cohorts of patients with AMI than in RCTs.24 In addition, risk models derived from RCT populations may misrepresent the actual risk of adverse outcomes or be applicable solely to patients closely mimicking the characteristics of trial patients.25 Models derived from less-selected populations are likely to be more generalizable to clinical practice.26 Likewise, trial event rate estimates derived from registries may overestimate the event rates seen in RCTs, which are usually lower than those observed in registries. However, our analysis does not necessarily imply that the treatment effect tested in RCTs will be directionally different or even smaller in eligible patients from the general population. (It may actually be greater given the higher baseline risk.)
While the GRACE case report form allows us to capture detailed clinical and paraclinical data regarding patient characteristics, trial eligibility, treatment, and outcomes, some of the eligibility criteria for the trials were not included in the form (GRACE Appendix http://www.outcomes-umassmed.org/GRACE/bibliography.cfm). Therefore, eligibility for the RCTs may have been overestimated, and the discrepancy among RCT participants and eligible and ineligible patients may be greater than described here. We did not capture information about the RCTs in which GRACE patients participated, and these were not necessarily trials involving reperfusion therapy. Finally, the findings in this analysis of RCTs in AMI do not necessarily apply to other therapeutic areas.27
In conclusion, although treated in the same group of hospitals over the same period, patients with AMI enrolled in RCTs differed markedly in terms of their baseline characteristics, hospital treatment, and outcomes from patients who would have been eligible for inclusion in recent mega-RCTs investigating reperfusion therapy and from ineligible patients. Caution is necessary when extending the findings obtained in RCTs to the general population. Continued efforts are needed to improve the external validity of RCTs, including simplification of enrollment criteria and inclusion of patients seen in routine practice.28
Correspondence: Philippe Gabriel Steg, MD, Department of Cardiology, Hôpital Bichat-Claude Bernard, Assistance Publique-Hôpitaux de Paris, 46 rue Henri Huchard, 75877 Paris CEDEX 18, France (firstname.lastname@example.org).
Accepted for Publication: September 22, 2006.
Author Contributions: Dr Steg had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. Study concept and design: Steg, López-Sendón, Lopez de Sa, Goodman, Gore, Anderson, Himbert, and Van de Werf. Acquisition of data: Steg, López-Sendón, Lopez de Sa, Goodman, Gore, Himbert, and Van de Werf. Analysis and interpretation of the data: Steg, López-Sendón, Goodman, Gore, Anderson, Himbert, Allegrone, and Van de Werf. Drafting of the manuscript: Steg. Critical revision of the manuscript for important intellectual content: Steg, López-Sendón, Lopez de Sa, Goodman, Gore, Anderson, Himbert, Allegrone, and Van de Werf. Statistical analysis: Allegrone. Administrative, technical or material support: Steg, López-Sendón, Goodman, Gore, Anderson, and Van de Werf. Study supervision: Steg, López-Sendón, Goodman, Gore, Anderson, and Van de Werf.
Financial Disclosure: None reported.
Funding/Support: The GRACE registry is supported by an unrestricted educational grant from sanofi-aventis, Paris, France, to the Center for Outcomes Research, University of Massachusetts Medical School, Worcester.
Group Information: A complete list of the GRACE Investigators can be found at http://www.outcomes-umassmed.org/grace.
Role of the Sponsor: Sanofi-aventis had no involvement in the collection, analysis, or interpretation of data; in the writing of the manuscript; or in the decision to submit the article for publication. The design, conduct, and interpretation of the GRACE data are undertaken by an independent steering committee.
Previous Presentation: This article was presented in part at the American Heart Association Scientific Sessions; November 2004; New Orleans, La.
Acknowledgment: We are indebted to Philippe Ravaud, MD, for helpful comments on the interpretation of our findings. We thank the physicians and nurses participating in GRACE and Sophie Rushton-Smith, PhD, for editorial assistance.