Scatterplot showing the assumed vs observed corresponding noninferiority relative risks. Of the 56 trials included, 43 (76.8%) overestimated the final event rate in their assumptions.
The corresponding assumed relative noninferiority margins for each of the 57 trials that included absolute trial comparisons are represented by gray dotted lines. The corresponding observed relative noninferiority margins are represented by red or green dotted lines for comparisons with overestimation or underestimation of control event rates, respectively. The CI of the estimate corresponds to the α at which the end point was evaluated (ie, a 95% CI for end points evaluated using a 1-sided α of .025). Error bars may represent 90% (if 1-sided α = .05) or 95% CI (if 1-sided α = .025). BES indicates biolimus-eluting stent; BP, biodegradable polymer; DP, durable polymer; EES, everolimus-eluting stent; SES, sirolimus-eluting stent; ZES, zotarolimus-eluting stent.
A 2×2 table showing whether noninferiority was met using the trial's absolute margin and when using a corresponding assumed relative noninferiority margin. Among trial comparisons that met noninferiority with an absolute noninferiority margin, 17 of 50 trials (34.0%) would not have met noninferiority with a corresponding intended relative noninferiority margin. None of the trials that did not meet noninferiority using an absolute margin would have met noninferiority using their assumed relative noninferiority margin.
Most noninferiority coronary stent trials used absolute noninferiority margins and overestimated the control event rate. As a consequence, the absolute margin becomes disproportionally large, corresponding to a wider relative risk than originally assumed.
eTable 1. Search Strategy – PubMed/MEDLINE
eTable 2. Collected Data of Included Clinical Trials and Endpoints
eAppendix. Study Protocol
eFigure 1. PRISMA Diagram Summarizing the Literature Review
eFigure 2. Simulated Sample Sizes to Obtain 90% Power in Non-inferiority Trials
Simonato M, Ben-Yehuda O, Vincent F, Zhang Z, Redfors B. Consequences of Inaccurate Assumptions in Coronary Stent Noninferiority Trials: A Systematic Review and Meta-analysis. JAMA Cardiol. 2022;7(3):320–327. doi:10.1001/jamacardio.2021.5724
What are the implications of inaccurate event rate assumptions on conclusions of noninferiority coronary stent trials?
Among 58 trials included in this systematic review and meta-analysis, event rates were overestimated by a median of 28%; in trials with absolute margins (approximately 95%), overestimation led to more permissive noninferiority margins than originally assumed on the relative scale. When reevaluated using a corresponding relative noninferiority margin, 34% of trial comparisons that met noninferiority using the absolute margin failed to meet noninferiority.
Absolute noninferiority margins often represent disproportionally large relative risks owing to overestimation of control group event rates that inform the noninferiority margin at the trial design stage.
The outcome and interpretation of noninferiority trials depend on the magnitude of the noninferiority margin and whether a relative or absolute noninferiority margin is used and may be affected by imprecision in event rate estimation.
To assess the consequence of imprecise event rate estimations on interpretation of peer-reviewed randomized clinical trials.
PubMed/MEDLINE was searched for articles published between January 1, 2015, and April 30, 2021.
Noninferiority randomized clinical trials of coronary stents published in selected journals with clinical events as the primary end point.
Data Extraction and Synthesis
Two reviewers (M.S. and F.V.) independently extracted data on trial characteristics, noninferiority assumptions, primary end point clinical outcomes, and study conclusions. Overestimation or underestimation of the control event rate was evaluated by dividing the assumed control event rate by the observed control event rate. For noninferiority end points with absolute margins, the assumed corresponding relative margin was defined as the ratio of the absolute margin and the assumed event rate, and the observed corresponding relative margin as the ratio between the absolute margin and the observed event rate in the control arm. Noninferiority comparisons with absolute margins were reanalyzed using the assumed corresponding relative margin and the Farrington-Manning score test for relative risk.
Main Outcomes and Measures
Overestimation or underestimation, assumed and observed corresponding relative margins, and relative reanalysis of the primary end points of trials with absolute margins.
A total of 106 989 patients from 58 trials were included. The event rate in the control arms was overestimated by a median (IQR) of 28% (2%-74%). Most noninferiority trials used absolute rather than relative margins (55 of 58 trials [94.8%]). Owing to overestimation, absolute noninferiority margins became more permissive than originally assumed (median [IQR] of observed relative noninferiority margin, 1.62 [1.50-1.80] vs assumed relative noninferiority margin, 1.47 [1.39-1.55]; P < .001). Among trial comparisons that met noninferiority with an absolute noninferiority margin, 17 of 50 trials (34.0%) would not have met noninferiority with a corresponding assumed relative noninferiority margin.
Conclusions and Relevance
In this systematic review and meta-analysis, assumed event rates were often overestimated in noninferiority coronary stent trials. Because most of these trials use absolute margins to define noninferiority, such overestimation results in excessively permissive relative noninferiority margins.
The use of noninferiority trial designs is increasing across multiple disciplines.1,2 Noninferiority trial designs are useful when a new treatment is expected to be as efficacious as a current treatment (rather than more efficacious) but offers other advantages, such as a better safety profile, lower cost, or wider accessibility.3-5
Whereas noninferiority trial designs are useful in determining that a new therapy performs at least similarly to an established therapy, there are additional aspects that need to be considered when interpreting the results of a noninferiority trial.3,6 Arguably the most important parameter in a noninferiority trial is the noninferiority margin. The noninferiority margin represents how much worse than the control treatment the novel treatment can be and still be considered acceptable (ie, noninferior). A noninferiority margin can be absolute (ie, a preestablished absolute risk increase between 2 interventions) or relative (ie, a certain preestablished relative risk or hazard ratio). Whereas relative noninferiority margins correspond to the same relative risk or hazard ratio irrespective of the magnitude of the observed event rate in the trial and are more generalizable, an absolute margin will correspond to a larger relative risk (or hazard ratio) if the observed event rates are lower than expected.5,7 If the observed and assumed event rates in a trial differ substantially, the interpretation of a trial that uses an absolute margin may become difficult. Noninferiority trial designs are commonly used to evaluate the performance of new coronary stents compared with already approved stents,8-10 allowing for a case study to evaluate this design modality.
A systematic review of the literature was performed to identify randomized clinical trials in interventional cardiology, according to the principles of the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) reporting guideline.11 The following criteria were used to build the search strategy: (1) the population included patients with coronary artery disease; (2) the interventions studied were coronary stents used in an interventional cardiology setting; (3) the controls were active, as determined by the investigators; (4) outcomes were categorical clinical events; and (5) studies were randomized clinical trials (designed to test for noninferiority of the intervention vs the control) published in JAMA, The Lancet, The New England Journal of Medicine, The BMJ, Circulation, European Heart Journal, Journal of the American College of Cardiology, JAMA Cardiology, Circulation: Cardiovascular Interventions, JACC: Cardiovascular Interventions, or EuroIntervention. All articles from these journals were published in English. There were no geographic restrictions. Studies that used a bayesian trial design were excluded.
The PubMed/MEDLINE database was searched for articles published between January 1, 2015, and April 30, 2021. Trials originally published before January 1, 2015, were considered eligible for inclusion if the literature search identified novel subanalyses or follow-up studies published during the inclusion period. In such cases, the original study was included in the analysis and can be identified among studies found indirectly in the PRISMA diagram. Two reviewers (M.S. and F.V.) independently and systematically reviewed the studies in the database, screened abstracts, and confirmed eligibility through full-text assessment. A third reviewer (B.R.) made the final decision in the case of disagreement. The detailed search strategy is presented in eTable 1 in the Supplement.
Baseline characteristics for the study cohort were collected for all selected trials. The following information was collected for all individual primary end points in the trials: end point definition, timing of the analysis, whether an adaptive design was used, and whether the trial was stopped early. The following details regarding the assumptions underlying the sample size calculations for each powered end point were collected: expected event rate in the control group, expected event rate in the treatment group, magnitude of the noninferiority margin, whether an absolute or relative noninferiority margin was used, and the calculated power and sample size for the trial. The observed event rates in the control and treatment groups and the reported outcome of the statistical hypothesis test (ie, whether the trial met noninferiority) were also collected. Industry-sponsored studies were defined as those in which the ClinicalTrials.gov registration page of the trial listed a manufacturer as a primary sponsor. Studies were considered to be US studies if they included patients from a US hospital. Discrepant trial comparisons were defined as those that were declared noninferior by investigators using the original absolute margin analysis, but that did not meet noninferiority in the corresponding assumed relative margin analysis.
To determine the degree of disparity between the estimated and actually observed event rates in the trials, the ratio between the assumed and observed control event rates was calculated. For a trial with overestimation, this ratio would be greater than 1 (as the assumed rate is the numerator and the observed rate is the denominator). Conversely, for a trial with underestimation, this ratio would be less than 1. For noninferiority trials using absolute noninferiority margins, the influence of overestimation and underestimation of the event rate in the control group on the observed corresponding relative margin (ie, the relative risk corresponding to noninferiority) was evaluated as follows:
Corresponding Relative Margin = (Absolute Margin + Control Event Rate)/Control Event Rate
The corresponding assumed and observed relative noninferiority margins were derived per the above formula using the absolute noninferiority margin and the assumed or observed event rate in the control arm. For each trial, the observed noninferiority relative margin vs the assumed noninferiority relative margin were plotted.
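The two quantities defined above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' SAS code; the function names and the trial values (a 3.6% absolute margin, an 8.0% assumed control event rate, and a 6.0% observed control event rate) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def assumed_to_observed_ratio(assumed_rate: float, observed_rate: float) -> float:
    """Ratio > 1 indicates overestimation of the control event rate; < 1, underestimation."""
    return assumed_rate / observed_rate


def corresponding_relative_margin(absolute_margin: float, control_rate: float) -> float:
    """(Absolute Margin + Control Event Rate) / Control Event Rate."""
    return (absolute_margin + control_rate) / control_rate


# Hypothetical trial (values are assumptions for illustration only).
abs_margin = 0.036      # absolute noninferiority margin
assumed_rate = 0.08     # control event rate assumed at the design stage
observed_rate = 0.06    # control event rate observed in the trial

ratio = assumed_to_observed_ratio(assumed_rate, observed_rate)
assumed_rr_margin = corresponding_relative_margin(abs_margin, assumed_rate)
observed_rr_margin = corresponding_relative_margin(abs_margin, observed_rate)

print(f"assumed-to-observed ratio: {ratio:.2f}")
print(f"assumed relative margin:   {assumed_rr_margin:.2f}")
print(f"observed relative margin:  {observed_rr_margin:.2f}")
```

In this hypothetical case the control event rate is overestimated by about one-third, and the same 3.6% absolute margin corresponds to a relative risk of 1.60 rather than the 1.45 implied at the design stage.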
To determine whether the choice of relative rather than absolute noninferiority margin would alter the conclusion of the trial (ie, whether the trial would have met noninferiority if a relative margin had been used), the relative noninferiority margin that corresponded to the absolute noninferiority margin used in the trial was calculated as stated above. The confidence interval of the observed relative risk in the treatment vs control arms was calculated with the Farrington-Manning score test for relative risk.12 The size of the confidence interval (z) was based on the 1-sided α level provided by the investigators. If the calculated P value for noninferiority was less than the selected 1-sided α level, the trial was considered to be noninferior.
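For readers who wish to reproduce this type of reanalysis, the following is a minimal Python sketch of a Farrington-Manning-type score test for a relative risk margin. It is an assumption-laden illustration rather than the SAS procedure used in the study: the function name and the example counts are hypothetical, lower event rates are taken to be better, and at least 1 event is assumed so that the restricted variance is positive.

```python
import math


def farrington_manning_rr(x_trt, n_trt, x_ctl, n_ctl, rr_margin):
    """Score test of H0: p_trt / p_ctl >= rr_margin (lower event rates are better).

    Returns (z, one_sided_p); a one-sided p below alpha supports noninferiority.
    """
    p_trt, p_ctl = x_trt / n_trt, x_ctl / n_ctl
    # Restricted maximum likelihood estimate of p_ctl under p_trt = rr_margin * p_ctl:
    # the smaller root of a*p^2 - b*p + c = 0.
    a = rr_margin * (n_trt + n_ctl)
    b = x_trt + rr_margin * x_ctl + rr_margin * n_trt + n_ctl
    c = x_trt + x_ctl
    p_ctl_r = (b - math.sqrt(b * b - 4 * a * c)) / (2 * a)
    p_trt_r = rr_margin * p_ctl_r
    # Standard error of (p_trt - rr_margin * p_ctl) evaluated at the restricted estimates.
    se = math.sqrt(p_trt_r * (1 - p_trt_r) / n_trt
                   + rr_margin ** 2 * p_ctl_r * (1 - p_ctl_r) / n_ctl)
    z = (p_trt - rr_margin * p_ctl) / se
    p_value = 0.5 * (1 + math.erf(z / math.sqrt(2)))  # standard normal CDF at z
    return z, p_value


# Hypothetical counts: 100/1000 events in each arm, relative margin of 1.5, 1-sided alpha of .025.
z, p = farrington_manning_rr(100, 1000, 100, 1000, rr_margin=1.5)
print(f"z = {z:.2f}, one-sided P = {p:.4f}, noninferior: {p < 0.025}")
```

The decision rule in the last line mirrors the one described above: the comparison is considered noninferior when the 1-sided P value falls below the α level selected by the investigators.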
Baseline results are presented as mean (SD) for normally distributed continuous variables, median (IQR; 25th to 75th percentiles) for non-normally distributed continuous variables, and number/total number (percentage) for categorical data. The Mann-Whitney U test was used for comparisons of non-normally distributed continuous variables between 2 groups. The related-samples Wilcoxon signed rank test was used for comparisons of non-normally distributed continuous variables within end points. A 2-sided P <.05 was considered statistically significant. Statistical analyses were performed with SAS, version 9.4 (SAS Institute). The study was not eligible for PROSPERO registration owing to its methodological nature.13 The study protocol is available in the eAppendix in the Supplement.
Of 3219 identified studies, 3098 (96.2%) were excluded after screening of the title or abstract alone, and 121 (3.8%) full-text articles were evaluated for eligibility. Among these 121 studies, 63 (2.0%) were excluded, leaving 58 trials (1.8%) with 106 989 patients for inclusion (eFigure 1 in the Supplement). A list of the included publications is available in eTable 2 in the Supplement.
Among the included 58 trials, 56 trials had both a single control and treatment arm. The BIO-RESORT trial14 had 2 active treatment arms and 1 control arm, and the CHOICE trial15 had 1 active treatment arm and 2 control arms, which resulted in 59 treatment arms, 59 control arms, and 60 powered primary noninferiority trial comparisons (57 trial comparisons with absolute margins and 3 trial comparisons with relative margins). The trials included were large (median [IQR], 1838 [1122-2575] patients), with median (IQR) projected power of 83% (80%-90%) (Table). Of the 58 trials, 20 (34.5%) were industry sponsored, and 24 (41.4%) were published in general medicine journals. Among the included trials, 43 (74.1%) used 12 months as the time for the primary end point comparison (range, 1-36 months). A 1-sided α of either .05 (37 of 58 [63.8%] studies) or .025 (21 of 58 [36.2%] studies) was used. The most common population reported in the primary end point analysis was the intention-to-treat cohort (55 of 58 studies [94.8%]).
When considering only control groups of noninferiority trials with absolute margins (n = 56), the median (IQR) absolute noninferiority margin was 3.6% (3.0%-4.4%), and the median (IQR) assumed-to-observed ratio of the control event rate was 1.28 (1.02-1.74). The control event rate was overestimated in 43 control arms and underestimated in 13 control arms. Owing to overestimation of the event rate, the corresponding observed noninferiority relative margin became more permissive (ie, easier to declare noninferiority) than the corresponding assumed noninferiority relative margin (median [IQR], 1.62 [1.50-1.80] vs 1.47 [1.39-1.55]; P < .001). Among 38 trial comparisons with a corresponding assumed noninferiority relative margin 1.5 or less, the corresponding observed noninferiority relative margin was greater than 1.5 in 24 trials (63.2%). Figure 1 plots the corresponding assumed noninferiority relative margin against the corresponding observed noninferiority relative margin.
When the results of the noninferiority trials were re-evaluated using the ratio between the absolute noninferiority margin and the assumed event rate as a relative noninferiority margin, 17 of 50 (34.0%) trial comparisons that met noninferiority using the absolute margin did not meet noninferiority using the relative margin (Figures 2 and 3). Trials that met noninferiority by the absolute but not the corresponding relative noninferiority margin (defined as discrepant trials) overestimated their event rates statistically significantly more than those that met noninferiority both with absolute and relative noninferiority margins (median [IQR] assumed-to-observed ratio, 1.90 [1.30-2.01] vs 1.14 [0.94-1.38]; P < .001). All discrepant trials overestimated their event rates, and the smallest overestimation was 11.7%. While there was no difference in the corresponding assumed noninferiority relative margin between discrepant and nondiscrepant trial comparisons (median [IQR], 1.53 [1.38-1.64] vs 1.45 [1.38-1.50]; P = .25), the corresponding observed noninferiority margin was statistically significantly more permissive for the discrepant trial comparisons than nondiscrepant trial comparisons (median [IQR], 1.78 [1.59-2.14] vs 1.57 [1.42-1.68]; P = .002). Conversely, all of the 7 trial comparisons that did not meet noninferiority using an absolute margin also did not meet noninferiority with a relative margin.
Put differently, the observed relative margin was larger than the corresponding assumed relative margin in 77% of trial comparisons, with a median (IQR) relative increase of 8% (1%-21%). In discrepant trials, the observed relative margin was larger than the corresponding assumed relative margin in all trial comparisons, with a median (IQR) relative increase of 26% (9%-36%).
Among 404 sets of simulated power calculations with increasing event rates using either an absolute noninferiority margin or the corresponding relative noninferiority margin, the required sample size to achieve 90% power was consistently higher for trials using a relative noninferiority margin (eFigure 2 in the Supplement).
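The direction of this finding can be illustrated with standard large-sample approximations, sketched below in Python. These closed-form formulas are an assumption made for illustration and are not the simulation method behind eFigure 2: both arms are assumed to share the same true event rate, the absolute margin is tested on the risk difference scale, the relative margin is tested on the log relative risk scale, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_arm_absolute(p, abs_margin, alpha=0.025, power=0.90):
    """Approximate patients per arm for a risk-difference noninferiority test."""
    z = NormalDist().inv_cdf(1 - alpha) + NormalDist().inv_cdf(power)
    return math.ceil(z ** 2 * 2 * p * (1 - p) / abs_margin ** 2)


def n_per_arm_relative(p, rr_margin, alpha=0.025, power=0.90):
    """Approximate patients per arm for a log relative risk noninferiority test."""
    z = NormalDist().inv_cdf(1 - alpha) + NormalDist().inv_cdf(power)
    return math.ceil(z ** 2 * 2 * (1 - p) / (p * math.log(rr_margin) ** 2))


RR_MARGIN = 1.5  # relative margin; the matching absolute margin is (RR_MARGIN - 1) * p
for p in (0.05, 0.10, 0.15, 0.20):
    n_abs = n_per_arm_absolute(p, (RR_MARGIN - 1) * p)
    n_rel = n_per_arm_relative(p, RR_MARGIN)
    print(f"event rate {p:.0%}: absolute margin n={n_abs}, relative margin n={n_rel}")
```

Under these assumptions the relative-margin design always requires more patients, because the logarithm of the relative margin is smaller than the relative margin minus 1 whenever the margin exceeds 1.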
This case study of coronary stent trials is, to our knowledge, the first major systematic review and meta-analysis evaluating the precision of end point estimation in contemporary noninferiority trials and the association of overestimation and underestimation of event rates with the trial conclusions. We found that (1) event rate assumptions were commonly overestimated; (2) noninferiority trials with absolute margins are the most common type of noninferiority design among coronary stent trials and often result in noninferiority margins that represent disproportionally larger relative risks owing to lower observed event rates than what was assumed at the trial design stage; and (3) more than one-third of powered primary trial comparisons in noninferiority coronary stent trials with absolute margins that met noninferiority would not have met noninferiority if a corresponding relative noninferiority margin had been used.
The results illustrate the need for a critical interpretation of the noninferiority margin, which is an inherently subjective element of noninferiority designs that is absent from traditional superiority studies. Although guidance exists as to what constitutes a reasonable noninferiority margin,5 there is considerable variation in the magnitude of noninferiority margins used in contemporary trials, and not all noninferiority trials provide appropriate justifications for their noninferiority margin.16 Furthermore, this study illustrates that the use of an absolute rather than relative noninferiority margin can render the trial results difficult to interpret when the observed event rates in the trial are different from the assumed event rate. We chose to focus this present study on stent trials given their expansive track history. The issues identified are not limited to coronary stent studies and may be present in other fields. Indeed, the current work represents the largest data set published on this topic to our knowledge, and the findings will inform the design of future trials in the expanding field of interventional cardiology, including comparative studies of different transcatheter and surgical treatments in structural heart disease.
Even though most noninferiority trials dedicate considerable effort to defining a reasonable noninferiority margin as it relates to the expected event rate in the trial, the relative risk that an absolute noninferiority margin ultimately corresponds to is dependent on the event rate that is observed in the trial (Figure 4). This study shows that among real-world noninferiority trials in interventional cardiology, many noninferiority margins ultimately corresponded to substantially higher observed relative risks than were anticipated at the trial design stage. If the observed relative risk corresponding to the absolute noninferiority margin is substantially larger than what was anticipated when the trial was designed, then the trial may meet noninferiority even if the 95% CI of the relative efficacy vs the established therapy (control arm) extends above a clinically unacceptable relative risk. Thus, trials with absolute noninferiority margins differ from trials with relative noninferiority margins and superiority trials in that statistical power is increased rather than reduced when event rates are lower than expected.
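The opposite behavior of power under the 2 margin types can be sketched with a normal approximation. The Python example below is illustrative only: it assumes equal true event rates in both arms, a hypothetical trial of 750 patients per arm designed around a 10% event rate, and an absolute margin of 5% or the corresponding relative margin of 1.5; the function names are not taken from any library.

```python
import math
from statistics import NormalDist

Z = NormalDist()


def power_absolute(p, n_per_arm, abs_margin, alpha=0.025):
    """Approximate power of a risk-difference noninferiority test when both arms have rate p."""
    se = math.sqrt(2 * p * (1 - p) / n_per_arm)
    return Z.cdf(abs_margin / se - Z.inv_cdf(1 - alpha))


def power_relative(p, n_per_arm, rr_margin, alpha=0.025):
    """Approximate power of a log relative risk noninferiority test when both arms have rate p."""
    se = math.sqrt(2 * (1 - p) / (p * n_per_arm))
    return Z.cdf(math.log(rr_margin) / se - Z.inv_cdf(1 - alpha))


N = 750  # hypothetical patients per arm
for p in (0.10, 0.0625):
    print(f"true event rate {p:.2%}: "
          f"power with 5% absolute margin = {power_absolute(p, N, 0.05):.2f}, "
          f"power with relative margin 1.5 = {power_relative(p, N, 1.5):.2f}")
```

With these inputs, a lower than expected event rate raises the approximate power of the absolute-margin test and lowers that of the relative-margin test, which is the asymmetry described above.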
As an illustrative example of this issue, consider a noninferiority trial in which the event rate is expected to be 10% for both the control and treatment arms. Based on the expected event rate of 10%, an absolute noninferiority margin of 5% (equivalent to a relative risk of 1.5, similar to the median relative risk of the present collected trials) may seem reasonable; however, if the actual event rate observed in the control arm is 6.25% instead of 10%, then this noninferiority margin instead corresponds to a relative risk of 1.8 (which corresponds to the median observed margin among the discrepant trials). The study team observed that in 35.3% of discrepant trials, the corresponding observed relative margin was greater than 2.0 (ie, permitting a possible doubling of the risk of the primary end point with the new stent compared with standard of care). In the present study, most noninferiority trials with absolute noninferiority margins overestimated the event rate in the control arm in their assumptions, which means that the noninferiority margin against which the treatment effect was tested represented a larger relative risk than what was suggested in the trial design. In more than half of the discrepant trials, the noninferiority margin against which the treatment effect was tested represented a relative risk that was more than 25% larger than what was suggested in the trial design.
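The arithmetic of this illustrative example can be checked with a short Python sketch. The 10% and 6.25% event rates and the 5% margin come from the example above; the function names and the additional event rates in the loop are hypothetical and included only to show how quickly the relative margin grows as the observed rate falls.

```python
def corresponding_relative_margin(absolute_margin, control_rate):
    """(Absolute Margin + Control Event Rate) / Control Event Rate."""
    return (absolute_margin + control_rate) / control_rate


def control_rate_for_relative_margin(absolute_margin, relative_margin):
    """Control event rate at which the absolute margin equals the given relative risk."""
    return absolute_margin / (relative_margin - 1)


ABS_MARGIN = 0.05  # absolute margin from the illustrative example

for rate in (0.10, 0.08, 0.0625, 0.05):
    rr = corresponding_relative_margin(ABS_MARGIN, rate)
    print(f"control event rate {rate:.2%}: corresponding relative margin {rr:.2f}")

# Observed control rate at or below which the 5% margin permits a doubling of risk.
doubling_rate = control_rate_for_relative_margin(ABS_MARGIN, 2.0)
print(f"relative margin reaches 2.0 at a control event rate of {doubling_rate:.2%}")
```

In this sketch, a 5% absolute margin corresponds to a relative risk of 2.0 once the observed control event rate falls to 5%, ie, half of the rate assumed at the design stage.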
Although noninferiority trials tended to overestimate event rates, underestimation of event rates in a noninferiority trial with an absolute margin can also be problematic. Underestimation of event rates can have the opposite association with trial conclusions (Figure 4). That is, in a situation of severe underestimation of the event rate on which the absolute margin is based, the corresponding observed noninferiority margin may become narrower as a relative margin than intended and make it too difficult to declare noninferiority. In this study, there were no instances of such trials using an absolute margin that failed to declare noninferiority but would have been shown to be noninferior if they had used a relative margin. However, this has recently happened in the OPTIMIZE IDE trial,17 which was unpublished when the present systematic review was conducted. OPTIMIZE IDE randomized 1639 patients to a novel stent with direct stenting capabilities vs more traditional drug-eluting stents and compared these devices in regard to target lesion failure rates at 12 months. The investigators assumed an event rate of 6.5% in the control arm and defined a noninferiority margin of 3.58% (corresponding to a relative margin of 1.55). When they observed substantially higher than expected event rates (10.3% in the treatment group and 9.5% in the active control group, driven by elevated rates of periprocedural myocardial infarction), they did not meet their noninferiority margin, despite a relative risk of 1.09 (95% CI, 0.81-1.46).
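The OPTIMIZE IDE figures quoted above can be placed on a common relative scale with the same margin formula. The Python sketch below uses only the numbers reported in this paragraph; the variable names are hypothetical, and the comparison against the reported 95% CI is a simplification, because that interval was not necessarily derived with the score method used in the present reanalysis.

```python
def corresponding_relative_margin(absolute_margin, control_rate):
    """(Absolute Margin + Control Event Rate) / Control Event Rate."""
    return (absolute_margin + control_rate) / control_rate


abs_margin = 0.0358           # absolute noninferiority margin
assumed_control_rate = 0.065  # control event rate assumed at the design stage
observed_control_rate = 0.095 # control event rate observed in the trial
reported_rr_upper = 1.46      # upper bound of the reported 95% CI for the relative risk

assumed_rr_margin = corresponding_relative_margin(abs_margin, assumed_control_rate)
observed_rr_margin = corresponding_relative_margin(abs_margin, observed_control_rate)

print(f"assumed relative margin:  {assumed_rr_margin:.2f}")
print(f"observed relative margin: {observed_rr_margin:.2f}")
print(f"upper CI bound below assumed relative margin:  {reported_rr_upper < assumed_rr_margin}")
print(f"upper CI bound below observed relative margin: {reported_rr_upper < observed_rr_margin}")
```

Because the control event rate was underestimated, the absolute margin shrank to a relative risk of roughly 1.38 at the observed rate, which the reported upper confidence bound of 1.46 exceeds even though it lies below the 1.55 implied at the design stage.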
Although the use of an absolute rather than relative noninferiority margin introduces some additional challenges when it comes to the interpretation of trial results, an absolute margin affords greater power than a corresponding relative margin when all other assumptions are held constant (eFigure 2 in the Supplement). Absolute noninferiority margins are therefore attractive for cost and other practical reasons, particularly when event rates can be reliably estimated. Absolute risk differences also ultimately matter to patients. If a relative rather than absolute noninferiority margin is used and the observed event rate is very low, the relative noninferiority margin may correspond to a clinically negligible absolute risk difference. Still, relative risk more accurately captures comparisons between treatments, implicit in the term relative itself. Moreover, the actual absolute margins used in the stent trials in this meta-analysis were not clinically insignificant.
It is possible to mitigate the risks of drawing inaccurate conclusions based on absolute noninferiority margins in trials with imprecise end point assumptions. To improve the quality of a noninferiority trial with an absolute margin, the investigators should reduce the risk of overestimating (or underestimating) the event rate in the control arm when deriving the noninferiority margin. The risk of overestimating the event rate in the control arm can be reduced by basing the expected event rate in the control arm on contemporary data derived from a sufficient number of high-quality studies. An example of a study that succeeded in doing this was the BIONICS study,18 which compared a ridaforolimus-eluting stent with the slow-release zotarolimus-eluting stent. The BIONICS investigators considered the event rates, end point definitions, and patient characteristics of 8 prior randomized clinical trials to inform their assumptions. As a consequence, the authors overestimated the event rate by only 0.4% in absolute terms (7% in relative terms), and the corresponding observed relative margin was only 3% larger in relative terms than the corresponding assumed relative margin. In addition to basing event rate assumptions on relevant and reliable data, interim analysis with adaptive design elements can be implemented to either extend follow-up duration to ensure an observed event rate closer to the assumed event rate or to revise the noninferiority margin to reflect a more reasonable relative risk increase. In practice, many noninferiority trials base their absolute noninferiority margin on an assumed event rate and an acceptable maximum relative risk.9,19-21 We propose that the same principle may be used to adapt the absolute margin if the observed event rate in the trial is considerably different from the assumed event rate chosen at the design stage.
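One way to read this proposal is that the relative risk implied at the design stage is held fixed and the absolute margin is rescaled to the observed control event rate. The Python sketch below shows that reading only; it is an assumption rather than a procedure specified in this article, the function name is hypothetical, and it does not address how such an adaptation should be prespecified or how type I error would be controlled.

```python
def adapted_absolute_margin(design_margin, assumed_rate, observed_rate):
    """Rescale an absolute margin so that it preserves the design-stage relative margin."""
    # Relative margin implied at design: (margin + assumed rate) / assumed rate.
    design_rr = (design_margin + assumed_rate) / assumed_rate
    # Absolute margin that yields the same relative margin at the observed rate.
    return (design_rr - 1) * observed_rate


# Values from the illustrative example: 5% margin, 10% assumed and 6.25% observed event rate.
new_margin = adapted_absolute_margin(0.05, 0.10, 0.0625)
print(f"adapted absolute margin: {new_margin:.3%}")
```

For the illustrative trial with a 10% assumed and 6.25% observed event rate, holding the relative margin at 1.5 would reduce the absolute margin from 5% to approximately 3.1%.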
A broader understanding of the noninferiority margin and the other unique elements of noninferiority trials is important to allow for critical interpretation of these trials and the merits of the therapies studied. When a trial nominally meets an absolute noninferiority margin, the decision to accept a new therapy as noninferior to its comparator should be based on an in-depth review of the trial. This review should include evaluation of the confidence interval for the effect size on a relative scale, comparison of the observed relative risk with the relative noninferiority margin assumed at trial planning, and assessment of the quality of study procedures (eg, randomization, crossover, end point ascertainment). The use of a relative noninferiority margin should be considered at the design stage, particularly if the noninferiority margin is intended to ensure that a previously observed benefit of the control treatment is preserved.3
This analysis has limitations. This systematic review was limited to coronary stent trials in interventional cardiology. However, the underlying principle of overestimation and its relationship to absolute margins is applicable in other settings. Despite every attempt to create a broad search strategy, the investigators were able to identify only a small number of noninferiority trials with relative margins. Therefore, the study has a more limited ability to identify issues with this method. Additionally, the investigators did not have access to the original data sets of all the included trials. Therefore, all of the analyses were based on the published event rates, thus relying on the precision reported in the final article of each trial.
Results of this systematic review and meta-analysis demonstrated that more than one-third of noninferiority coronary stent trials using absolute margins that met noninferiority would not have met noninferiority if a corresponding relative noninferiority margin had been used. Noninferiority trials with absolute margins often result in noninferiority margins that represent disproportionally large relative risks due to overestimation of the control group event rates that inform the noninferiority margin at the trial design stage. These findings should be taken into consideration in the design and interpretation of future noninferiority trials.
Accepted for Publication: October 19, 2021.
Published Online: February 2, 2022. doi:10.1001/jamacardio.2021.5724
Corresponding Author: Björn Redfors, MD, PhD, Clinical Trials Center, Cardiovascular Research Foundation, 1700 Broadway, 9th Floor, New York, NY 10019 (email@example.com).
Author Contributions: Drs Simonato and Redfors had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: Simonato, Redfors.
Acquisition, analysis, or interpretation of data: All authors.
Drafting of the manuscript: Simonato, Ben-Yehuda, Redfors.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Simonato, Ben-Yehuda, Zhang, Redfors.
Administrative, technical, or material support: Ben-Yehuda, Vincent, Redfors.
Supervision: Ben-Yehuda, Redfors.
Conflict of Interest Disclosures: None reported.
Additional Information: Collected data may be provided following submission of study proposal to the corresponding author.