Assessing the Justification, Funding, Success, and Survival Outcomes of Randomized Noninferiority Trials of Cancer Drugs

Key Points Question Is the source of funding associated with the justification for using noninferiority design and success in claiming noninferiority in cancer drug trials, and are these trials associated with changes in patient survival? Findings In this systematic review and pooled analysis of 23 randomized noninferiority trials of cancer drugs, which used overall survival as the end point and enrolled 21 437 patients, industry funding was associated with lack of justification but not with success in achieving noninferiority. No association of noninferiority trials with patient survival was found. Meaning Greater regulatory attention should be paid to randomized noninferiority trials of cancer drugs, especially regarding the justification for using such a design.


Introduction
Randomized clinical trials designed with a noninferiority hypothesis test whether the experimental treatment is no worse than the standard of care by more than a prespecified amount, called the noninferiority margin. 1 Noninferiority trials are useful for testing interventions that aim to provide compensatory benefits without substantially compromising efficacy. If the new drug or strategy reduces cost or adverse effects or improves the ease of administration, patients may reasonably accept the possibility of slight reductions in efficacy.
However, there are numerous concerns associated with defining, designing, conducting, reporting, and interpreting noninferiority trials, 1,2 including in oncology. 3 For example, how much of a possible reduction in efficacy is reasonable? How noninferiority limits are defined may raise ethical issues related to the sufficiency of patient informed consent. 4 In some cases, noninferiority designs have been useful. For example, the standard of care for the use of zoledronic acid for preventing skeletal-related events in patients with cancer and bone metastases used to be an injection every month. Noninferiority trials have now established that the frequency could safely be reduced to once every 3 months without compromising efficacy outcomes. 5,6 Noninferiority trials have been applied sparingly to oncology drugs because patients with cancer are unlikely to accept even the possibility of reduced efficacy of their oncology drugs.
However, in a 2018 case, a noninferiority trial served as the pivotal trial for US Food and Drug Administration approval of a new cancer drug, even though the new drug did not necessarily provide any benefits in terms of cost, ease of administration, or toxic effects. 7 Previous studies have shown that industry-funded trials were more likely to use a noninferiority design than nonindustry-sponsored trials. 8 In cancer treatment, noninferiority trials with overall survival (OS) outcomes are most critical from a clinical and regulatory point of view because it would be unwise to use a new therapy based on its noninferiority regarding response rates or progression-free survival without knowing the effects on OS, as the drug might very well have more substantially inferior outcomes on survival.
Similarly, patients are most concerned with compromise in survival when the results from noninferiority trials are translated into clinical practice.
To understand the characteristics of noninferiority trials used in evaluating cancer drugs, we sought to systematically review the use of the noninferiority hypothesis in randomized clinical trials in the field of oncology. In this systematic review and pooled analysis, we investigated the basis or reasoning for using noninferiority designs, the funding of these trials, and efficacy outcomes.

Methods
The objectives of this study were as follows: (1) to study the characteristics of randomized cancer drug trials that use noninferiority design, including the success rate and justification for using noninferiority designs, (2) to study whether the success in claiming noninferiority or the lack of justification for using noninferiority design was associated with the source of funding, and (3) to study the overall association of the drug being tested with patients' survival. This study was conducted in accordance with a modification of the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) reporting guideline for meta-epidemiological studies. 9

Study Identification
We conducted a systematic search of the PubMed database in March 2018 without date restrictions, supplemented with a search of the Google Scholar database. To identify published trials in cancer that used a noninferiority design, we used the search terms neoplasms or cancer or tumo* or malignan* or oncology AND non inferior* or non-inferior* or noninferior* and limited our search to findings published in English. After title and abstract screening by 2 of us (B.G. and F.A.T.) acting independently, the full texts of potentially relevant studies were downloaded and reviewed for the following exclusion criteria: (1) not a randomized design; (2) trials in pediatric populations; (3) trials of surgery or radiotherapy only in all treatment arms (trials comparing a drug in 1 arm and surgery or radiotherapy in other arms were eligible); (4) trials of decision support, diagnostic modalities, or supportive care; (5) trials assessing behavioral interventions, genetic counseling, screening, or diagnostic modalities; (6) trials assessing only pharmacokinetics; and (7) post hoc analyses. For this study, we limited our analysis to trials with a primary or a coprimary end point of OS for 2 reasons.
First, the most important concern for patients making treatment decisions based on noninferiority trials is the potential for compromise in survival, rather than compromise in surrogate measures.
Second, we planned to conduct a pooled analysis, and it would not be possible to pool survival with other surrogate measures across the trials. We also extracted information on the outcome of the trial in terms of whether noninferiority was achieved. A trial was considered successful if it achieved noninferiority based on its own criteria.

Data Extraction
For trials that achieved noninferiority, we checked whether the intervention also proved superiority based on 95% CIs and whether the publication concluded superiority. For quality-of-life outcomes, we considered quality of life to be improved if the summary measure was statistically better; we did not examine each domain of the assessment tool separately. Finally, information on the hazard ratio (HR) and 95% CI for OS was extracted from the published reports for pooled analysis. If a CI other than a 95% CI was reported, we recalculated it to a 95% CI.
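The recalculation step can be illustrated with a short sketch (ours, not the authors' code), assuming the log HR is approximately normally distributed so that a reported 90% (or 80%) CI can be rescaled to a 95% CI via the ratio of normal critical values. The function name and signature are hypothetical.

```python
import math
from statistics import NormalDist

def rescale_ci(hr, lo, hi, reported_level=0.90, target_level=0.95):
    """Rescale a reported CI for a hazard ratio to another confidence level,
    assuming normality of log(HR)."""
    # Critical z values for the reported and target confidence levels
    z_rep = NormalDist().inv_cdf(1 - (1 - reported_level) / 2)
    z_tgt = NormalDist().inv_cdf(1 - (1 - target_level) / 2)
    # Recover the standard error of log(HR) from the reported interval width
    se = (math.log(hi) - math.log(lo)) / (2 * z_rep)
    # Rebuild the interval at the target level around the point estimate
    return (math.exp(math.log(hr) - z_tgt * se),
            math.exp(math.log(hr) + z_tgt * se))
```

As expected, the 95% CI produced this way is slightly wider than the reported 90% CI around the same point estimate.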

Statistical Analysis
The associations of the justification for using the noninferiority design and success in achieving noninferiority with the funding source were assessed using Fisher exact tests. The overall association of the trial drugs with OS was assessed by pooling the HRs across the trials using a random-effects meta-analysis to account for heterogeneity. Heterogeneity among studies was assessed using the Cochran Q statistic (the assumption of homogeneity was considered invalid for values of P < .10) and quantified using the I² statistic. Subgroup analyses were prespecified and included funding, blinding, and success. All statistical analyses were conducted using Stata version 15 (StataCorp), and a 2-sided P < .05 was considered statistically significant.
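For readers unfamiliar with the pooling step, a minimal sketch of DerSimonian-Laird random-effects meta-analysis on log HRs follows. This is an illustrative reimplementation under standard assumptions, not the Stata code used in the study; the function name is hypothetical, and inputs are HRs with their 95% CIs.

```python
import math
from statistics import NormalDist

def pool_hazard_ratios(hrs, ci_los, ci_his):
    """DerSimonian-Laird random-effects pooling of hazard ratios reported
    with 95% CIs; returns (pooled HR, 95% CI low, 95% CI high, I^2)."""
    z95 = NormalDist().inv_cdf(0.975)
    y = [math.log(h) for h in hrs]                        # log-HRs
    se = [(math.log(hi) - math.log(lo)) / (2 * z95)       # SE of each log-HR
          for lo, hi in zip(ci_los, ci_his)]
    w = [1 / s**2 for s in se]                            # fixed-effect weights
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - ybar)**2 for wi, yi in zip(w, y))  # Cochran Q
    k = len(y)
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)                    # between-trial variance
    w_re = [1 / (s**2 + tau2) for s in se]                # random-effects weights
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_mu = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0    # I^2 heterogeneity
    return (math.exp(mu),
            math.exp(mu - z95 * se_mu),
            math.exp(mu + z95 * se_mu),
            i2)
```

A pooled HR whose 95% CI includes 1, as reported in this study, indicates no overall beneficial or detrimental association with survival, while I² summarizes how much of the variability across trials reflects heterogeneity rather than chance.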

Results
Among 128 randomized noninferiority trials in oncology in adults identified through our search, 74 (58%) were drug trials. Of those, 23 (31%) enrolled 21 437 patients, used OS as the primary or coprimary end point, and were therefore included in our analysis (Figure 1 and Table 1).  None of these trials were blinded. Approximately half (12 [52%]) had industry funding, 5 (22%) were publicly funded, and 6 (26%) had mixed funding.

Association With OS
The HR was more than 1 in 10 trials (43%), 10,12,17,19,22,24,26,28,30,31 but was statistically significant in only 1 trial 28 (Table 2). When the HRs across trials were pooled using random-effects meta-analysis, there was no beneficial or detrimental association with patient survival (pooled HR, 0.97; 95% CI, 0.92-1.02); the heterogeneity among the trials was substantial because trials across different tumor types were pooled (I² = 53.8%, P = .001) (Figure 2). This analysis was repeated using comparisons for each cohort of the trial by Shitara et al 30 (the trial with 3 arms) independently, and the results did not change. On subgroup analysis, there was no difference between industry-funded trials and mixed or publicly funded trials, and no difference between trials in which the noninferiority design was justified and those in which it was not.

Discussion
In systematically searching for noninferiority trials of cancer drugs, we found that only 31% used OS as the primary end point. Among these trials, the criteria to define noninferiority varied from 1.08 to 1.33 for the upper limit of the CI of the HR for death. Ease of administration (oral formulation vs injectable control drugs) was the most common justification for the noninferiority design, while we found insufficient justification for 40% of such trials. Most trials were successful in proving noninferiority, and when the HRs were pooled across these trials, there was no detrimental association with OS.
Noninferiority trials in oncology have previously been criticized for "critical deficiencies in design and reporting." 33 In this cohort, OS was the primary end point in fewer than one-third of noninferiority trials. However, using a noninferiority hypothesis implies a willingness to compromise efficacy outcomes to achieve benefits in other areas. Surrogate measures, such as progression-free survival, have been shown to overestimate (or, in the case of immunotherapy drugs, underestimate 34 ) benefit and translate to smaller than expected gains in OS. In most cases these surrogate measures have been shown to lack correlation with OS. 35 Thus, a trial testing noninferiority in a surrogate measure is ethically challenging if it is impossible to estimate the magnitude of potential compromise in efficacy and communicate that clearly to patients participating in such trials as part of the informed consent process. Investigators should therefore be wary of conducting trials assessing noninferiority on surrogate measures.
The criteria to define noninferiority among the trials in the cohort varied from 1.08 to 1.33 for the upper bound of the CI of the HR for OS, which means that an 8% to 33% increase in the hazard of death was considered acceptable (noninferior) in these trials. Furthermore, in multiple cases, this upper limit was defined not for a 95% CI but for a 90% or even an 80% CI. As previous studies have shown, there are no set methods for determining the limits defining noninferiority. 3,33 We also found that approximately 40% of the cohort reported none of the 4 key justifications (lower toxic effects, lower cost, ease of administration, or better quality of life) for using noninferiority designs. Such benefits should be the primary rationale for a patient consenting to participate in a trial testing whether a new treatment is not worse than the standard treatment by a prespecified margin.
Our final objective was to assess if patients enrolled in the experimental arm of noninferiority trials experienced reduced OS compared with those in the control arms. Reassuringly, we found no such association. However, this pooled result should be interpreted with caution because, when examining individual trials, we found 10 trials (43%) in which the HR was more than 1, including 1 in which the HR was significantly more than 1. Similarly, 1 noninferiority trial actually proved superiority for the experimental arm. Subgroup analyses revealed no differences in OS based on funding or reporting 1 of the 4 key justifications for noninferiority.
Half of the trials in our study had industry funding, and having industry funding was significantly associated with missing the justifications we identified for the noninferiority design. Industry-funded noninferiority trials also were successful in proving noninferiority in 83% of cases, although the association of success with funding was not significant. Our results are similar to those of a previous analysis 4 showing that 43% of noninferiority trials in oncology were industry funded, 73% reported positive results, and OS was the primary end point in 25%. Another study 36 that included all noninferiority trials across disciplines showed that 83% produced favorable results irrespective of funding source, similar to our findings. These results should be interpreted in light of the fact that success in achieving noninferiority depends not only on the intervention but also on the criteria used to define noninferiority, which are often arbitrary. Furthermore, we cannot rule out publication bias against noninferiority trials that failed to achieve noninferiority.
These findings are important for those helping design and oversee the conduct of noninferiority trials in oncology. Noninferiority trials may be attractive because of the high probability of success. 8 Indeed, the noninferiority design has been described as having a low risk of failure. 36 However, our data show that institutional review boards and drug regulators should take an active role in adjudicating whether the noninferiority design is acceptable for the given question. When noninferiority design trials are considered important, the criteria to define noninferiority should be clearly defined based on a widely accepted rationale and should incorporate patient input.
The Consolidated Standards of Reporting Trials (CONSORT) statement on reporting noninferiority trials recommends providing a rationale for the noninferiority design and the criteria for defining noninferiority 37 ; however, further research is needed to assess if this recommendation has improved reporting practices in oncology. Similarly, the US Food and Drug Administration has issued a guidance for industry on noninferiority trials. However, the US Food and Drug Administration recommends a noninferiority design be chosen "when it would not be ethical to use a placebo." 38 While that is necessary, it is not sufficient, and superiority design trials against an active comparator should be encouraged unless the noninferiority design is justified for other compelling reasons, such as the ones we have mentioned. The guidance could also be improved by highlighting the need to incorporate patient input on the acceptable margin for defining noninferiority among various tumor types.

Limitations
This study has limitations. Although our analysis included trials involving more than 21 000 participants, for the analysis of the association of different parameters with funding, this study was limited by the relatively small total number of trials (23). Another important limitation is our consensus-based definition for adjudicating whether the noninferiority design was justified.
However, other authors have also concluded that "any new treatment, even if noninferior to standard treatment, should have some benefits, such as for quality of life, cost, or safety." 11 We focused on noninferiority trials testing OS end points; however, many trials also test noninferiority in surrogate measures, such as response rates. The association of such trials with patient treatment outcomes is a topic for future research.

Conclusions
Noninferiority randomized trials in oncology should be used only when there are important potential benefits that the experimental drug can offer patients. However, among 23 such trials testing OS, we found that a substantial fraction did not offer any of the 4 key criteria for justification. Greater regulatory attention should be paid to randomized noninferiority trials of cancer drugs, especially regarding the justification for using such a design.