aIncludes 12 trials without design limitations but with immature outcome data.
eTable 1. Clinical trials with suboptimal control arms
eTable 2. Clinical trials with crossover errors
Hilal T, Gonzalez-Velez M, Prasad V. Limitations in Clinical Trials Leading to Anticancer Drug Approvals by the US Food and Drug Administration. JAMA Intern Med. 2020;180(8):1108–1115. doi:10.1001/jamainternmed.2020.2250
How often are anticancer drugs approved by the US Food and Drug Administration (FDA) based on clinical trials with the following limitations: nonrandomized design, lack of demonstrated survival advantage, inappropriate use of crossover, or the use of suboptimal control arms?
In this observational study, 187 trials leading to anticancer drug approvals between June 30, 2014, and July 31, 2019, were reviewed. A total of 125 (67%) trials leading to anticancer drug indications had limitations in at least 1 of the 4 domains of interest.
Despite the increase in the number of drug approvals by the FDA, a substantial number of drugs are authorized based on data that do not demonstrate efficacy over established standards of care.
While there have been multiple assessments of clinical trials leading to anticancer drug approvals by the US Food and Drug Administration (FDA), the cumulative percentage of approvals based on trials with a limitation remains uncertain.
To assess the percentage of clinical trials with limitations in 4 domains—lack of randomization, lack of significant overall survival advantage, inappropriate use of crossover, and use of suboptimal control arms—that led to FDA approvals from June 30, 2014, to July 31, 2019.
Design, Setting, and Participants
This observational analysis included all anticancer drug indications approved by the FDA from June 30, 2014, through July 31, 2019. All indications were investigated, and each clinical trial was evaluated for design, enrollment period, primary end points, and presence of a limitation in the domains of interest. The standard-of-care therapy was determined by evaluating the literature and published guidelines 1 year prior to the start of clinical trial enrollment. Crossover was examined and evaluated for optimal use. The percentage of approvals based on clinical trials with any or all limitations of interest was then calculated.
Main Outcomes and Measures
Estimated percentage of clinical trials with limitations of interest that led to an anticancer drug marketing authorization by the FDA.
A total of 187 trials leading to 176 approvals for 75 distinct novel anticancer drugs by the FDA were evaluated. Sixty-four (34%) were single-arm clinical trials, and 123 (66%) were randomized clinical trials. A total of 125 (67%) had at least 1 limitation in the domains of interest; 60 of the 125 trials (48%) were randomized clinical trials. Of all 123 randomized clinical trials, 37 (30%) lacked overall survival benefit, 31 (25%) had a suboptimal control, and 17 (14%) used crossover inappropriately.
Conclusions and Relevance
Two-thirds of cancer drugs are approved based on clinical trials with limitations in at least 1 of 4 essential domains. Efforts to minimize these limitations at the time of clinical trial design are essential to ensure that new anticancer drugs truly improve patient outcomes over current standards.
Clinical trials leading to marketing authorization of anticancer drugs by the US Food and Drug Administration (FDA) are heterogeneous, with varying strengths and weaknesses. Nonrandomized clinical trials that show tumor shrinkage in response to a novel therapy are limited by uncertainty as to whether these agents are superior to the prevailing standard of care or if they improve survival or quality of life. When randomized clinical trials (RCTs) are conducted, limitations of interest may be related to trial design, for example, using a control arm that is considered suboptimal or inappropriate use of crossover, or in outcome, such as failing to demonstrate an overall survival (OS) benefit when an improvement in a surrogate end point is met.
Prior studies have characterized the frequency of single-arm studies leading to drug approval1 and the use of surrogate end points.2 We previously assessed the frequency of substandard control arms.3 However, these studies did not assess errors of crossover and the cumulative percentage of these limitations coexisting in the same trial. For example, what percentage of FDA approvals are made on the basis of improved survival in a trial without limitations?
Crossover in cancer RCTs occurs when a patient randomized to the control arm is given the investigational therapy after disease progression or toxic effects (unidirectional crossover). There are 2 errors of crossover in trials. The first occurs when crossover from the control arm to the investigational agent is allowed without established efficacy of the investigational agent. In these cases, crossover from control arm to the investigational agent can confound interpretations of end points, such as OS, and may even lead to spurious survival benefits.4,5 For instance, in a study of a novel, unproven cancer therapeutic vaccine in prostate cancer, crossover resulted in fewer patients receiving docetaxel and only at a delayed time point, which may have harmed the control arm.6 Survival differences in this setting could be due to either a successful therapy or harm to the control arm.
The second error is to omit crossover when a drug has proven benefit in a subsequent line of therapy, when a trial seeks to advance the agent to the frontline setting. For instance, in a study evaluating pembrolizumab, an immune checkpoint inhibitor, for previously untreated programmed cell death ligand 1 (PD-L1)–expressing metastatic non–small cell lung cancer, omission of crossover resulted in fewer patients being offered pembrolizumab—an agent that had been approved in the second-line setting—after progression on the control arm.7 Here, crossover is mandatory, and its absence may lead to the false inference that early administration is superior to the current standard of care (ie, sequential treatment). We sought to perform a single analysis that examined all 4 of the aforementioned limitations in a modern cohort of cancer drug approvals using a comprehensive resource and estimate the frequency of these limitations coexisting in the same trials.
We sought to assess what percentage of clinical trials leading to new or supplemental marketing approvals of anticancer drugs by the FDA had any of the following limitations: (1) nonrandomized clinical trial design, (2) RCTs that failed to show an OS advantage, (3) RCTs that used a suboptimal control, and (4) RCTs that inappropriately used crossover. This study of published reports did not involve patient-identifying data and was not submitted for institutional review board approval.
We examined all approvals by the FDA from June 30, 2014, through July 31, 2019. Inclusion criteria were all indications for every single novel anticancer drug approval in adults (≥18 years). Novel anticancer drugs were identified using the FDA hematology/oncology approvals and safety notifications web page8 and tabulated. Then, the name of each novel anticancer drug was entered into the FDA approved drug products search engine.9 Approval date(s), history, and labels (including new indications) were extracted. Notably, our prior study of control arms relied exclusively on the FDA hematology/oncology approvals and safety notifications web page,8 which does not report on new or expanded marketing approvals for an already approved investigational agent (eg, ibrutinib vs ofatumomab in previously treated chronic lymphocytic leukemia).
Every clinical trial cited on the drug label at the time of marketing authorization as the basis for an FDA approval was identified using the National Clinical Trial identifier on the label and confirmed by reviewing the FDA approvals and safety notifications web page when listed. The trial article was identified using PubMed, and the protocols were reviewed if available with the published article in the supplement. The FDA statistical review reports were not used because many were not accessible and/or not available for supplementary indication approvals. From the article of each trial, we identified the accrual period, setting of the clinical trial (national vs international), indication, control arm, primary end point, OS end point, and the presence or absence of crossover in RCTs.
For each RCT, we assessed the quality of the control arm as suboptimal if (a) restrictions were placed on the choice of control that excluded another potentially equivalent agent, or (b) the control arm was specified but not the recommended agent and potentially inferior (eg, the control arm was a single agent when doublet therapy is recommended). We then evaluated whether a suboptimal control arm was chosen because of the international scope of the trial and would not have been considered a US standard-of-care option.
We assessed control arms using 2 methods independently. First, the first and second authors (T.H. and M.G.-V.) performed a search of the National Comprehensive Cancer Network (NCCN) guidelines through the Journal of the National Comprehensive Cancer Network (JNCCN) dated at least 1 year prior to the start of accrual of an RCT of interest that led to an FDA marketing authorization to determine the standard-of-care therapy for each specific cancer. When guidelines were not available for the year of interest in JNCCN, we used the Wayback Machine, a digital archive that stored previous versions of NCCN guidelines. Second, the first and second authors separately and independently read the published clinical trial data presented in articles as well as the appendices and supplements, when relevant, and determined the adequacy of the control arm compared with what would be considered the standard of care 1 year prior to the start of trial accrual. Conflicts were resolved by the third author (V.P.).
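The dual-review procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `adjudicate` and the boolean encoding of each reviewer's judgment (True meaning "control arm suboptimal") are hypothetical and are not taken from the study.

```python
from typing import Callable


def adjudicate(reviewer_1: bool, reviewer_2: bool,
               third_reviewer: Callable[[], bool]) -> bool:
    """Combine two independent assessments of a control arm.

    Hypothetical encoding: True means the reviewer judged the control arm
    suboptimal. The third reviewer is consulted only when the first two
    assessments conflict.
    """
    if reviewer_1 == reviewer_2:
        return reviewer_1
    return third_reviewer()


# Agreement: the third reviewer is never consulted.
print(adjudicate(True, True, lambda: False))
# Conflict: the third reviewer's judgment decides.
print(adjudicate(True, False, lambda: False))
```

Passing the third reviewer as a callable mirrors the described workflow, in which the third author is involved only for conflicts.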
We assessed for the presence or absence of protocol-specified unidirectional crossover from published articles and by searching protocols available with the articles. Two authors (T.H. and M.G.-V.) determined separately and independently whether the presence or absence of crossover was desirable based on established definitions.5
Appropriate crossover was defined as allowing crossover in situations where the efficacy of the investigational agent had already been established from a previous RCT in a latter line of therapy (eg, second line or beyond), had an FDA approval in a latter line of therapy, or was considered the standard of care in a subsequent line at the time of or within 1 year of enrollment of participants. In these situations, the absence of crossover in the protocol or the absence of a protocol amendment was deemed inappropriate.
Inappropriate crossover was defined as use of crossover in situations where the fundamental efficacy of an experimental agent had not been established in a prior RCT, and/or an FDA approval was not available at the time of or within 1 year of enrollment of participants. In these situations, the presence of crossover was considered inappropriate, as it has potential to obscure signals of true benefit (eg, OS advantage) or harm from the investigational agent (both arms of the trial will receive it). A protocol amendment made during study periods to allow crossover when a drug was approved by the FDA or an RCT confirmed its efficacy in a latter line setting was considered appropriate.
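The two crossover definitions above reduce to a small decision rule, sketched here for illustration. The class name `TrialCrossover`, its fields, and the returned labels are hypothetical; in particular, `efficacy_established` is assumed to summarize the criteria stated above (a prior RCT in a later line, an FDA approval in a later line, or standard-of-care status at the time of or within 1 year of enrollment), and `crossover_allowed` is assumed to include crossover added by protocol amendment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrialCrossover:
    # Protocol-specified (or amended) unidirectional crossover to the
    # investigational agent. Hypothetical field name.
    crossover_allowed: bool
    # Efficacy of the investigational agent already established in a
    # later line of therapy. Hypothetical field name.
    efficacy_established: bool


def crossover_error(trial: TrialCrossover) -> Optional[str]:
    """Return a label for the crossover error, or None if there is none."""
    if trial.crossover_allowed and not trial.efficacy_established:
        return "crossover allowed without established efficacy"
    if not trial.crossover_allowed and trial.efficacy_established:
        return "crossover omitted despite established efficacy"
    return None


print(crossover_error(TrialCrossover(True, False)))
print(crossover_error(TrialCrossover(False, True)))
print(crossover_error(TrialCrossover(True, True)))
```

Only the two mismatched combinations count as errors; a trial with no crossover and no established efficacy, or with crossover to an agent of established efficacy, is not flagged.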
Descriptive statistics are reported throughout. We analyzed the study data from November 1 to November 20, 2019.
Between June 30, 2014, and July 31, 2019, the FDA granted 176 approvals for 75 distinct novel anticancer drugs based on 187 trials. The number of anticancer trials leading to FDA approval nearly doubled over time, from 68 in the first half of the study period (June 2014 to December 2016) to 119 in the second half (January 2017 to July 2019). Of the 187 trials, 123 (66%) were RCTs, and 64 (34%) were nonrandomized clinical trials. Of the 187 trials, 38 (20%) were for lymphoid malignant neoplasms, 37 (20%) were for lung and head and neck malignant neoplasms, and 19 (10%) were for genitourinary malignant neoplasms. To better characterize the limitations of interest, we separated them into limitations in design (uncontrolled study, suboptimal control, inappropriate use of crossover) and limitations in outcome (lack of OS benefit).
We found that 64 of 187 (34%) pivotal trials lacked a control arm. Two drug indications were based on data from a subset of patients in an open-label phase 1b trial (eg, KEYNOTE-013, pembrolizumab in refractory primary mediastinal B-cell lymphoma, after 2 or more lines of therapy10) and a post hoc analysis of a subset of patients from multiple trials (eg, LUX-Lung, afatinib in first-line metastatic non–small cell lung cancer without resistant EGFR mutations11). The remainder were largely single trials or pooled analyses of 2 single-arm trials. The primary end point was overall response rate in 53 trials (83%) and complete remission in 5 trials (8%). The majority of marketing approvals based on nonrandomized clinical trials (43 trials [67%]) were granted under the accelerated approval program.
There were 123 RCTs leading to 120 approvals. The majority of drug indication approvals (117 [97%]) were based on data from a single RCT, while the remainder (3 [3%]) were based on data from 2 RCTs. The majority of approvals based on RCTs (110 [92%]) were regular approvals. Of the 123 RCTs, 1 was a noninferiority trial and 122 were superiority trials.
Of the 122 RCTs powered for superiority, 31 (25%) had a suboptimal control arm. A list of all RCTs with a suboptimal control arm and the reasons they were deemed suboptimal is provided in eTable 1 in the Supplement. When categorized by type of suboptimal control, 22 (71%) clinical trials omitted active treatment in the control arm by using a control known or likely to be inferior to other available agents or not allowing combinations, and 9 (29%) limited the investigator’s choice in selecting an active treatment. When assessed by whether the international scope of the trial led to a suboptimal choice of the control arm in the US, 3 (10%) trials chose a control arm that was deemed more accessible outside the US but that may not have been the treatment of choice in the US. Of the 31 RCTs with a suboptimal control arm leading to FDA approval, 1 was reversed due to a subsequent phase 3 trial showing a lack of superiority over the control.12
Of the 122 clinical trials powered for superiority, 17 (14%) had errors in crossover (eTable 2 in the Supplement). Of those, 8 (47%) prespecified crossover in the protocol when crossover was not desirable (ie, crossover to the investigational agent was allowed on disease progression in the control without previous studies or FDA approvals establishing efficacy of the investigational treatment in a latter line of therapy), and 9 (53%) did not prespecify crossover in the protocol when crossover was desirable (ie, crossover to the investigational agent was not allowed despite the established efficacy and/or FDA approval of the investigational agent in a latter line of therapy).
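As a quick arithmetic check of the crossover figures reported above, the following sketch recomputes the percentages from the stated counts. The variable names and the `pct` helper are illustrative only; the counts are those given in the text.

```python
def pct(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


superiority_rcts = 122
allowed_without_efficacy = 8   # crossover prespecified when not desirable
omitted_despite_efficacy = 9   # crossover not prespecified when desirable

crossover_errors = allowed_without_efficacy + omitted_despite_efficacy

print(crossover_errors, pct(crossover_errors, superiority_rcts))
print(pct(allowed_without_efficacy, crossover_errors))
print(pct(omitted_despite_efficacy, crossover_errors))
```

The two error types sum to the 17 trials reported, and the rounded shares agree with the 14%, 47%, and 53% given in the text.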
The primary end point was progression-free survival in 63 of 122 clinical trials (51%), OS (primary or coprimary) in 38 (30%), and an alternative surrogate end point in 23 (19%). Overall survival was either a primary or a secondary end point in 121 RCTs (98%). One was a noninferiority trial with OS as a primary end point.
Of the 122 RCTs powered for superiority, OS was superior in the investigational arm in 65 trials (52%), failed to show advantage in the investigational arm in 36 trials (30%), was not reported at the time of analysis in 19 trials (16%), and was not a prespecified end point in 2 trials (2%), one of which would have been desirable.
Among 187 trials, we found that 125 (67%) had limitations in design, outcome, or both. Specifically, 106 (57%) had limitations in design, 37 (20%) had limitations in outcome, and 18 (10%) had concurrent design and outcome limitations (Figure). The Table summarizes the limitations among RCTs and those without mature OS data as of November 2019.
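The overall count follows from the design and outcome counts by inclusion-exclusion: trials with both kinds of limitation are counted once rather than twice. A minimal sketch of that check, using only the counts reported in this paragraph (variable names are illustrative):

```python
def pct(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


total_trials = 187
design_limited = 106    # nonrandomized, suboptimal control, or crossover error
outcome_limited = 37    # no OS benefit demonstrated
both_limited = 18       # concurrent design and outcome limitations

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
any_limitation = design_limited + outcome_limited - both_limited

print(any_limitation, pct(any_limitation, total_trials))
print(pct(design_limited, total_trials),
      pct(outcome_limited, total_trials),
      pct(both_limited, total_trials))
```

The result, 125 of 187 trials, matches the reported total with at least 1 limitation.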
Our results show that of 187 anticancer drug trials leading to 176 marketing authorizations by the FDA over a 5-year period, 125 (67%) had at least 1 of 4 limitations in design (control arm, crossover, single arm) and/or lack of OS benefit (including 1 noninferiority trial). Our findings raise important concerns.
Nonrandomized clinical trials constitute the basis for one-third of all marketing authorizations. Although results of nonrandomized clinical trials are markers of drug activity, these trials often overestimate treatment efficacy relative to what is observed when the same drugs are tested in RCTs.13,14 Furthermore, when evaluated by value scales (eg, the European Society for Medical Oncology Magnitude of Clinical Benefit Scale), only one-third of single-arm trials were shown to meet the criteria for substantial clinical benefit.15,16 To balance the risk and benefit of early market authorization of investigational agents without proven superiority over standards of care, the accelerated approval pathway was developed by the FDA. Accelerated approval expedites the availability of potentially effective therapies with the requirement to conduct postapproval confirmatory trials. However, we found that approximately one-third of approvals based on nonrandomized clinical trials (21 of 64 [33%]) were regular approvals and not subject to confirmatory efficacy trials. This leaves substantial uncertainty as to their overall benefit over prevailing standards of care. Previous work estimated that only 20% of anticancer drugs receiving accelerated approval are shown to improve survival, although some studies remain ongoing.17
Surrogate end points were common, with 86 of 122 RCTs (70%) having a surrogate end point as the primary end point. Approvals for new drugs based on surrogate end points should be limited to specific circumstances where limited treatment options exist, the possible benefit is high, and the likelihood of harm is low. Overall survival, considered the criterion standard clinical end point, particularly for lethal conditions, was almost always assessed in RCTs (98%). However, approximately one-third (30%) of anticancer drug approvals based on RCTs failed to show a statistically significant improvement in OS. Approximately half of all trials (46%) had either unknown effects on OS or failed to show gains in OS. Our results show similar findings to previous reports, wherein two-thirds of cancer drugs were approved on the basis of a surrogate end point and half were reported to have unknown effects or failed to show gains in OS.2,18,19
Although crossover is often cited as a reason for failure to see an OS gain after an improvement in a surrogate end point, we found that only in the minority of clinical trials (17 of 122 [14%]) could the absence of an OS advantage be due to inappropriate crossover. This finding suggests that either the investigational agents are not effective in improving OS or that the trial was not powered to detect an OS benefit.20 Many of the trials that failed to show an OS benefit were of anticancer drugs used for treating relatively indolent malignant neoplasms for which postprogression therapy or crossover was prevalent (eg, ibrutinib plus rituximab vs placebo plus rituximab for Waldenstrom macroglobulinemia). Another group of approvals that failed to show an OS benefit were of maintenance therapies in which OS can be difficult to measure owing to the use of subsequent lines of therapy (eg, rucaparib maintenance therapy in recurrent ovarian cancer) (eTables 1 and 2 in the Supplement).
The frequency of suboptimal control arms in our study, which used a comprehensive data set that included multiple indications for the same agent, was similar to that in our prior report (25% vs 17%, respectively).3 The use of a substandard control arm may result in a trial that is more likely to be positive (ie, meet its primary end point) but prevents the trial from addressing the clinically relevant question: Is this new drug better than the current standard of care?
Errors in the use of crossover were estimated at approximately 14% of RCTs; that is, trials allowed crossover to the investigational agent without proven efficacy or FDA approval in subsequent lines of therapy or omitted crossover to drugs with established efficacy or FDA approval in subsequent lines of therapy. Allowing patients in the control arm to receive the investigational agent may result in diminution of any effect on OS21 and is often cited as a reason for cancer trials not demonstrating an OS benefit. In our analysis, only half of the errors in crossover were due to crossover to an unapproved intervention (ie, investigational agent without established efficacy in a latter line of therapy). The opposite error, prohibiting crossover to an approved intervention (ie, investigational agent with established efficacy in a latter line of therapy), may lead to an overestimation of the benefit seen with the investigational agent because patients in the control arm are deprived of an accepted salvage therapy. This type of error was seen in half of the cases in our analysis (ie, no protocol-specified crossover design despite it being more appropriate given that the intervention was an FDA-approved drug in the later-line setting).
The FDA has commented on the ethical considerations with regard to crossover and has been supportive of early crossover in RCTs when a surrogate primary end point (eg, progression-free survival, overall response rate) is met.22,23 In such trials, patients in the control arm would be allowed to cross over to the investigational agent after a prespecified analysis demonstrates efficacy in a surrogate (eg, response rate or tumor shrinkage). In our analysis, when a protocol amendment allowed crossover due to interim analysis meeting an efficacy end point, we conservatively considered such crossover appropriate. Yet we note that this strategy may limit the ability of a drug to demonstrate an improvement in OS (if one exists) or alternatively may limit the ability to demonstrate a decrease in survival (harm) that may be a late effect of the investigational agent that both arms of the trial will be exposed to. Finally, crossover is not associated with faster trial enrollment, as some hypothesize.24 Although multiple statistical methods have been developed to model OS in these situations, assuming crossover had not occurred, all such models rely on assumptions regarding the balance of a drug’s on-target and off-target effects, and none of these methods are without their own limitations.25
Our analysis sought to evaluate the presence of clinically relevant limitations of interest in clinical trials leading to marketing authorizations over a 5-year period. Critically addressing limitations during the design of clinical trials can improve the quality of evidence on which we base anticancer drug approvals, decrease erroneous conclusions, and focus more on hard end points (eg, OS). Our findings are complementary to a 2019 analysis26 that evaluated risk of bias in RCTs supporting approvals in Europe between 2014 and 2016 using the Cochrane risk of bias tool, which assesses different domains than our study. The trial limitations we included in our analysis address questions faced by practicing oncologists.
The main limitation of our analysis is subjectivity in the assessment of acceptable standard of care and the appropriateness of the use of crossover. We attempted to limit subjectivity by individually and separately reviewing the guidelines and establishing consensus standard of care for each malignant neoplasm. Furthermore, whether crossover was specified in the protocol was not always reported in the article, especially when crossover was not allowed. In these cases, the protocol was reviewed, when available, and when no mention was found, lack of protocol-specified crossover was assumed. Postprogression therapy was not always reported in the article or the supplement, so non–protocol-specified crossover from the control arm to an agent similar to the investigational agent in the trial (eg, a programmed cell death 1 [PD-1] inhibitor) was not always captured. This made it difficult to interpret the data in light of real-world use of anticancer drugs. For example, in the OCEANS trial,27 crossover to bevacizumab on progression was not allowed, but 38% of patients who progressed in the control arm received bevacizumab off protocol. In addition, other important design flaws that may limit the validity of trial results were not captured in our analysis. For example, in the PACIFIC trial,28 standard imaging techniques such as positron emission tomography/computed tomography and brain magnetic resonance imaging for staging were not done prior to enrolling participants. This may have enriched the trial for patients with undiagnosed stage IV non–small cell lung cancer, some of whom received active therapy in the form of durvalumab while others received placebo. Finally, it is inevitable that others may disagree with our categorization, and we encourage further study.
In this study, we found that 67% of trials that led to FDA approval of anticancer agents had 1 or more limitations that include lack of randomization, lack of significant OS benefit, use of suboptimal control arm, and errors in the use of crossover. These limitations identify trials that do not address the clinically relevant question of whether patients will live longer or better lives if a novel agent is used over the current standard of care. As such, trial design and end point should be carefully addressed prior to enrollment to ensure that new anticancer drugs are superior to what most patients would receive in daily practice.
Accepted for Publication: April 28, 2020.
Corresponding Author: Talal Hilal, MD, Division of Hematology-Oncology, University of Mississippi Medical Center, 2500 N State St, Jackson, MS 39216 (email@example.com).
Published Online: June 15, 2020. doi:10.1001/jamainternmed.2020.2250
Author Contributions: Dr Hilal had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Hilal, Prasad.
Acquisition, analysis, or interpretation of data: Hilal, Gonzalez-Velez.
Drafting of the manuscript: Hilal.
Critical revision of the manuscript for important intellectual content: All authors.
Administrative, technical, or material support: Gonzalez-Velez.
Study supervision: Prasad.
Conflict of Interest Disclosures: Dr Prasad reported receiving research funding from Arnold Ventures; royalties from Johns Hopkins Press and Medscape; honoraria from grand rounds/lectures from universities, medical centers, nonprofits, and professional societies; and consulting fees from UnitedHealthcare; speaking fees from eviCore; and hosting the Plenary Session podcast, which has Patreon backers. No other disclosures were reported.