Response Rate as a Regulatory End Point in Single-Arm Studies of Advanced Solid Tumors | Targeted and Immune Cancer Therapy | JAMA Oncology | JAMA Network
Figure 1.  Forest Plots of Individual Trial Arms for 3 Common Lung Cancer Regimens

Maximum objective response rate (ORR) was calculated from the lower bounds of the 95% confidence interval (CI) around the reported ORR rather than from the reported ORR itself. Mean ORR was calculated as a weighted mean incorporating the size of each trial arm. A, For erlotinib, the maximum ORR of 73%, from a trial studying a molecularly enriched population, is much higher than the mean ORR of 13%, which is heavily weighted by a large expanded access trial (marked with footnote a). B, For docetaxel, the maximum ORR of 25% was derived from the smallest trial (n = 33), such that the lower bound of the CI is much lower than the trial’s reported ORR of 42%. C, For carboplatin plus paclitaxel, the maximum ORR of 31% is calculated from a larger trial with a reported ORR of 35% and a narrow CI, which was used rather than a smaller trial with a higher reported ORR (37%) but a wider CI. Data markers represent ORRs; horizontal lines, the 95% CIs, with marker size reflecting the statistical weight of the study.

Figure 2.  Association of Regulatory Approval With Maximum Objective Response Rate (ORR)

A and B, Across all studies, probability of regulatory approval increases with increased maximum ORR or mean overall ORR. C-F, When limited to single-agent therapies, this relationship is stronger than with combination therapies. C, For single-agent therapies, probability of regulatory approval plateaus above a maximum ORR of approximately 45%. Each panel displays a best-fitting curve obtained by nonparametric logistic regression using local likelihood as implemented in the sm library of R, where the bandwidth was chosen by cross-validation. Solid line indicates estimated probability of regulatory approval as a function of the response rate, and dashed lines, the 95% CIs of the estimated probabilities. Symbols indicating a value of 0 or 1.0 may be slightly offset for visual clarity.

Table.  A Range of Statistical End Points for Single-Arm Trials and the Estimated Test Characteristics for Their Ability to Predict for Regulatory Approval
References

1. Kwak EL, Bang Y-J, Camidge DR, et al. Anaplastic lymphoma kinase inhibition in non-small-cell lung cancer. N Engl J Med. 2010;363(18):1693-1703.
2. Flaherty KT, Puzanov I, Kim KB, et al. Inhibition of mutated, activated BRAF in metastatic melanoma. N Engl J Med. 2010;363(9):809-819.
3. Byrd JC, O’Brien S, James DF. Ibrutinib in relapsed chronic lymphocytic leukemia. N Engl J Med. 2013;369(13):1278-1279.
4. Sherman RE, Li J, Shapley S, Robb M, Woodcock J. Expediting drug development—the FDA’s new “breakthrough therapy” designation. N Engl J Med. 2013;369(20):1877-1880.
5. Hay M, Thomas DW, Craighead JL, Economides C, Rosenthal J. Clinical development success rates for investigational drugs. Nat Biotechnol. 2014;32(1):40-51.
6. Lipson D, Capelletti M, Yelensky R, et al. Identification of new ALK and RET gene fusions from colorectal and lung cancer biopsies. Nat Med. 2012;18(3):382-384.
7. Beadling C, Jacobson-Dunlop E, Hodi FS, et al. KIT gene mutations and copy number in melanoma subtypes. Clin Cancer Res. 2008;14(21):6821-6828.
8. Kwok M, Foster T, Steinberg M. Expedited programs for serious conditions: an update on breakthrough therapy designation. Clin Ther. 2015;37(9):2104-2120.
9. Cavallo J. A look ahead: how the FDA is adapting in the era of precision medicine. ASCO Post. 2013;4(17). http://www.ascopost.com/issues/november-1,-2013/a-look-ahead-how-the-fda-is-adapting-in-the-era-of-precision-medicine.aspx. Accessed February 3, 2016.
10. Kesselheim AS, Wang B, Franklin JM, Darrow JJ. Trends in utilization of FDA expedited drug development and approval programs, 1987-2014: cohort study. BMJ. 2015;351:h4633.
11. Lamont EB, Hayreh D, Pickett KE, et al. Is patient travel distance associated with survival on phase II clinical trials in oncology? J Natl Cancer Inst. 2003;95(18):1370-1375.
12. Oxnard GR, Morris MJ, Hodi FS, et al. When progressive disease does not mean treatment failure: reconsidering the criteria for progression. J Natl Cancer Inst. 2012;104(20):1534-1541.
13. Dahlberg SE, Shapiro GI, Clark JW, Johnson BE. Evaluation of statistical designs in phase I expansion cohorts: the Dana-Farber/Harvard Cancer Center experience. J Natl Cancer Inst. 2014;106(7):dju163.
14. Blumenthal GM, Karuri SW, Zhang H, et al. Overall response rate, progression-free survival, and overall survival with targeted and standard therapies in advanced non-small-cell lung cancer: US Food and Drug Administration trial-level and patient-level analyses. J Clin Oncol. 2015;33(9):1008-1014.
15. Hirsch BR, Califf RM, Cheng SK, et al. Characteristics of oncology clinical trials: insights from a systematic analysis of ClinicalTrials.gov. JAMA Intern Med. 2013;173(11):972-979.
16. Therasse P, Arbuck SG, Eisenhauer EA, et al. New guidelines to evaluate the response to treatment in solid tumors. J Natl Cancer Inst. 2000;92(3):205-216.
17. Eisenhauer EA, Therasse P, Bogaerts J, et al. New response evaluation criteria in solid tumours: revised RECIST guideline (version 1.1). Eur J Cancer. 2009;45(2):228-247.
18. Bowman AW, Azzalini A. Computational aspects of nonparametric smoothing with illustrations from the sm library. Comput Stat Data Anal. 2003;42(4):545-560.
19. Kruskal WH. Ordinal measures of association. J Am Stat Assoc. 1958;53(284):814-861.
20. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837-845.
21. Oxnard GR. Strategies for overcoming acquired resistance to epidermal growth factor receptor: targeted therapies in lung cancer. Arch Pathol Lab Med. 2012;136(10):1205-1209.
22. Jänne PA, Yang JC-H, Kim D-W, et al. AZD9291 in EGFR inhibitor-resistant non-small-cell lung cancer. N Engl J Med. 2015;372(18):1689-1699.
23. Cross DAE, Ashton SE, Ghiorghiu S, et al. AZD9291, an irreversible EGFR TKI, overcomes T790M-mediated resistance to EGFR inhibitors in lung cancer. Cancer Discov. 2014;4(9):1046-1061.
24. Jänne PA, Ou S-HI, Kim D-W, et al. Dacomitinib as first-line treatment in patients with clinically or molecularly selected advanced non-small-cell lung cancer: a multicentre, open-label, phase 2 trial. Lancet Oncol. 2014;15(13):1433-1441.
25. Rajakulendran T, Adam DN. Spotlight on pembrolizumab in the treatment of advanced melanoma. Drug Des Devel Ther. 2015;9:2883-2886.
26. Simon R. Optimal two-stage designs for phase II clinical trials. Control Clin Trials. 1989;10(1):1-10.
27. Molecular Analysis for Therapy Choice (NCI MATCH). 2015. http://dctd.cancer.gov/MajorInitiatives/NCI-MATCH.pdf. Accessed November 10, 2015.
28. Shaw AT, Ou S-HI, Bang Y-J, et al. Crizotinib in ROS1-rearranged non-small-cell lung cancer. N Engl J Med. 2014;371(21):1963-1971.

Original Investigation
June 2016

Response Rate as a Regulatory End Point in Single-Arm Studies of Advanced Solid Tumors

Author Affiliations
  • 1Lowe Center for Thoracic Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts
  • 2Department of Biostatistics and Epidemiology, Memorial Sloan Kettering Cancer Center, New York
  • 3Department of Radiology, Columbia University College of Physicians and Surgeons and New York Presbyterian Hospital, New York
  • 4Duke Cancer Care Research Program, Duke Cancer Institute, Durham, North Carolina
JAMA Oncol. 2016;2(6):772-779. doi:10.1001/jamaoncol.2015.6315
Abstract

Importance  Objective response rate (ORR) is an increasingly important end point for accelerated development of highly active anticancer therapies, yet its relationship to regulatory approval is not well characterized.

Objective  To identify circumstances in which a high ORR is associated with regulatory approval, and therefore might be an appropriate end point for definitive single-arm studies of anticancer therapies.

Data Source  A database of all oncology clinical trials registered at clinicaltrials.gov between October 1, 2007, and September 30, 2010.

Study Selection  Trials of palliative systemic therapies for 4 measurable solid tumor types, limited to those with trial arms of at least 20 patients reporting ORR per Response Evaluation Criteria in Solid Tumors (RECIST).

Data Extraction and Synthesis  A systematic search was used to identify the reported ORR for each eligible treatment arm that had been presented publicly.

Main Outcomes and Measures  For each treatment regimen, defined as a single-agent or unique combination of agents for 1 cancer type, the mean ORR and the maximum ORR statistically exceeded were calculated, and their association with regulatory approval was studied. A regimen was considered approved for a specific cancer type if it had received regulatory approval in any country for treatment of advanced cancer of that type.

Results  From 1800 trials, 874 eligible trial arms in 578 eligible trials were identified; 542 arms had ORR data available for 294 regimens. Maximum ORR and mean ORR were significantly associated with regulatory approval (τ = 0.27, P < .001; τ = 0.12, P = .01); this relationship was stronger for single-agent therapies (τ = 0.49; τ = 0.41) than for combination regimens (τ = 0.28; τ = 0.17). Evaluation of ORR thresholds between 20% and 60% as potential trial end points demonstrated that ORR statistically exceeding 30% with a single agent had 98% specificity and 89% positive predictive value for identifying regimens achieving regulatory approval.

Conclusions and Relevance  For single-agent regimens, high ORR was associated with regulatory approval; this relationship was less strong for combination regimens. Our data suggest that high ORR (eg, statistically exceeding an ORR of 30%) is an appropriate end point for single-arm trials aiming to demonstrate breakthrough activity of a single-agent anticancer therapy.

Introduction

Drug development in oncology has undergone fundamental changes in recent years. The emergence of highly active targeted therapies for subsets of patients with advanced, refractory cancers is evidence of the oncology community’s improved ability to translate advances in cancer science into better therapies for patients.1-3 Yet the challenges of drug development in oncology are well known, with a typical clinical development cycle estimated to take 7 years,4 and with fewer than half of those drugs achieving phase 3 success.5 Adding to the challenge, many molecularly defined cancer subtypes are uncommon, such that large definitive trials can be time consuming and potentially infeasible. For example, KIT-mutant melanoma and RET-rearranged lung adenocarcinoma each are estimated to occur in fewer than 1000 US patients annually.6,7 Because testing for these genotypes is uncommon in most clinical practices, only a small subset of these patients is ever identified, so the apparent clinical incidence is even lower.

Toward the aim of facilitating regulatory review of highly active therapies for serious conditions like cancer, the US Food and Drug Administration (FDA) established in 2012 the “breakthrough therapy” designation.4,8 As of March 2015, a total of 19 drugs had been approved with the breakthrough therapy designation, with 9 of these (47%) being anticancer therapies.8 An increasingly discussed principle for accelerated development of highly active therapies is the potential for single-arm trials to be adequate to support regulatory approval,9 although confirmatory trials must verify clinical benefit from such therapies to obtain full FDA approval.10 Design of single-arm trials demonstrating breakthrough activity is fraught with uncertainty due to the lack of a clear comparator group and the potential for selection bias, particularly with the favorable treatment outcomes reported for patients traveling to academic cancer centers.11

Objective response rate (ORR) on tumor imaging is one intuitive end point for single-arm trial design given its historical role in drug development. Objective response rate, historically calculated using a succession of interrelated response criteria,12 has been used since 1980 in phase 2 trials of advanced solid-tumor malignant neoplasms to screen for active treatment regimens. Because most solid tumors are relatively resistant to chemotherapy, a common phase 2 trial design is to study 20 to 50 patients toward the aim of demonstrating an “interesting” ORR and statistically exceeding an “uninteresting” ORR. Disproving the null hypothesis of a low ORR (eg, 5%-15%) could indicate a role for further study in larger trials. However, such single-arm trials studying ORR have not historically been used as definitive studies to support regulatory approval, and the optimal statistical design of such definitive single-arm studies has not been established. Increasingly, phase 1 trials are including expansion cohorts that double as single-arm studies of efficacy: in a recent institutional analysis of 522 phase 1 trials over 3 decades, sample size was shown to have increased dramatically since 2005.13 And yet, among 60 phase 1 trials carried out in 2011, the majority of protocols had no statistical justification for planned expansion cohorts despite a mean sample size of 30 patients.13

Toward the aim of studying appropriate statistical end points for definitive single-arm trials, we characterized the relationship between ORR and regulatory approval of anticancer therapies for common solid tumors. We chose to study therapies that had achieved regulatory approval because such agents, having completed a comprehensive regulatory review, are likely to harbor characteristics on which future drug development should be patterned. Specifically, we aimed to identify circumstances in which a high ORR is associated with regulatory approval and therefore might be an appropriate end point for definitive single-arm studies.


Key Points

  • Question: Under what circumstances is a high objective response rate (ORR) associated with regulatory approval of an anticancer regimen?

  • Findings: In this analysis of 1800 trials in advanced solid tumors, increased ORR was found to be significantly associated with likelihood of regulatory approval. This relationship was strongest for single-agent regimens, where 89% of regimens statistically exceeding an ORR of 30% had achieved regulatory approval.

  • Meaning: High ORR (eg, statistically exceeding an ORR of 30%) is an appropriate end point for single-arm trials aiming to demonstrate breakthrough activity of a single-agent anticancer therapy.

Methods

This analysis was designed to study the relationship between drug activity and eventual regulatory approval. Rather than studying conventional regulatory end points such as overall survival (OS) or progression-free survival (PFS), as has been done previously,14 this analysis focused solely on ORR, which is reported in most trials of solid tumors using well-established criteria. To generate an unbiased index of prospective trials over a multiyear period, we queried the Aggregate Analysis of ClinicalTrials.gov (AACT) database from the Clinical Trials Transformation Initiative at Duke University.15 The AACT database includes trials registered at ClinicalTrials.gov from 2007 onward, the year clinical trial registration became required by the FDA. Each arm of multiarm trials was then analyzed separately as a single-arm trial to study the relationship between ORR and regulatory approval in that cancer type. By including control and experimental arms from randomized studies, results for both conventional therapies and newer therapies could be studied in a single-arm fashion.

Data Collection

As part of a prior analysis of the AACT database, 40 970 interventional trials registered between October 1, 2007, and September 30, 2010, were reviewed to identify 8942 trials related to oncology.15 This cohort was further limited to trials studying non–small-cell lung cancer (NSCLC), colorectal cancer, melanoma, and renal cell cancer (RCC) to study a relatively homogeneous population of chemoresistant solid tumors easily assessed using computed tomographic imaging. Solid tumors that are more difficult to measure (eg, ovarian or pancreatic cancer) or that generally metastasize to unmeasurable bony disease (eg, prostate or breast cancer) were excluded. The analysis was confined to trials of palliative systemic therapy for advanced disease; trials studying local therapies, maintenance-only therapies, or cell-based therapies were excluded. Trials with fewer than 20 patients of an eligible cancer type were excluded as too small for meaningful statistical analysis. All arms of a multiarm trial (randomized or nonrandomized) were separated for study as individual single-arm studies. Arms with fewer than 20 patients of an eligible cancer type were excluded, as were arms studying multiple cancer types together (ie, many phase 1 trials), or arms not specifying a single treatment regimen.

A systematic search method was used to identify data that had been reported publicly to screen for eligible trials and collect ORR data; to overcome publication bias, data that had been reported publicly but not published were included in the analysis. A sequential list of resources was queried for results on each potentially eligible trial: (1) ClinicalTrials.gov, (2) PubMed, and (3) abstract/presentation libraries from major oncology conferences. Finally, if results were not available from these sources, the principal investigator of the trial was personally emailed to request any reported data that had been missed in the aforementioned search. For each trial arm, ORR per the Response Evaluation Criteria in Solid Tumors (RECIST) was recorded.16,17 Results based on either RECIST version 1.0 or version 1.1 were considered acceptable, but response data based on any other criteria or “modified” RECIST criteria were excluded. If more than 1 ORR was reported, the ORR in the intent-to-treat population was selected for analysis.

For each trial arm, the regimen used was assigned a regimen ID that represented a unique combination of anticancer agents used for a specific cancer type, irrespective of the line of therapy or dosing strategy. Thus, erlotinib hydrochloride in NSCLC was considered 1 regimen regardless of the line of therapy or patient population studied; similarly, fluorouracil with irinotecan in colorectal cancer was considered 1 regimen regardless of the specific dose or delivery method used. A regimen was defined as having received regulatory approval if it was approved for commercial use in any country worldwide. This was determined first by a review of the websites of the US FDA and the European Medicines Agency; an unbiased Internet search was then used to screen for regimens with regulatory approval only in specific countries. The data set was locked as of April 1, 2015.

Statistical Analysis

For each unique regimen identified in the aforementioned search, 2 summary statistics were calculated across all trial arms studying that regimen in a given cancer type. First, the mean ORR was calculated to gauge activity broadly across all patients treated, and this was calculated in a weighted fashion across all trial arms. Specifically, mean ORR was defined as the proportion of all responders out of all evaluable patients across all trial arms. Second, the maximum ORR statistically exceeded was calculated to account for the possibility of increased activity in certain clinical settings; for statistical rigor, this was calculated from the lower bounds of the 95% confidence interval (CI) of the reported ORR rather than from the reported ORR itself. Using the 95% CI from each trial arm, maximum ORR was defined as the highest of the lower bounds, representing the null hypothesis (or “uninteresting” ORR), which was statistically excluded (eMethods in the Supplement). A relationship was then investigated between each of these summary statistics and regulatory approval as a dichotomous variable using nonparametric regression analysis.18 Kendall τ was used to assess the strength of this relationship.19 Analyses were stratified to compare single-agent vs combination regimens. Receiver operating characteristic curves were estimated using the empirical method and compared with a U test.20 All analyses were performed using R, version 3.0 (http://www.r-project.org).
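The two summary statistics described above can be illustrated with a minimal sketch. It assumes a Wilson score interval for the 95% CI, because the interval method is given only in the eMethods and is not restated here; the function names and the trial-arm counts are hypothetical and are not data from this study.

```python
from math import sqrt

def wilson_interval(responders, n, z=1.96):
    """Two-sided 95% Wilson score interval for a proportion (assumed CI method)."""
    p = responders / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

def summarize_regimen(arms):
    """arms: list of (responders, evaluable) tuples, one per trial arm."""
    total_responders = sum(r for r, _ in arms)
    total_evaluable = sum(n for _, n in arms)
    # Mean ORR: all responders over all evaluable patients (size-weighted mean).
    mean_orr = total_responders / total_evaluable
    # Maximum ORR statistically exceeded: highest lower CI bound across arms.
    max_orr_exceeded = max(wilson_interval(r, n)[0] for r, n in arms)
    return mean_orr, max_orr_exceeded

# Hypothetical regimen studied in 3 trial arms (made-up counts).
arms = [(14, 33), (70, 200), (5, 50)]
mean_orr, max_orr = summarize_regimen(arms)
print(f"mean ORR = {mean_orr:.3f}, maximum ORR exceeded = {max_orr:.3f}")
```

In this made-up example the larger arm, despite a lower reported ORR, supplies the maximum ORR statistically exceeded because its CI is narrower than that of the small arm.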

Results
Trial Characteristics

From 8942 oncology trials in the AACT oncology database, we identified 1800 trials involving NSCLC, colorectal cancer, melanoma, or RCC. We excluded 1222 trials for the following reasons: 321 studied the wrong cancer diagnosis (eg, multiple cancer types in a phase 1 trial) or a noncancer diagnosis (eg, cancer screening), 203 studied patients without advanced disease, 310 enrolled fewer than 20 patients, and 388 studied ineligible therapies or end points (eFigure 1 in the Supplement). The remaining 578 eligible trials included 913 different treatment arms. Of these, 39 arms were excluded because enrollment was fewer than 20 patients or an ineligible therapy was studied, resulting in a cohort of 874 trial arms eligible for analysis. Of these trial arms, 542 reported ORR per RECIST and were chosen as the cohort for the planned analysis (eTable 1 in the Supplement).

Non–small-cell lung cancer was studied in the largest proportion of eligible trial arms (46%), while RCC and melanoma were studied in only 13%. The majority of trial arms were registered as phase 2 studies (60%) or as phase 3 studies (22%). Across the eligible trial arms, median enrollment was 52 (range, 20-5394; interquartile range, 36-98). Eligible trial arms studied 294 unique treatment regimens, of which 28% were single-agent and 72% were combination therapies. Fifteen percent of these treatment regimens had received regulatory approval somewhere worldwide at the time of analysis (30% of single-agent regimens, 9% of combination regimens); none of the regimens with FDA approval had accelerated approval only; all had full approval. Comparing the 542 analyzed trial arms with ORR available with the 332 eligible trial arms without ORR available, there were several statistically significant differences (eTable 1 in the Supplement), including a greater number of approved regimens with ORR reported and a greater proportion of phase 3 trials with ORR reported, likely representing a reporting bias favoring positive trial results.

Of the 294 regimens studied, 212 (72%) were used in a single trial arm; therefore, the mean ORR was equal to the reported ORR, and the maximum ORR statistically exceeded was equal to the lower bound of the 95% CI. Of the remaining 82 regimens, 65 were studied in 2 to 5 trial arms and 17 in 6 or more trial arms. Figure 1 presents an example of forest plots from 3 commonly studied NSCLC regimens, including the reported ORR and CIs for each trial arm; also presented are the calculations for mean ORR and maximum ORR statistically exceeded for each regimen. In some instances, the maximum ORR was much higher than the mean ORR due to trials in select populations (eg, patients with certain tumor genotypes), resulting in a higher ORR.

Relationship to Regulatory Approval

There was a statistically significant relationship between increased ORR and increased likelihood of regulatory approval across all regimens (τ = 0.27, P < .001 for maximum ORR; τ = 0.12, P = .01 for mean ORR) (Figure 2A and B, respectively). This relationship was relatively linear, with the highest likelihood of regulatory approval seen in regimens with a high maximum ORR. Studying single-agent regimens and combination regimens separately, important differences can be seen. For single-agent regimens, the relationship is stronger (τ = 0.49, P < .001 for maximum ORR; τ = 0.41, P < .001 for mean ORR) (Figure 2C and D, respectively) and there appears to be a late plateau; this is seen most strikingly with maximum ORR, in which the likelihood of regulatory approval increases in a linear fashion until an ORR of 45% and then levels off at a high likelihood. The relationship is less strong for combination regimens (τ = 0.28, P < .001 for maximum ORR; τ = 0.17, P = .003 for mean ORR) (Figure 2E and F, respectively) and an early plateau is seen, with likelihood of regulatory approval increasing more steeply above an ORR of 30% but never achieving the high likelihood seen with single-agent regimens.
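As a rough illustration of how a Kendall τ can be computed between a continuous ORR summary and a dichotomous approval variable, the sketch below implements the tie-adjusted τ-b variant. The choice of τ-b is an assumption (the text does not state which variant was used), and the six regimens are hypothetical values, not study data.

```python
from math import sqrt

def kendall_tau_b(x, y):
    """Kendall tau-b: pairwise concordance with an adjustment for ties."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: contributes nothing
            elif dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denom if denom else float("nan")

# Hypothetical regimens: maximum ORR exceeded, and approval (1) or not (0).
max_orr = [0.05, 0.12, 0.20, 0.33, 0.48, 0.55]
approved = [0, 0, 0, 1, 0, 1]
tau = kendall_tau_b(max_orr, approved)
print(f"tau-b = {tau:.3f}")
```

Because approval takes only 2 values, many pairs are tied on that variable, which is why a tie-adjusted statistic is used in this sketch.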

We studied the diagnostic characteristics of maximum ORR and mean ORR as metrics for predicting regulatory approval, asking whether it is better to develop a drug that is broadly active in all populations studied or highly active in just a subset of trials. Studying single-agent regimens and combination regimens separately, receiver operating characteristic curves for both metrics demonstrated a high area under the curve (eFigure 2 in the Supplement). For both types of regimens, maximum ORR performed better as a diagnostic tool than mean ORR (P = .004 for single agents and P < .001 for combinations). This highlights that a drug development strategy can be successful even if only 1 of numerous trials is able to demonstrate a high ORR.

To assist in future single-arm trial design, we evaluated a series of candidate ORRs (in 5% intervals from 20% to 60%), each of which could be used as a statistical aim (or null hypothesis) in the development of a “breakthrough therapy.” We then studied each of these as a diagnostic test both for single-agent regimens and combination regimens, calculating the sensitivity and specificity when applied to maximum ORR exceeded. As shown in the Table, specificity and positive predictive value (PPV) reached 98% and 89% for single-agent regimens statistically exceeding an ORR of 30%, and were both equal to 100% for single agents statistically exceeding an ORR of 45%. These high-ORR regimens included some agents studied using a biomarker enrichment strategy (eg, crizotinib in NSCLC) and some not (eg, axitinib in RCC) (eTable 2 in the Supplement). In contrast, no predictive ORR threshold was identified for combination regimens, with a maximum PPV of 47% to 60%. Sensitivity was generally poor across all thresholds for both types of regimens, highlighting that high ORR is not a sensitive metric for identifying all regimens that can achieve regulatory approval.
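The threshold evaluation described above treats "maximum ORR exceeded is above a candidate threshold" as a diagnostic test for regulatory approval. A minimal sketch of that calculation follows; the function name and the list of regimens are hypothetical, and the strict inequality is an assumption about how "statistically exceeding" is operationalized.

```python
def threshold_diagnostics(regimens, threshold):
    """regimens: list of (max_orr_exceeded, approved) pairs; approved is a bool."""
    tp = sum(1 for orr, ok in regimens if orr > threshold and ok)
    fp = sum(1 for orr, ok in regimens if orr > threshold and not ok)
    fn = sum(1 for orr, ok in regimens if orr <= threshold and ok)
    tn = sum(1 for orr, ok in regimens if orr <= threshold and not ok)
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else nan,  # approved regimens flagged
        "specificity": tn / (tn + fp) if tn + fp else nan,  # unapproved regimens not flagged
        "ppv": tp / (tp + fp) if tp + fp else nan,          # flagged regimens that were approved
    }

# Hypothetical single-agent regimens (made-up values).
regimens = [
    (0.50, True), (0.35, True), (0.32, False), (0.10, True),
    (0.08, False), (0.15, False), (0.22, False), (0.05, True),
]
result = threshold_diagnostics(regimens, 0.30)
print(result)
```

Repeating the call over candidate thresholds from 0.20 to 0.60 in steps of 0.05 would give one row of test characteristics per threshold, in the spirit of the Table.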

Discussion

Scientific advances in the past decade have led to the discovery of targeted anticancer agents with unprecedented single-agent activity in select cancer subpopulations. The importance of testing these new drugs in optimal patient populations has forced investigators to design more efficient trials. Phase 1 trials, traditionally used to study toxicity and pharmacokinetics, are increasingly doubling as phase 2 trials to demonstrate efficacy. In one recent example, the first-in-humans study of osimertinib, a novel epidermal growth factor receptor (EGFR) kinase inhibitor, studied patients with NSCLC with resistance to EGFR kinase inhibitors, a setting in which trials have historically been unsuccessful.21 Yet osimertinib induced responses in the first dosing cohort (20 mg)—a result made feasible through strong preclinical rationale and a biomarker-focused trial design.22,23 As investigators design trials aimed at rapid drug development, often in enriched populations, single-arm studies (sometimes performed as expansion cohorts) will and should play an increasingly important role. However, effective statistical design of such single-arm studies is not well established and is often overlooked.13

In this analysis of 542 trial arms across 4 different advanced solid tumors, we find that historical precedent supports statistically exceeding a high ORR as an appropriate trial end point for highly active single-agent therapies. The rate of regulatory approval for single-agent regimens was 89% for those with a maximum ORR statistically exceeding 30% and was 100% for those with a maximum ORR statistically exceeding 45%. The highest ORR seen with a single agent not receiving regulatory approval was 54% (95% CI, 43%-65%) with dacomitinib in a phase 2 trial in NSCLC,24 a disease for which several other similar agents (gefitinib, erlotinib, afatinib dimaleate, icotinib) are available. Other single-agent regimens exhibiting a relatively high ORR but not achieving regulatory approval included vascular endothelial growth factor inhibitors in RCC, a space in which multiple therapies already exist, and heat shock protein 90 inhibitors, a class of drugs that have moderate toxic effects and no established biomarker predicting for activity.

Whereas high ORR has a high PPV for single-agent regimens, high ORR is not a sensitive test for identification of all approved agents. Half of the single agents in this series with regulatory approval had no trial with an ORR that statistically exceeded 20%. This is in part an inherent limitation of our focus on ORR, which captures just 1 characteristic of a drug without considering others that would be considered during the regulatory approval process such as preclinical and mechanistic data, toxicity data, and data on long-term outcomes. Some drugs exhibiting a lower ORR can be very active, potentially harboring cytostatic anticancer activity not fully captured by RECIST response assessment. One recent example is pembrolizumab, which received accelerated approval for treatment of advanced melanoma after showing an ORR of 24% in 89 patients (95% CI, 15%-34%)25; whereas the ORR was not extremely high and the CI was broad, the favorable toxicity profile and prolonged duration of response supported the approval decision. In general, however, drugs with a lower ORR will be more likely to need randomized trials to achieve regulatory approval.

For combination regimens, the relationship between ORR and regulatory approval was different from that for single agents. A key challenge in the development of combination therapies is the ability to isolate the effect of an individual component of the combination, which can be difficult using a single-arm trial. Supporting this point, we found that combination therapies achieving high ORRs often do not receive regulatory approval. Whereas a high ORR may be adequate for regulatory approval of a single-agent therapy, a combination therapy is more likely to need randomized development to demonstrate how the addition of a second agent improves a time-to-event end point such as PFS or OS.

Our data lead us to consider an alternate design for some single-arm trials, somewhat different from the phase 2 design made ubiquitous by Simon in 1989.26 In this landmark article, Simon describes phase 2 trials intended to “determine whether [a therapy] has sufficient biologic activity… to warrant more extensive development,”26(p1) a setting in which he suggests that statistically exceeding a null hypothesis of 5% to 10% ORR is an appropriate end point. This is, for example, the statistical approach used by the National Cancer Institute’s MATCH trial, in which basket trials of targeted therapies enroll cancers harboring rare genotypes, treating approximately 30 patients to rule out a null hypothesis of 5% ORR.27 One challenge with this approach is that if a high ORR is identified, a confirmatory trial is still needed for regulatory approval. This is the challenge facing crizotinib for ROS1-rearranged NSCLC—even with a 72% ORR (95% CI, 58%-84%) in 50 patients enrolled to an expansion cohort of the phase 1 crizotinib trial,28 the path to full FDA approval would historically require a randomized trial demonstrating improved PFS, likely infeasible in this rare genotype–defined disease. Our results point to an alternate single-arm study design for this setting, aimed at confirming breakthrough activity rather than screening for biologic activity. Such a trial could treat just 30 patients with ROS1-rearranged NSCLC and, with a true ORR of 70%, would have 86% power to rule out a null hypothesis of 45% ORR. Given the historical precedent that agents with such a high ORR routinely achieve regulatory approval and demonstrate improved PFS or OS, proving a high ORR in such a hypothesis-driven trial has the potential to be an efficient alternative for drug development in rare cancer populations.
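The power of the proposed 30-patient design can be approximated with an exact binomial calculation. The sketch below assumes a single-stage design with a 1-sided α of 0.05; the test and α level are not stated in the text, so the result is illustrative and need not reproduce the quoted 86% exactly. Function names are hypothetical.

```python
from math import comb

def upper_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def exact_power(n, p_null, p_true, alpha=0.05):
    """Smallest response count rejecting the null at 1-sided alpha, and its power."""
    for k in range(n + 2):
        # The tail probability under the null shrinks as k grows; k = n + 1 gives 0.
        if upper_tail(n, p_null, k) <= alpha:
            return k, upper_tail(n, p_true, k)

# Design from the text: 30 patients, null ORR of 45%, true ORR of 70%.
critical, power = exact_power(30, 0.45, 0.70)
print(f"reject the null with >= {critical} responses; power = {power:.2f}")
```

Under these assumptions the null hypothesis of 45% ORR is rejected once a minimum number of responses among the 30 patients is observed, and the power is the probability of reaching that count when the true ORR is 70%.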

There are limitations to our analysis. While we studied an unbiased cohort of registered trials within a 4-year period, this database omitted older and newer trials. Many of those older trials would provide important context regarding the development of established regimens, but they cannot be sampled in an unbiased fashion because ClinicalTrials.gov registration was not mandated until 2007. However, by including control arms from larger randomized studies, our analysis included data on many older regimens. Additionally, newer trials of targeted therapies could have provided greater insight into the kinds of ORR that are associated with regulatory approval, yet many newer therapies such as immune checkpoint inhibitors have immature trial data and are still undergoing regulatory submission or have received only more limited, accelerated approval. Therefore, we believe that the cohort of trials studied is representative of the overall population of systemic anticancer therapies. We did restrict this analysis to common and easily measured solid tumors; further study will be needed before these data can be applied to cancers such as breast cancer, prostate cancer, and ovarian cancer, which can have a substantial burden of disease that is more difficult to measure on computed tomography.

Conclusions

We present a large and unbiased analysis of the relationship between tumor response and regulatory approval. For single-agent regimens, maximum ORR statistically exceeding 30% to 45% was associated with regulatory approval, with positive predictive values in the range of 89% to 100%. This suggests that high ORR is an appropriate statistical end point for single-arm trials aiming to demonstrate breakthrough activity of a single agent.
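To illustrate what it means for an ORR to statistically exceed a threshold, the sketch below computes an exact two-sided 95% CI for a single-arm response rate and compares its lower bound with the 30% and 45% thresholds. It assumes a Clopper-Pearson interval, which may not be the interval method used in this analysis, and the helper names are hypothetical. The example counts, 36 responders among 50 patients, correspond to the 72% ORR reported for crizotinib in ROS1-rearranged NSCLC.

```python
from math import comb


def upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def solve_increasing(f, target: float, tol: float = 1e-9) -> float:
    """Bisection on [0, 1] for f(p) = target, where f is increasing in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, conf: float = 0.95) -> tuple[float, float]:
    """Exact (Clopper-Pearson) CI for x responders among n patients.

    Assumption: this interval method is chosen for illustration only.
    """
    alpha = 1 - conf
    # Lower bound: the p at which P(X >= x | p) equals alpha / 2.
    lower = 0.0 if x == 0 else solve_increasing(
        lambda p: upper_tail(x, n, p), alpha / 2)
    # Upper bound: the p at which P(X <= x | p) equals alpha / 2,
    # equivalently P(X >= x + 1 | p) equals 1 - alpha / 2.
    upper = 1.0 if x == n else solve_increasing(
        lambda p: upper_tail(x + 1, n, p), 1 - alpha / 2)
    return lower, upper


if __name__ == "__main__":
    lower, upper = clopper_pearson(36, 50)
    print(f"ORR 36/50 = {36 / 50:.0%}, 95% CI {lower:.1%} to {upper:.1%}")
    for threshold in (0.30, 0.45):
        print(f"Lower bound exceeds {threshold:.0%}: {lower > threshold}")
```

The comparison uses the lower bound rather than the point estimate, so a small trial with a high observed ORR but a wide interval does not clear the threshold.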

Article Information

Accepted for Publication: December 16, 2015.

Corresponding Author: Lawrence H. Schwartz, MD, Department of Radiology, Columbia University College of Physicians and Surgeons, 180 Ft Washington Ave, Ste 320, New York, NY 10032 (lschwartz@columbia.edu).

Published Online: February 25, 2016. doi:10.1001/jamaoncol.2015.6315.

Author Contributions: Drs Oxnard and Schwartz had access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.

Study concept and design: Oxnard, Schwartz.

Acquisition, analysis, or interpretation of data: Oxnard, Wilcox, Gonen, Polotsky, Hirsch, Schwartz.

Drafting of the manuscript: Oxnard, Wilcox, Polotsky, Schwartz.

Critical revision of the manuscript for important intellectual content: Oxnard, Gonen, Hirsch, Schwartz.

Statistical analysis: Gonen, Polotsky.

Obtained funding: Oxnard.

Administrative, technical, or material support: Hirsch, Schwartz.

Study supervision: Oxnard, Schwartz.

Conflict of Interest Disclosures: Dr Oxnard has received consulting fees from AstraZeneca, Ariad, and Clovis Oncology and honoraria from AstraZeneca, Boehringer Ingelheim, and Chugai. Dr Hirsch has received consulting fees from Pfizer, Astellas, Genentech, Sandoz, and Hospira. Dr Schwartz has received consulting fees from Novartis, Celgene, Icon, and Bioclinica. No other disclosures are reported.

Funding/Support: This research was supported in part by the Stading-Younger Cancer Research Foundation and the National Cancer Institute of the National Institutes of Health (U01-CA140207).

Role of the Funder/Sponsor: These funding sources had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

References
1. Kwak EL, Bang Y-J, Camidge DR, et al. Anaplastic lymphoma kinase inhibition in non-small-cell lung cancer. N Engl J Med. 2010;363(18):1693-1703.
2. Flaherty KT, Puzanov I, Kim KB, et al. Inhibition of mutated, activated BRAF in metastatic melanoma. N Engl J Med. 2010;363(9):809-819.
3. Byrd JC, O’Brien S, James DF. Ibrutinib in relapsed chronic lymphocytic leukemia. N Engl J Med. 2013;369(13):1278-1279.
4. Sherman RE, Li J, Shapley S, Robb M, Woodcock J. Expediting drug development—the FDA’s new “breakthrough therapy” designation. N Engl J Med. 2013;369(20):1877-1880.
5. Hay M, Thomas DW, Craighead JL, Economides C, Rosenthal J. Clinical development success rates for investigational drugs. Nat Biotechnol. 2014;32(1):40-51.
6. Lipson D, Capelletti M, Yelensky R, et al. Identification of new ALK and RET gene fusions from colorectal and lung cancer biopsies. Nat Med. 2012;18(3):382-384.
7. Beadling C, Jacobson-Dunlop E, Hodi FS, et al. KIT gene mutations and copy number in melanoma subtypes. Clin Cancer Res. 2008;14(21):6821-6828.
8. Kwok M, Foster T, Steinberg M. Expedited programs for serious conditions: an update on breakthrough therapy designation. Clin Ther. 2015;37(9):2104-2120.
9. Cavallo J. A look ahead: how the FDA is adapting in the era of precision medicine. ASCO Post. 2013;4(17). http://www.ascopost.com/issues/november-1,-2013/a-look-ahead-how-the-fda-is-adapting-in-the-era-of-precision-medicine.aspx. Accessed February 3, 2016.
10. Kesselheim AS, Wang B, Franklin JM, Darrow JJ. Trends in utilization of FDA expedited drug development and approval programs, 1987-2014: cohort study. BMJ. 2015;351:h4633.
11. Lamont EB, Hayreh D, Pickett KE, et al. Is patient travel distance associated with survival on phase II clinical trials in oncology? J Natl Cancer Inst. 2003;95(18):1370-1375.
12. Oxnard GR, Morris MJ, Hodi FS, et al. When progressive disease does not mean treatment failure: reconsidering the criteria for progression. J Natl Cancer Inst. 2012;104(20):1534-1541.
13. Dahlberg SE, Shapiro GI, Clark JW, Johnson BE. Evaluation of statistical designs in phase I expansion cohorts: the Dana-Farber/Harvard Cancer Center experience. J Natl Cancer Inst. 2014;106(7):dju163.
14. Blumenthal GM, Karuri SW, Zhang H, et al. Overall response rate, progression-free survival, and overall survival with targeted and standard therapies in advanced non-small-cell lung cancer: US Food and Drug Administration trial-level and patient-level analyses. J Clin Oncol. 2015;33(9):1008-1014.
15. Hirsch BR, Califf RM, Cheng SK, et al. Characteristics of oncology clinical trials: insights from a systematic analysis of ClinicalTrials.gov. JAMA Intern Med. 2013;173(11):972-979.
16. Therasse P, Arbuck SG, Eisenhauer EA, et al. New guidelines to evaluate the response to treatment in solid tumors. J Natl Cancer Inst. 2000;92(3):205-216.
17. Eisenhauer EA, Therasse P, Bogaerts J, et al. New response evaluation criteria in solid tumours: revised RECIST guideline (version 1.1). Eur J Cancer. 2009;45(2):228-247.
18. Bowman AW, Azzalini A. Computational aspects of nonparametric smoothing with illustrations from the sm library. Comput Stat Data Anal. 2003;42(4):545-560.
19. Kruskal WH. Ordinal measures of association. J Am Stat Assoc. 1958;53(284):814-861.
20. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837-845.
21. Oxnard GR. Strategies for overcoming acquired resistance to epidermal growth factor receptor: targeted therapies in lung cancer. Arch Pathol Lab Med. 2012;136(10):1205-1209.
22. Jänne PA, Yang JC-H, Kim D-W, et al. AZD9291 in EGFR inhibitor-resistant non-small-cell lung cancer. N Engl J Med. 2015;372(18):1689-1699.
23. Cross DAE, Ashton SE, Ghiorghiu S, et al. AZD9291, an irreversible EGFR TKI, overcomes T790M-mediated resistance to EGFR inhibitors in lung cancer. Cancer Discov. 2014;4(9):1046-1061.
24. Jänne PA, Ou S-HI, Kim D-W, et al. Dacomitinib as first-line treatment in patients with clinically or molecularly selected advanced non-small-cell lung cancer: a multicentre, open-label, phase 2 trial. Lancet Oncol. 2014;15(13):1433-1441.
25. Rajakulendran T, Adam DN. Spotlight on pembrolizumab in the treatment of advanced melanoma. Drug Des Devel Ther. 2015;9:2883-2886.
26. Simon R. Optimal two-stage designs for phase II clinical trials. Control Clin Trials. 1989;10(1):1-10.
27. Molecular Analysis for Therapy Choice (NCI MATCH). 2015. http://dctd.cancer.gov/MajorInitiatives/NCI-MATCH.pdf. Accessed November 10, 2015.
28. Shaw AT, Ou S-HI, Bang Y-J, et al. Crizotinib in ROS1-rearranged non-small-cell lung cancer. N Engl J Med. 2014;371(21):1963-1971.