Figure 1. The only data required to perform the adjustment are Cox HRs with CIs and survival probabilities excerpted from Kaplan-Meier survival curves. Cox-TEL indicates Cox proportional hazards–Taylor expansion adjustment for long-term survival data.
Figure 2. Data are from 285 patients with surgically resected melanoma within an 8-year follow-up period. IFN indicates interferon alfa-2b.
eMethods. Details for the proposed Cox-TEL adjustment method.
eSimulations. Detailed simulation settings and results.
eDiscussion. Selection of t′js for survival probabilities for Cox-TEL and detailed discussion for limitations of Cox-TEL.
eFigure 1. Survival curves of the two treatment arms in the four scenarios.
eFigure 2. Box plots for estimated HRs with sample sizes n0 and n1 in the four scenarios.
eFigure 3. Box plots for estimated HRs with sample sizes 2n0 and 2n1 in the four scenarios.
eFigure 4. Box plots for estimated DPs with sample sizes n0 and n1 in the four scenarios.
eFigure 5. Box plots for estimated DPs with sample sizes 2n0 and 2n1 in the four scenarios.
eFigure 6. Probability of rejecting HR = 1 in the four scenarios.
eFigure 7. Probability of rejecting DP = 0 in the four scenarios.
eTable 1. Treatment effects in the four scenarios for short-term survivors and long-term survivors in both arms.
eTable 2. Hazard ratios between short-term survivors and differences in proportions of long-term survivors with sample sizes n0 and n1.
eTable 3. Hazard ratios between short-term survivors and differences in proportions of long-term survivors with sample sizes 2n0 and 2n1.
eTable 4. Performance of the proposed method when π1 = 0.6, 0.5, 0.4, 0.3 (the proportion of long-term survivors in arm 1 increases) and π0 = 0.9 with all other parameters as in scenario 1 and sample sizes n0 and n1.
eTable 5. Ratios of the distance between the largest observation time and the largest uncensored observation time to the largest uncensored observation time across different median survival settings.
Hsu C, Lin EP, Shyr Y. Development and Evaluation of a Method to Correct Misinterpretation of Clinical Trial Results With Long-term Survival. JAMA Oncol. 2021;7(7):1041–1044. doi:10.1001/jamaoncol.2021.0289
How can inappropriate Cox hazard ratios (HRs) in immune checkpoint inhibitor trials be converted to appropriate proportional hazards (PH) cure model treatment-effect estimates (HR for short-term survivors and difference in proportions [DP] for long-term survivors) to provide better guidance for clinical decision-making?
The proposed Cox-TEL (Cox PH–Taylor expansion adjustment for long-term survival data) method was applied to simulated data, which showed that Cox-TEL–converted values (defined as Cox-TEL HR and Cox-TEL DP) were close to PH cure model estimates. The accuracy of Cox-TEL was further verified in a real-world melanoma data set.
Cox HRs may lead to misinterpretation of drug efficacy in immune checkpoint inhibitor trials; the Cox-TEL method can convert inappropriate Cox HRs to appropriate treatment-effect estimates using published trial results as inputs, which may have enormous influence on clinical decision-making.
In immune checkpoint inhibitor (ICI) trials, long tails and crossovers in survival curves—which violate the proportional hazards (PH) assumption—are commonly observed, making cure or restricted mean survival time models preferable for analysis of ICI survival data. Cox PH analysis, however, still appears in major medical journals, leading to potential misinterpretation of clinical significance.
To convert inappropriate Cox hazard ratios (HRs) to appropriate PH cure model treatment-effect estimates (HR for short-term survivors and difference in proportions [DP] for long-term survivors) for more accurate interpretation of published ICI trials.
Design and Setting
This study uses the Taylor expansion technique to demonstrate the mathematical relationship between Cox PH and PH cure models for data with long-term survival, and based on this relationship, proposes the Cox-TEL (Cox PH–Taylor expansion adjustment for long-term survival data) adjustment method. The proposed Cox-TEL method requires only 2 inputs: the reported Cox HRs and Kaplan-Meier–estimated survival probabilities.
Comprehensive simulations show the strength of the proposed method in terms of power, bias, and type I error rate; these results, which are close to PH cure model estimates, were further verified in a melanoma data set (N = 285; Cox HR = 0.71; 95% CI, 0.51-0.91; Cox-TEL HR = 0.83; 95% CI, 0.60-1.07; PH cure HR = 0.86; 95% CI, 0.61-1.11; Cox-TEL DP = 0.10; 95% CI, 0.01-0.23; PH cure DP = 0.10; 95% CI, 0.00-0.21). The magnitude of potential difference between reported and adjusted HRs using real-world ICI trial results is demonstrated. For example, in the CheckMate 067 trial (nivolumab/ipilimumab combination therapy vs ipilimumab), the Cox HR was 0.54 (95% CI, 0.44-0.67), and the Cox-TEL HR was 0.90 (95% CI, 0.73-1.11).
Conclusions and Relevance
The findings of this study suggest the need to revisit published ICI survival data analysis to address potential misinterpretation. The Cox-TEL method not only is designed for this purpose, but also is user friendly and easy to implement using published clinical trial data and a freely available R software package.
The Kaplan-Meier (KM) estimator and the Cox proportional hazards (PH) model are standard methods for survival analysis in oncology drug development. Before the introduction of immune checkpoint inhibitor (ICI) therapy, which made long-term survival achievable among patients with advanced-stage cancers, these 2 methods together worked well for interpreting clinical trial outcomes. In ICI trials, however, long tails and crossovers in survival curves may violate the PH assumption, making Cox PH less appropriate for comparing ICIs with other therapies.
Several statistical models have been proposed to better analyze data with a long tail in the KM survival curve, for example, restricted mean survival time1-3 and cure models.4 The difference between study arms in restricted mean survival time (the mean survival time within a restricted window) provides an alternative to the hazard ratio (HR) when the PH assumption is violated. On the other hand, cure models, including the PH cure model,5-9 consider population survival as a mixture of patients without long-term survival (short-term survivors [STS]) and patients in the long-tail segment of the survival curve (long-term survivors [LTS]). Cure models not only consider survival probabilities among STS (ie, HR), but also evaluate and compare proportions of LTS between arms (ie, difference in proportions [DP]). Therefore, cure models seem ideal for analyzing ICI trial data but have not been widely adopted, resulting in a critical need for an adjustment method able to convert inappropriate Cox HRs to approximate PH cure model treatment-effect estimates (ie, HR and DP) for better interpretation of ICI trial results.
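As a rough illustration of the restricted mean survival time idea, the sketch below computes the area under a step-shaped survival curve up to a truncation time and takes the between-arm difference. The function name `rmst`, its arguments, and the toy curves are hypothetical; they are not taken from the study or from its R package.

```python
def rmst(drop_times, surv_after, tau):
    """Area under a right-continuous step survival curve from 0 to tau.

    drop_times: increasing times at which the curve steps down.
    surv_after: survival probability that holds from each drop time onward.
    The curve is assumed to start at 1.0 at time 0.
    """
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in zip(drop_times, surv_after):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)   # rectangle up to the next drop
        prev_t, prev_s = t, s
    area += prev_s * (tau - prev_t)     # final rectangle up to tau
    return area


# Toy (made-up) curves for two arms, truncated at tau = 4 time units.
toy_times = [1.0, 2.0, 3.0]
rmst_control = rmst(toy_times, [0.8, 0.5, 0.4], tau=4.0)
rmst_treatment = rmst(toy_times, [0.9, 0.7, 0.6], tau=4.0)
rmst_difference = rmst_treatment - rmst_control
```

The difference is expressed in time units, which is one reason it remains interpretable when hazards are not proportional.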
We propose an adjustment method, Cox-TEL (Cox PH–Taylor expansion adjustment for long-term survival data), which converts inappropriate Cox HRs to appropriate treatment-effect estimates based on the mathematical relationship between Cox PH and PH cure models. The only data required to perform the adjustment are Cox HRs with CIs and survival probabilities excerpted from KM curves, which are often made available in the published literature. The Vanderbilt University Medical Center Institutional Review Board determined that review was not required because the study does not qualify as human subject research per 45 CFR §46.102(e)(1). Informed consent was not needed because the study used publicly available, deidentified data from published articles.
The PH cure model assumes that the study population has 2 patient groups, namely, STS and LTS; STS eventually experience an event, while LTS do not. Based on this assumption, STS and LTS are evaluated separately, with HRs calculated for STS and DPs for LTS. If LTS are not observed in a study population, the PH cure model collapses to Cox PH. A long plateau in the KM curve suggests the existence of an LTS group; thus, visual assessment provides a simple but informal method for considering LTS. More formal methodology is described in eDiscussion in the Supplement.
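The mixture structure described above can be sketched in a few lines. This is a minimal illustration, assuming the standard mixture cure form (population survival = LTS proportion + STS proportion × STS survival) with a PH effect on STS survival. The exponential baseline, the rate value, and all function names are hypothetical choices for illustration, not the authors' implementation.

```python
import math


def sts_survival(t, hr=1.0, rate=0.5):
    """Survival among short-term survivors.

    Assumes a hypothetical exponential baseline; under proportional hazards
    the treated-arm curve is the baseline curve raised to the power hr.
    """
    baseline = math.exp(-rate * t)
    return baseline ** hr


def population_survival(t, lts_proportion, hr=1.0, rate=0.5):
    """Mixture of long-term survivors (never fail) and short-term survivors."""
    return lts_proportion + (1.0 - lts_proportion) * sts_survival(t, hr, rate)


# With an LTS proportion of 0.25, the curve flattens toward a plateau at 0.25.
late_value = population_survival(50.0, lts_proportion=0.25)

# With no LTS, the mixture reduces to the ordinary PH survival curve.
no_lts_value = population_survival(2.0, lts_proportion=0.0, hr=0.8)
```

In this form the plateau height of the population curve equals the LTS proportion, which is why a long flat tail in a KM curve hints at an LTS group.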
The proposed Cox-TEL method generates an adjustment factor, determined by Taylor polynomials, to convert Cox HR to Cox-TEL HR and DP, with corresponding CIs; these metrics are approximations of PH cure HR and DP, respectively. Figure 1 illustrates the Cox-TEL schema, with computation details in eMethods in the Supplement. The statistical software package described in this article was developed in R, version 3.6.1 (R Foundation).
We used simulated data and real-world data from published ICI trials10-12 to evaluate the proposed method. In 4 simulation scenarios (eFigure 1 and eTable 1 with details in the eSimulations in the Supplement), Cox-TEL HRs and DPs approximate those computed with the PH cure model (eFigures 2 through 7 and eTables 2 and 3 in the Supplement). The following are real-world data illustrations.
To examine performance of Cox-TEL in real-world data, we first considered relapse-free survival data13 from 285 patients with surgically resected melanoma randomized to treatment with adjuvant high-dose interferon alfa-2b or best supportive care (Eastern Cooperative Oncology Group trial EST 1684).10 The KM survival curve suggested better relapse-free survival in the treatment arm with a Cox HR of 0.71 (95% CI, 0.51-0.91) within an 8-year follow-up period (Figure 2). However, the long survival curve tails suggested violation of the PH assumption.
To address this issue, we applied the Cox-TEL adjustment with the following survival probabilities excerpted from KM survival curves: at t = 2, 4, 6, 8 (years), control arm, 0.36, 0.28, 0.26, 0.25; treatment arm, 0.48, 0.39, 0.35, 0.35; with height of plateau, 0.25 and 0.35, respectively. Details for time point (t′js) selection are included in eDiscussion in the Supplement.
The Cox-TEL HR for STS was 0.83 (95% CI, 0.60-1.07), and DP for LTS was 0.10 (95% CI, 0.01-0.23). The 95% CI of the Cox-TEL HR crossed 1; thus, the STS survival difference between arms would not be considered statistically significant. On the other hand, interferon alfa-2b treatment showed a higher proportion of LTS (35%) compared with control (25%). To assess performance of Cox-TEL on these real-world data, we applied PH cure (smcure13), which computed an STS HR of 0.86 (95% CI, 0.61-1.11) and an LTS DP of 0.10 (95% CI, 0.00-0.21), estimates very close to the Cox-TEL results, suggesting the reliability of the proposed method.
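The reported numbers can be cross-checked with simple arithmetic. The sketch below only restates values given in the text (plateau heights, Cox-TEL estimates, and PH cure estimates); reading the DP point estimate as the difference in plateau heights is an assumption that happens to be consistent with the reported 0.10, not a description of the Cox-TEL algorithm itself. Variable names are hypothetical.

```python
# Plateau heights excerpted from the KM curves (as reported in the text).
plateau_control = 0.25
plateau_treatment = 0.35

# Assumption: the DP point estimate corresponds to the plateau difference.
dp_from_plateaus = round(plateau_treatment - plateau_control, 2)

# Reported point estimates for the melanoma data set.
cox_tel = {"hr": 0.83, "dp": 0.10}
ph_cure = {"hr": 0.86, "dp": 0.10}

# Absolute gaps between Cox-TEL and the PH cure model fit on raw data.
hr_gap = round(abs(cox_tel["hr"] - ph_cure["hr"]), 2)
dp_gap = round(abs(cox_tel["dp"] - ph_cure["dp"]), 2)
```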
For real-world performance as well as illustration of potential misinterpretation with Cox HR in ICI studies, we next considered CheckMate 017/05711 and CheckMate 06712 findings before and after Cox-TEL adjustment (Table). In CheckMate 017/057,11 the Cox HR for overall survival was 0.70 (95% CI, 0.61-0.81), indicating superiority of nivolumab over docetaxel. In the KM curves, however, an early crossover was seen, suggesting that some patients did not benefit from nivolumab treatment. Consistent with this visual assessment, we computed a Cox-TEL HR of 0.89 (95% CI, 0.77-1.03) and DP of 0.10 (95% CI, 0.05-0.15) (Table), findings that indicated no survival advantage in STS and a small (though statistically significant) effect size in LTS.
The CheckMate 067 trial12 compared nivolumab/ipilimumab combination therapy vs ipilimumab, or nivolumab alone vs ipilimumab, in advanced melanoma. Cox HRs were 0.54 (95% CI, 0.44-0.67) and 0.65 (95% CI, 0.53-0.79), respectively. These results suggested that either combination therapy or nivolumab alone is superior to ipilimumab alone. After adjustment, however, neither nivolumab (Cox-TEL HR = 0.94; 95% CI, 0.77-1.15) nor nivolumab/ipilimumab combination (Cox-TEL HR = 0.90; 95% CI, 0.73-1.11) showed superiority to ipilimumab monotherapy for STS. For LTS, Cox-TEL DP was larger in the combination setting (0.25; 95% CI, 0.15-0.35) than with monotherapy (0.19; 95% CI, 0.09-0.29).
In both CheckMate data sets,11,12 the larger the magnitude of Cox-TEL DP, the smaller the Cox HR. This inverse association may reflect contribution of this specific LTS population to Cox HR, which in turn leads to result misinterpretation.
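The pattern described above can be checked directly from the reported estimates. The sketch below tabulates the three CheckMate comparisons using only values quoted in the text, flags whether each 95% CI includes the null value of 1, and verifies that Cox HR falls as Cox-TEL DP rises. The data structure and helper names are hypothetical.

```python
comparisons = [
    {"label": "CheckMate 017/057: nivolumab vs docetaxel",
     "cox_hr": 0.70, "cox_ci": (0.61, 0.81),
     "tel_hr": 0.89, "tel_ci": (0.77, 1.03), "dp": 0.10},
    {"label": "CheckMate 067: nivolumab vs ipilimumab",
     "cox_hr": 0.65, "cox_ci": (0.53, 0.79),
     "tel_hr": 0.94, "tel_ci": (0.77, 1.15), "dp": 0.19},
    {"label": "CheckMate 067: combination vs ipilimumab",
     "cox_hr": 0.54, "cox_ci": (0.44, 0.67),
     "tel_hr": 0.90, "tel_ci": (0.73, 1.11), "dp": 0.25},
]


def ci_includes(ci, null_value=1.0):
    """True when the interval contains the null value."""
    low, high = ci
    return low <= null_value <= high


# Before adjustment every Cox CI excludes 1; after adjustment every CI includes 1.
cox_significant = [not ci_includes(c["cox_ci"]) for c in comparisons]
tel_significant = [not ci_includes(c["tel_ci"]) for c in comparisons]

# Inverse association: order by DP and confirm Cox HR strictly decreases.
ordered = sorted(comparisons, key=lambda c: c["dp"])
cox_hrs = [c["cox_hr"] for c in ordered]
inverse_association = all(a > b for a, b in zip(cox_hrs, cox_hrs[1:]))

for c in comparisons:
    print(f'{c["label"]}: Cox HR {c["cox_hr"]:.2f}, '
          f'Cox-TEL HR {c["tel_hr"]:.2f}, DP {c["dp"]:.2f}')
```

With only three comparisons this is a descriptive check rather than evidence of a general relationship.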
This study proposes a useful adjustment method, Cox-TEL, which converts inappropriate Cox HRs to Cox-TEL HRs and Cox-TEL DPs—treatment-effect estimates that approximate the PH cure model and are better suited to interpret ICI trial results. When LTS proportion exceeds 0.5, Cox-TEL adjustment may produce biased approximations; moreover, when working with raw data, PH cure models may be considered before Cox-TEL. Without raw data, however, we provide a simple way to check the PH assumption for STS survival (see eDiscussion in the Supplement). Although trials powered for Cox PH can be underpowered for the PH cure model,14 and therefore for Cox-TEL adjustment, the proposed adjustment method nevertheless offers insight into appropriate model selection and data interpretation for clinicians, filling an important gap for oncologists before cure models are adopted in ICI study design and data analysis.
Performance of Cox-TEL depends on the proportion of LTS. Cox-TEL HR gradually underestimates true HR when this proportion increases, and this unsatisfactory result further affects the CI estimates for HR and DP (eTable 4 in the Supplement). Large bias results when (1) Cox HR is far from the true HR, particularly when due to a large proportion of LTS, or (2) use of low-order Taylor polynomials creates poor approximation. Using higher-order Taylor polynomials can reduce bias of Cox-TEL HR relative to true HR, and the algorithm provides automatic selection of polynomial order to minimize bias. See eDiscussion in the Supplement for details.
As presented in this study, when long-tail segments are observed in KM survival curves (see eTable 5 in the Supplement for judging long tails), the data structure violates the PH assumption; thus, a cure model may be a better method for data analysis. Cox-TEL converts Cox HRs to approximate PH cure model treatment-effect estimates using a ready-to-use R package and 2 required inputs: survival probabilities excerpted from KM curves and Cox HRs. Information obtained from Cox-TEL adjustment gives oncologists an opportunity to rethink ICI trial results, which may have clinical influence before cure models are widely used in ICI trial analyses.
Accepted for Publication: January 11, 2021.
Published Online: April 15, 2021. doi:10.1001/jamaoncol.2021.0289
Corresponding Author: Yu Shyr, PhD, Department of Biostatistics, Vanderbilt University Medical Center, 2525 W End Ave, Ste 1100, Nashville, TN 37203 (firstname.lastname@example.org).
Author Contributions: Dr Shyr had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: All authors.
Acquisition, analysis, or interpretation of data: Lin, Hsu.
Drafting of the manuscript: Lin, Hsu.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Shyr, Hsu.
Obtained funding: Shyr.
Administrative, technical, or material support: Shyr, Hsu.
Conflict of Interest Disclosures: Dr Shyr reported receiving grants from the National Institutes of Health/National Cancer Institute during the conduct of the study and grants from the National Institutes of Health outside the submitted work. Dr Lin reported receiving research grants from the Ministry of Science and Technology in Taiwan and the National Health Research Institute in Taiwan during the conduct of the study. No other disclosures were reported.
Funding/Support: This work was supported by the National Institutes of Health (P30CA068485, U24CA163056, U24CA213274, P50CA236733, P50CA098131, U54CA163072), Ministry of Science and Technology Taiwan (MOST107-2314-B-002-231, MOST108-2314-B-002-197-MY2), and National Health Research Institute Taiwan (NHRI-EX109-10937BC).
Role of the Funder/Sponsor: The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Disclaimer: Dr Shyr is Associate Editor for Statistics of JAMA Oncology but was not involved in any of the decisions regarding review of the manuscript or its acceptance.
Additional Contributions: We thank Lynne Berry, PhD (Center for Quantitative Sciences, Department of Biostatistics, Vanderbilt University Medical Center), for critical review and editing of the manuscript. She was not compensated for this work.
Additional Information: The Cox-TEL adjustment method is built on the mathematical foundation of the Taylor expansion. The method can be implemented through an R package (available from the corresponding author on request), with published trial data as inputs.