eTable. Primary and Secondary Analysis Excluding and Including the 2 RCTs That Tested PD-1 Inhibitors in the Post-ipilimumab Setting
eFigure 1. Selection of Randomized Controlled Trials (RCTs) Included in the Analysis
eFigure 2. Correlation Between Median Overall Survival (OS) and Median Progression-Free Survival (PFS) in the RCTs of PD-1 Inhibitors
eFigure 3. Correlation Between Gain in Median Overall Survival and Gain in Median Progression-Free Survival in the RCTs of PD-1 Inhibitors
eFigure 4. Correlation Between Hazard Ratio (HR) of OS and HR of PFS in the RCTs of PD-1 Inhibitors
Gyawali B, Hey SP, Kesselheim AS. A Comparison of Response Patterns for Progression-Free Survival and Overall Survival Following Treatment for Cancer With PD-1 Inhibitors: A Meta-analysis of Correlation and Differences in Effect Sizes. JAMA Netw Open. 2018;1(2):e180416. doi:10.1001/jamanetworkopen.2018.0416
Does the treatment effect size differ between overall survival and progression-free survival for PD-1 (programmed cell death 1) inhibitors used in patients with advanced solid tumors, and are overall survival and progression-free survival correlated?
This meta-analysis of 12 randomized clinical trials found no significant correlation between overall survival and progression-free survival in terms of medians and gains in medians, but their hazard ratios were significantly correlated. The protective effects of treatment were greater for overall survival than for progression-free survival.
Progression-free survival cannot adequately capture the benefit of PD-1 inhibitors; thus, overall survival should remain the gold standard end point for trials of PD-1 inhibitors.
Based on efficacy results from pivotal randomized clinical trials, PD-1 (programmed cell death 1) inhibitors, such as nivolumab and pembrolizumab, have been approved to treat various cancers. Response patterns with varying effects on progression-free survival (PFS) and overall survival (OS) have been reported for these drugs.
To compare 2 outcomes for PD-1 inhibitors: the correlation between PFS and OS and the differences in treatment effect size between PFS and OS.
A systematic search of PubMed, Google Scholar, the Cochrane Library, Web of Science, and conference abstracts for randomized clinical trials of nivolumab and pembrolizumab published in English.
Randomized clinical trials of nivolumab or pembrolizumab in adults with solid-tissue cancers with a nonimmunotherapy control.
Data Extraction and Synthesis
Two reviewers screened the studies for selection and extracted data on medians and hazard ratios (HRs) for PFS and OS. A pooled meta-analysis was conducted.
Main Outcomes and Measures
Across all trials, correlation coefficients between median PFS and median OS and between PFS benefit and OS benefit as well as the HRs of PFS and OS were assessed. The difference in treatment effect sizes between PFS and OS was assessed using a ratio of HRs (rHR). Subgroup analyses were conducted to observe differences based on drug, tumor type, and timing of therapy.
Ten randomized clinical trials that included 4653 patients and met inclusion criteria were identified, as were 2 others (comprising 764 patients) in which nivolumab or pembrolizumab was used following treatment with ipilimumab. The correlations between median PFS and median OS (r = 0.676; R² = 0.457; P = .09) and between the change in PFS and the change in OS (r = 0.474; R² = 0.225; P = .28) were not significant. However, the correlation between HRs of PFS and OS was significant (r = 0.637; R² = 0.406; P = .048). Using random-effects meta-analysis, the protective effects of treatment were greater for OS than for PFS (pooled rHR, 1.18; 95% CI, 1.06-1.31; P = .002). There was no statistical evidence for heterogeneity across the studies (Q = 6.24; P = .72; I² = 0%). Subgroup analyses showed some differences in the treatment effect sizes based on drug type, tumor type, and line of therapy.
Conclusions and Relevance
There was no significant correlation between OS and PFS in terms of medians and gains in medians, but their HRs were significantly correlated. The protective effects of treatment were greater for OS than for PFS. Traditional Response Evaluation Criteria in Solid Tumors–based PFS cannot capture the benefit of PD-1 inhibitors in patients with solid tumors, and OS should remain the gold standard.
The most important clinical outcome that can be observed among new cancer drugs is an improvement in overall survival (OS) when compared with current therapies. However, improvements in OS can take time to recognize and can be contaminated by crossover or the effects of postprogression therapies. As a result, progression-free survival (PFS) is often used as a surrogate for OS. But PFS has also been criticized as unreliable in some circumstances, as progression is defined as an increase in tumor size beyond an arbitrary cutoff and is prone to bias, particularly when the investigators are not blinded.1
When PFS strongly correlates with OS, PFS can be a useful and valid surrogate measure for evaluating a new therapy’s clinical effectiveness. However, this correlation has been shown to vary across treatment settings. For example, a 2015 systematic review showed that for most tumor types, there was only a weak correlation between anticancer drug–related changes in PFS and OS.2 A recent meta-analysis of targeted anticancer therapies showed that the drugs had a greater effect on PFS than OS, but that PFS benefits often did not translate to OS benefits.3
Two PD-1 (programmed cell death 1) inhibitor antibodies, nivolumab (Opdivo) and pembrolizumab (Keytruda), have been approved by the US Food and Drug Administration (FDA) based on efficacy in treating certain types of solid tumors, including advanced melanoma, lung cancer, renal cell cancer, urothelial cancer, Hodgkin-type lymphoma, head and neck cancer, and hepatocellular cancer. These PD-1 inhibitors show unconventional patterns of response, including long duration of responses, responses after initial progression (pseudoprogression), and even responses after discontinuation of therapy.4,5 This atypical response pattern is also observed in the crossing over of PFS curves in some randomized clinical trials (RCTs) of PD-1 inhibitors.6
The correlation between PFS and OS has not yet been formally studied across PD-1 inhibitors. To help guide future PD-1 research, we conducted a correlation and meta-analytic study of RCTs with PD-1 inhibitors in adult patients with solid tumors and evaluated the differences in treatment effect sizes between PFS and OS.
We conducted a systematic search of PubMed, the Cochrane Library, Web of Science, and Google Scholar for all RCTs of nivolumab and pembrolizumab in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) reporting guideline. We used the search terms “nivolumab” or “opdivo” or “pembrolizumab” or “keytruda” or “pd-1” and limited our search to RCTs and findings published in English. The first search was performed on May 8, 2017, and updated on July 4, 2017. Relevant conference abstracts were also searched for updated data. After title and abstract screening by 2 independent investigators, the full texts of potentially relevant studies were downloaded and reviewed for the following exclusion criteria: (1) not an RCT, (2) not reporting data for PFS and OS, and (3) not reporting original data. Because the aim of our study was to evaluate the correlation between PFS and OS and the difference in treatment effect sizes between PFS and OS in solid tumors in adults, we also excluded studies involving pediatric patients or patients with hematological malignancies, RCTs comparing combinations of immunotherapies (eg, nivolumab plus ipilimumab), and RCTs that involved immunotherapy control groups (eg, ipilimumab control) (eFigure 1 in the Supplement).
Although we retained RCTs evaluating nivolumab and pembrolizumab as any line of therapy, for our primary analysis we excluded trials that tested these PD-1 inhibitors in populations that had already received another checkpoint inhibitor, such as CTLA-4 (cytotoxic T-lymphocyte–associated protein 4) or PD-L1 (programmed cell death 1 ligand 1) inhibitors, because the effect on PFS or OS could be a residual effect from the previous therapy. These studies were included as a secondary analysis.
This study was not submitted for institutional review board approval because it did not involve individual patient information and all data extractions were made from publicly available published articles. Data were independently extracted from published reports by 2 of us (B.G. and S.P.H.), with any discrepancy resolved through consensus of all authors. We collected key trial characteristics: treatment setting, primary end point, sample size, and details of the treatment and control regimens. For trial outcomes, we extracted the median PFS, median OS, and hazard ratios (HRs) with confidence intervals for PFS and OS.
We graded the quality of each trial using the 5-point Jadad scale for rating RCTs,7 with scores of 4 and 5 determined a priori to represent high-quality trials.
The primary end point of this study was the correlation and difference in treatment effect sizes between PFS and OS. We defined PFS as the time from randomization to first documented tumor progression or death from any cause and OS as the time from randomization to the date of death due to any cause. We used the Response Evaluation Criteria in Solid Tumors (RECIST), in which progression is defined either as an increase of at least 20% in the sum of diameters of target lesions (taking as reference the smallest sum on study) together with an absolute increase of at least 5 mm, or as the appearance of any new lesions.
The correlation between median PFS and OS was assessed using the Pearson correlation coefficient. The PFS benefit was defined as median PFS of the PD-1 inhibitor group minus that of the control group, while OS benefit was defined as median OS of the PD-1 inhibitor group minus that of the control group. The correlation between the PFS benefit and the OS benefit and that between HR of PFS and HR of OS were also assessed using the Pearson correlation coefficient. All correlation studies were performed using SPSS statistical software version 22.0 (IBM) and Stata statistical software version 15 (StataCorp), and pooled meta-analyses were performed using R statistical software version 3.2.3 (R Project for Statistical Computing). We also conducted linear regression analysis to quantify the benefit in OS for a given magnitude of benefit in PFS whenever the correlation was significant.
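The benefit definitions and the Pearson coefficient described above can be sketched in a few lines of Python. This is a minimal illustration, not the SPSS or Stata procedure the authors ran; the function names and the median values are hypothetical and are not taken from any of the included trials.

```python
from math import sqrt


def median_gain(treatment_median, control_median):
    """Benefit as defined in the text: treatment-arm median minus control-arm median."""
    return treatment_median - control_median


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical (treatment, control) medians in months, for illustration only.
pfs_medians = [(4.0, 3.0), (2.5, 3.5), (5.0, 4.0), (3.0, 3.0)]
os_medians = [(12.0, 9.0), (10.0, 8.0), (17.0, 13.0), (9.0, 7.0)]

pfs_gain = [median_gain(t, c) for t, c in pfs_medians]
os_gain = [median_gain(t, c) for t, c in os_medians]

r = pearson_r(pfs_gain, os_gain)
r_squared = r ** 2  # share of variance in OS gain explained by PFS gain
print(f"r = {r:.3f}, R^2 = {r_squared:.3f}")
```

Trials without a reached median OS would have to be dropped before this step, since the gain is undefined for them.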
To study the difference in treatment effect sizes between PFS and OS, we used the ratio of HRs (rHR).3 The rHR is defined as the ratio of the HR of PFS to the HR of OS. The summary rHR across the studies was obtained by pooling the individual rHRs of each study using a random-effects model to account for the heterogeneous group of patient populations. An rHR of less than 1 would indicate that the treatment effect of PD-1 inhibitors was larger for PFS than for OS (the treatment improves PFS more than it improves OS), while an rHR greater than 1 would indicate that the treatment effect was larger for OS than for PFS.
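A per-trial rHR can be sketched as follows. The text does not say how the variance of each rHR was obtained, so this sketch makes two labeled assumptions: standard errors are recovered from the reported 95% CIs on the log scale, and the PFS and OS estimates are treated as independent, which ignores their within-trial correlation and probably overstates the variance. The function names and the example values are hypothetical.

```python
from math import exp, log, sqrt

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def se_log_hr(lower, upper):
    """Standard error of log(HR) recovered from a reported 95% CI."""
    return (log(upper) - log(lower)) / (2 * Z_95)


def ratio_of_hrs(hr_pfs, ci_pfs, hr_os, ci_os):
    """Return (rHR, lower, upper), with rHR = HR_PFS / HR_OS.

    ASSUMPTION: log(HR_PFS) and log(HR_OS) are treated as independent,
    so their variances simply add. The source does not confirm this.
    """
    log_rhr = log(hr_pfs) - log(hr_os)
    se = sqrt(se_log_hr(*ci_pfs) ** 2 + se_log_hr(*ci_os) ** 2)
    return exp(log_rhr), exp(log_rhr - Z_95 * se), exp(log_rhr + Z_95 * se)


# Hypothetical trial: modest PFS effect, larger OS effect.
rhr, lower, upper = ratio_of_hrs(0.90, (0.75, 1.08), 0.70, (0.56, 0.88))
print(f"rHR = {rhr:.2f} (95% CI, {lower:.2f}-{upper:.2f})")
# rHR > 1 here because the OS hazard ratio is further below 1 than the PFS one.
```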
Heterogeneity among studies was assessed using the Cochran Q statistic (the assumption of homogeneity was considered invalid for values of P < .10) and quantified using the I² statistic. Subgroup analyses were prespecified and included drug (nivolumab or pembrolizumab), tumor type, and line of use (first line vs others). Two-sided P < .05 was the threshold for statistical significance.
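The pooling and heterogeneity steps can be illustrated with a short sketch. The text specifies a random-effects model but not the estimator; the DerSimonian-Laird method is assumed here because it is the most common default, and the function name and inputs are hypothetical.

```python
from math import exp, sqrt

Z_95 = 1.959964


def dersimonian_laird(log_effects, std_errors):
    """Random-effects pooling of log ratios (ASSUMED estimator: DerSimonian-Laird).

    Returns the pooled ratio with its 95% CI, Cochran Q, degrees of freedom,
    the between-study variance tau^2, and I^2 as a percentage.
    """
    k = len(log_effects)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least 2 studies with matching standard errors")
    w = [1.0 / se ** 2 for se in std_errors]  # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_effects)) / sum(w_re)
    se_pooled = sqrt(1.0 / sum(w_re))
    return {
        "pooled": exp(pooled),
        "lower": exp(pooled - Z_95 * se_pooled),
        "upper": exp(pooled + Z_95 * se_pooled),
        "q": q,
        "df": df,
        "tau2": tau2,
        "i2": i2,
    }
```

When Q is smaller than its degrees of freedom, both tau² and I² are truncated at 0 and the random-effects result coincides with the fixed-effect one, which is the situation an I² of 0% describes.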
Our search revealed 1825 potentially relevant reports, of which 12 trials8-21 fulfilled our eligibility criteria (eFigure 1 in the Supplement). Of these 12, 2 tested nivolumab or pembrolizumab after treatment with ipilimumab, and because of the potential for residual effects of ipilimumab, these 2 trials were not considered for primary analysis but were included as a secondary analysis.
Table 1 presents the basic characteristics of our main sample, which included 6 trials of nivolumab8-13 and 4 trials of pembrolizumab.14-17 One RCT was a phase 2 trial (Keynote 021)17 and the rest were phase 3. Four studies were conducted with the immunotherapy as first-line treatment and 6 with the immunotherapy as second-line treatment after chemotherapy or targeted therapy. Nine studies tested a PD-1 inhibitor as a single agent, while the Keynote 021 study tested pembrolizumab plus chemotherapy vs chemotherapy alone.17 Non–small cell lung cancer was the most common tumor type studied (6 RCTs [60% of the cohort]). There was 1 RCT each in melanoma, head and neck cancer, renal cell cancer, and urothelial cancer.
Nivolumab was tested at a dose of 3 mg/kg every 2 weeks in each trial. Pembrolizumab was tested at either 2 mg/kg or 10 mg/kg or a fixed dose of 200 mg every 3 weeks. Keynote 010 and Keynote 002 trials used 2 different pembrolizumab groups of 2 mg/kg and 10 mg/kg. However, because the current FDA-approved dose of 200 mg is closer to 2 mg/kg than 10 mg/kg, we used data for the 2-mg/kg group in our study.
Traditional RECIST 1.1 criteria were used to define progression in all the included trials. The primary end points were OS in 5 RCTs (all involving nivolumab), PFS in 2 RCTs (1 nivolumab and 1 pembrolizumab), and both OS and PFS as coprimary end points in 2 RCTs of pembrolizumab (Table 1). The 1 phase 2 pembrolizumab trial, Keynote 021, used a primary end point of objective response rate.17 Because this trial was the only phase 2 trial, the only one with response rate as an end point, and the only trial in which a PD-1 inhibitor was not tested as a single agent, we performed all rHR analyses with and without including this trial.
Most of the studies in our sample had Jadad scores of only 2 or 3 (Table 1). Only 1 study was double blind (Checkmate 066 [Jadad score 4]), while the others were open label. Trials with low Jadad scores often did not describe the methods of randomization adequately. All the studies that had PFS or response rate as their primary or coprimary end points, which require some measure of subjective judgment to assess tumor growth, were open label. All the studies reported data on median PFS and HRs of PFS and OS; however, the median OS was not yet reached for 3 trials (Table 2).
In the 10 RCTs included in the primary meta-analysis, 4653 participants were randomized (PD-1 cohort: 2387; control: 2266). The majority of participants (2995 [64%]) were from nivolumab trials. The secondary analysis included 5417 participants from 12 RCTs (PD-1 cohort: 2839; control cohort: 2578). All trials restricted enrollment to patients with an Eastern Cooperative Oncology Group performance status score of 0 or 1 (on a scale of 0-5, with 0 indicating a patient is fully active and able to carry on activity without restriction and 5 indicating death), except for Keynote 045,14 which enrolled patients with a performance status of up to 2, and Checkmate 025,12 which enrolled patients with Karnofsky performance status score of 70 or above (on a scale of 0-100, with 0 indicating death and 100 indicating patient has normal activity with no symptoms).
Data for median PFS were available for all studies. Data for median OS were not available for 3 studies (Table 2). Thus, the correlation between median PFS and median OS could only be obtained from 7 RCTs. The correlation between median OS and median PFS was not significant (n = 7; r = 0.676; R² = 0.457; P = .09) (eFigure 2 in the Supplement). The gain in OS also did not correlate with the gain in PFS (n = 7; r = 0.474; R² = 0.225; P = .28) (eFigure 3 in the Supplement). However, there was significant correlation between the HRs of PFS and OS (n = 10; r = 0.637; R² = 0.406; P = .048) (eFigure 4 in the Supplement).
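As a rough consistency check, assuming the P values come from the usual two-sided t test for a Pearson coefficient with n − 2 degrees of freedom (the text does not state which test was run), the reported r and n values give the following test statistics:

```latex
\begin{align*}
t &= \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}, \qquad \mathrm{df} = n-2 \\[4pt]
\text{median PFS vs median OS } (n=7,\ r=0.676):\quad
  t &= \frac{0.676\sqrt{5}}{\sqrt{1-0.457}} \approx 2.05 \\[4pt]
\text{gain in PFS vs gain in OS } (n=7,\ r=0.474):\quad
  t &= \frac{0.474\sqrt{5}}{\sqrt{1-0.225}} \approx 1.20 \\[4pt]
\text{HR of PFS vs HR of OS } (n=10,\ r=0.637):\quad
  t &= \frac{0.637\sqrt{8}}{\sqrt{1-0.406}} \approx 2.34
\end{align*}
```

The third statistic, with 8 degrees of freedom, sits just above the two-sided 5% critical value of about 2.31, in line with the reported P = .048. The first two fall below the critical value of about 2.57 for 5 degrees of freedom, which shows how much the smaller sample of 7 trials limits power even when r is numerically larger.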
All the trials in our sample reported HRs for both PFS and OS. The HRs for OS were statistically significant for 8 trials (80%), whereas the HRs for PFS were statistically significant in only 4 trials (40%) (Table 2). In 5 trials (50%), there was benefit seen for OS without any PFS benefit. In only 1 trial, PFS benefit occurred without any benefit in OS. This was the Keynote 021 trial.17
Among the 10 trials in the primary analysis, the treatment effect sizes were greater for OS than for PFS (pooled rHR, 1.18; 95% CI, 1.06-1.31; P = .002) (Figure). Because all the HRs for OS and the HRs for PFS in the individual trials were on the same side of null, a pooled rHR of 1.18 means the PD-1 inhibitors have a protective effect (improvement) on OS and that the HRs for PFS were on average 18% higher than the HRs for OS. The included studies showed no statistical heterogeneity (Q = 6.24; P = .72; I² = 0%). When excluding the Keynote 021 study, the rHR was 1.19 (95% CI, 1.07-1.32; P = .001).
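The reported P value can be checked against the reported interval with a short sketch. It assumes a Wald-type interval that is symmetric on the log scale and uses only the rounded figures printed above, so it is an approximation rather than a reproduction of the authors' computation; the function name is hypothetical.

```python
from math import erfc, log, sqrt


def p_from_ratio_ci(estimate, lower, upper, z_crit=1.959964):
    """Approximate z statistic and two-sided P value for a ratio, given its 95% CI.

    ASSUMPTION: the CI is a Wald interval on the log scale.
    """
    se = (log(upper) - log(lower)) / (2 * z_crit)
    z = log(estimate) / se
    p = erfc(abs(z) / sqrt(2))  # two-sided tail area of the standard normal
    return z, p


z, p = p_from_ratio_ci(1.18, 1.06, 1.31)
print(f"z = {z:.2f}, P = {p:.4f}")  # should land close to the reported P = .002

z_excl, p_excl = p_from_ratio_ci(1.19, 1.07, 1.32)  # analysis excluding Keynote 021
print(f"z = {z_excl:.2f}, P = {p_excl:.4f}")
```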
The treatment effect sizes were greater for OS than for PFS for both nivolumab and pembrolizumab, but missed statistical significance for pembrolizumab (nivolumab rHR, 1.18; 95% CI, 1.03-1.34; P = .01 vs pembrolizumab rHR, 1.18; 95% CI, 0.98-1.41; P = .07). There was no heterogeneity among the nivolumab studies (I² = 0%) but some heterogeneity among the pembrolizumab studies (I² = 39.6%). However, when the outlier pembrolizumab trial was excluded, the effect size on OS was greater than for PFS for pembrolizumab (rHR, 1.21; 95% CI, 1.01-1.45).
The treatment effect sizes for PFS and OS were similar in the non–small cell lung cancer trials (rHR, 1.14; 95% CI, 0.99-1.31), whereas for the other cancer types, there was a greater effect on OS than on PFS (rHR, 1.23; 95% CI, 1.05-1.44). The observed treatment effect was also greater on OS than on PFS for trials conducted using the drug as a second-line or later treatment (rHR, 1.24; 95% CI, 1.10-1.40), but not for those trials conducted in the first-line setting (rHR, 0.99; 95% CI, 0.79-1.24) (Table 3).
The 2 RCTs that were conducted in patients who had already been treated with ipilimumab were Keynote 002, which was a phase 2 trial of pembrolizumab,18,19 and Checkmate 037,20 which was a phase 3 trial of nivolumab. Both were conducted in advanced melanoma. As secondary analysis, we also tested the rHR by including these 2 RCTs and found that the effect sizes for PFS and OS were similar for the combined analysis of these 2 trials as well as in the overall population when they were included (eTable in the Supplement).
In this study, which to our knowledge is the first to evaluate the correlation and differences in treatment effect sizes between PFS and OS in PD-1 inhibitor trials, we found that OS was poorly correlated with PFS with respect to both medians and absolute gains. However, unlike with targeted agents or chemotherapies, this was not because improvement in PFS benefit did not translate to OS. Rather, OS benefits were observed without any apparent benefit in PFS. Indeed, the HRs of PFS were on average greater than the HRs of OS by as much as 18%. Also, the correlation between PFS and OS in terms of HRs was significant.
The lack of correlation between the medians and absolute difference in medians occurring with a significant correlation between the HRs of OS and PFS is not surprising for 2 reasons. First, the correlation between the medians was based on only 7 trials because the median OS was unavailable for 3 RCTs, but the HR correlation was calculated for 10 trials, increasing the power. Second, medians are not a good marker of efficacy for immuno-oncology trials, and the HR should capture the benefit better than the median.22,23 The correlation between PFS and OS should become clearer as more trials are published and a larger sample can be analyzed.
Several hypotheses might explain our findings of greater benefits in OS than PFS with PD-1 inhibitors. First, PFS in all these trials was defined using the traditional RECIST criteria, which were developed in the era before immunotherapy. It has been reported that traditional RECIST criteria fail to properly capture the concept of disease progression with immunotherapies that have atypical response patterns.6 Although immunotherapy-specific RECIST criteria have been proposed, they have not been used in trials yet.24 Failure of traditional RECIST criteria to define PFS of immunotherapies might be 1 reason for smaller benefits in PFS vs OS with the trials of PD-1 inhibitors.25
Second, because PD-1 inhibitors have residual efficacy for a longer duration, these drugs could affect OS more than PFS even after the discontinuation of treatment.4,5 Some patients experience a very durable response with immunotherapies, and thus the benefit in OS seen in the overall population could primarily be driven by extraordinary benefit in a select few patients. However, the tail effect that has been widely reported with ipilimumab has yet to be observed with PD-1 inhibitors.26
A third explanation for our findings might be pseudoprogression,27 in which the tumor first grows in size due to T-cell infiltrate before undergoing shrinkage. This phenomenon might lead the investigators to consider the response progressive disease under RECIST criteria when, in fact, the patient could respond later, ultimately leading to benefit in OS.
The subgroup analyses in our study suggested that the effect on OS vs PFS was greater for nivolumab than pembrolizumab, greater for tumor types other than non–small cell lung cancer, and greater when used in second or later lines vs first line of therapy. This is in keeping with the past observations that the crossing-over of PFS curves has been seen in nivolumab trials but not pembrolizumab, and that the phenomenon of pseudoprogression is not as common in non–small cell lung cancer as in other tumor types.28 However, these analyses should be considered hypothesis generating, rather than confirmatory, because of the small sample sizes and the lack of heterogeneity among the trials.
This study has limitations. First, it involves only RCTs with PD-1 inhibitors and is not generalizable to other checkpoint inhibitors. Second, when 2 studies conducted in patients previously treated with ipilimumab were included, the difference in treatment effect sizes between PFS and OS was no longer statistically significant. Third, most studies included in this analysis used PD-1 inhibitors as a single agent; the result when these drugs are used in combination remains unknown. Fourth, medians are not always considered an appropriate metric for assessing correlation between PFS and OS with immunotherapy drugs, and alternative metrics have been proposed but not yet adopted in trials.22,23 Furthermore, the proportion of patients responding to and receiving benefit from PD-1 inhibitors differs by tumor type and, thus, this trial-level analysis may not hold true for the individual patient. Another limitation inherent in all RCTs of immunotherapy drugs is the use of immunotherapies after progression, which can confound OS. These data were available for 5 studies, of which 4 reported significant benefit in OS. Thus, the receipt of immunotherapies after progression does not seem to affect this analysis.
Most previous analyses of the correlation between OS and PFS have focused on tumor types.2 In the case of immunotherapies, the correlation between OS and surrogate outcomes might reasonably be considered a function of the drug class rather than the tumor biology. For PD-1 inhibitors, we found that OS is not correlated with PFS measured by traditional RECIST criteria; however, the HRs were correlated. Moreover, PD-1 inhibitors may have larger effects on OS than on PFS, which would be unprecedented in oncology therapeutics. These results support the rationale of using OS as the primary end point of future phase 3 trials of PD-1 inhibitors and discourage the use of PFS as a sole primary end point, as the latter may provide misleading information about the efficacy of these drugs.
Accepted for Publication: March 24, 2018.
Open Access: This is an open access article distributed under the terms of the CC-BY License. © 2018 Gyawali B et al. JAMA Network Open.
Corresponding Author: Bishal Gyawali, MD, PhD, Program on Regulation, Therapeutics, and Law, Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, 1620 Tremont St, Ste 3030, Boston, MA 02120 (firstname.lastname@example.org).
Author Contributions: Dr Gyawali had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: Gyawali.
Acquisition, analysis, or interpretation of data: All authors.
Drafting of the manuscript: Gyawali.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Gyawali, Hey.
Obtained funding: Kesselheim.
Administrative, technical, or material support: Gyawali.
Conflict of Interest Disclosures: Dr Kesselheim reported grants from the Laura and John Arnold Foundation during the conduct of the study and research support from the FDA Office of Generic Drugs and Division of Health Communication outside the scope of the submitted work.
Funding/Support: The Laura and John Arnold Foundation supported the research. Dr Kesselheim also receives research support from the Harvard Program in Therapeutic Science and the Engelberg Foundation.
Role of the Funder/Sponsor: The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Meeting Presentation: This research was presented at the European Society for Medical Oncology Congress 2017; September 8-12, 2017; Madrid, Spain.
Additional Contributions: Jessica M. Franklin, PhD, at Brigham and Women’s Hospital/Harvard Medical School, provided statistical inputs; she received no compensation.