Correlation Between Surrogate End Points and Overall Survival in a Multi-institutional Clinicogenomic Cohort of Patients With Non–Small Cell Lung or Colorectal Cancer

Key Points
Question: What surrogate end point for capturing worsening disease is most correlated with overall survival (OS) in large linked clinicogenomic data sets?
Findings: In this cohort study of patients with non–small cell lung cancer or colorectal cancer who initiated systemic therapy for advanced disease, progression-free survival based on both radiologist and medical oncologist assessment was more consistently correlated with OS than other candidate end points, including time to treatment discontinuation and time to next treatment.
Meaning: This study suggests that, based on its correlation with OS, progression-free survival based on both radiologist and medical oncologist assessment may be an optimal surrogate end point for analysis of observational clinicogenomic data in cancer research.


Introduction
For patients with cancer enrolled in clinical trials, the Response Evaluation Criteria in Solid Tumors, version 1.1, 1 are applied to ascertain treatment response and disease progression. However, outside of clinical trials, clinical outcomes are typically recorded only in the unstructured text of radiology reports and clinician progress notes. 2 This can present challenges to the reproducibility of cancer research that incorporates large quantities of molecular and genomic data now routinely generated across institutions. 3
Although overall survival (OS) constitutes a key outcome in cancer research, end points such as progression-free survival (PFS; time to progression or death), time to treatment discontinuation (TTD), and time to next treatment (TTNT) may also be relevant in some contexts. Progression can be assessed earlier than mortality, and PFS is less associated with subsequent lines of therapy, which can confound ascertainment of the association of any individual treatment with OS. However, the process of extracting PFS outcomes from electronic health record (EHR) data at scale is resource intensive, requiring review of thousands of clinical documents. Measurement of PFS in observational contexts is also not standardized regarding the definition of cancer progression. In contrast, TTD, defined as the time from initiation of a systemic therapy regimen to the date of treatment discontinuation or death, can be rapidly extracted from structured pharmacy data and has therefore been proposed as an alternative end point in the observational setting. 4,5 Time to next treatment, defined as time from initiation of a systemic therapy to the date of initiation of the first subsequent systemic therapy regimen 6-8 or death, can similarly be extracted from pharmacy data.
The utility of end points such as PFS, TTD, or TTNT, particularly when they are applied in lieu of, or as a surrogate for, OS, may be evaluated by measuring the correlation between the end point and OS. 9 For example, disease-free survival has become an accepted surrogate end point in adjuvant therapy clinical trials for colorectal cancer (CRC) owing to its high correlation with OS. 10,11 Understanding the correlation between PFS measures and OS, and other pragmatic end points (TTD or TTNT) and OS, specifically in the observational context may have increasing implications given the growing role of real-world evidence for regulatory purposes. 12
A structured framework is necessary to define progression outcomes using such real-world data. The PRISSMM (pathology, radiology and imaging, documentation of signs and symptoms, medical oncologist notes, and tumor markers) framework consists of directives for abstracting clinical outcomes from individual components of the medical record. 13 This system fosters transparency and reproducibility by defining outcome measures with specific attention to data provenance. To specifically inform research derived from linked clinical and genomic data, this study focused on a multi-institutional cohort of patients with advanced non-small cell lung cancer (NSCLC) or CRC whose tumor sequencing results were submitted to the American Association for Cancer Research's Project GENIE (Genomics Evidence Neoplasia Information Exchange). 3 The specific objective of this analysis was to report correlations between OS and TTD, TTNT, and PFS, as systematically ascertained from the EHR using the PRISSMM framework for medical record curation.

Cohort
The cohort for this analysis included patients with stage I to stage IV NSCLC or CRC whose tumors underwent genomic sequencing at Dana-Farber Cancer Institute, Memorial Sloan Kettering Cancer Center, Princess Margaret Cancer Centre, or the Vanderbilt-Ingram Cancer Center between January 1, 2014, and December 31, 2017. Patients consented to medical record review and genomic profiling of their tumor tissue at each institution; this supplemental retrospective analysis was approved by the Dana-Farber/Harvard Cancer Center institutional review board under a waiver of informed consent because this study presented no more than minimal risk to the participants. Outcomes were analyzed for patients who were either diagnosed initially with stage IV NSCLC or CRC and received at

Medical Record Curation
Curation of imaging reports and medical oncologist notes was performed according to the PRISSMM framework. 13 For each imaging report and medical oncologist assessment, curators were asked to record whether the radiologist or clinician described the presence of cancer, and if so, whether the cancer was improving and responding, stable, mixed, or worsening or progressing. Curators reviewed the text of radiologists' reports for computed tomography, magnetic resonance imaging, positron emission tomography, and nuclear medicine evaluations. Curators reviewed the first note per month from a medical oncologist; if one was not available, a note from an advanced practice clinician (nurse practitioner or physician assistant) was reviewed.

Outcomes

Time to Treatment Discontinuation
The index date for TTD was defined as the start date of the first systemic therapy regimen, consisting of a drug or group of drugs for which there was documentation from the treating oncologist of a plan for simultaneous administration for recurrent or metastatic disease. Dates of initiation and final administration of each drug in infusional regimens were curated. For NSCLC, the most common first-line infusional regimens were given every 3 weeks, so to more closely capture the point when a decision was reached to discontinue the regimen, the end date of infusional regimens was defined for this analysis as 3 weeks after the last drug in the regimen was administered. 15 For CRC, the most common first-line infusional regimens were given every 2 weeks, so the end date for infusional regimens was defined as 2 weeks after the last drug in the regimen was administered. For oral therapy, the end date was defined as the date on which the prescription expired or the medical oncologist documented discontinuation of the regimen, whichever came first. In primary analyses, death also constituted a treatment discontinuation event. Censoring was performed on the date patients were last known alive and receiving treatment.
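As an illustration only, the TTD rules described above can be sketched in Python. The function and variable names are hypothetical and do not come from the study's analysis code (which was written in R). The sketch assumes that each date has already been curated and that an ongoing regimen is represented by a missing end date.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

# Cycle lengths taken from the text: 3 weeks for NSCLC, 2 weeks for CRC.
CYCLE_DAYS = {"NSCLC": 21, "CRC": 14}


def infusional_end_date(cancer_type: str, last_administration: date) -> date:
    """End date of an infusional regimen: last dose plus one cycle length."""
    return last_administration + timedelta(days=CYCLE_DAYS[cancer_type])


def oral_end_date(prescription_expiry: Optional[date],
                  documented_stop: Optional[date]) -> Optional[date]:
    """End date of an oral regimen: whichever known date comes first."""
    known = [d for d in (prescription_expiry, documented_stop) if d is not None]
    return min(known) if known else None


def ttd(start: date,
        regimen_end: Optional[date],
        death: Optional[date],
        last_known_alive: date) -> Tuple[int, bool]:
    """Return (days since start, event flag) for TTD.

    Discontinuation or death counts as an event (primary analysis).
    Otherwise the patient is censored at the date last known alive
    and receiving treatment. regimen_end=None means still on treatment
    (an assumption of this sketch).
    """
    events = [d for d in (regimen_end, death) if d is not None]
    if events:
        return (min(events) - start).days, True
    return (last_known_alive - start).days, False
```

For example, an NSCLC regimen started on January 1 with a final infusion that same day would, under these rules, be assigned an end date 21 days later.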

Time to Next Treatment
In primary analyses, TTNT was defined as time from first-line treatment start to initiation of subsequent systemic therapy or death. Censoring was performed at the date patients were last known alive and free of subsequent therapy.

Progression-Free Survival
Four definitions of PRISSMM-derived PFS outcomes were evaluated: PFS-imaging (PFS-I; time to first worsening or progression documented in imaging report, or death), PFS-medical oncologist (PFS-M; time to first worsening or progression documented in medical oncologist assessment, or death), PFS-I-or-M (time to first indication of worsening or progression in imaging report or medical oncologist assessment, or death, whichever was earliest), and PFS-I-and-M (time from treatment start to worsening or progression having been documented in both an imaging report and a medical oncologist assessment, or death). The index date for PFS was defined as the start date of first-line therapy for recurrent or metastatic disease. Patients were censored on the date last known alive and free of disease progression.
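The four PFS definitions differ only in which progression date is used, which a short Python sketch can make concrete. The names below are hypothetical. The sketch also assumes that the PFS-I-and-M progression date is the later of the first imaging-documented and first oncologist-documented progression dates; the text does not spell out this rule, so it should be read as one plausible interpretation.

```python
from datetime import date
from typing import Dict, Optional, Tuple


def pfs_outcomes(start: date,
                 first_imaging_progression: Optional[date],
                 first_oncologist_progression: Optional[date],
                 death: Optional[date],
                 last_known_alive: date) -> Dict[str, Tuple[int, bool]]:
    """Return (days since start, event flag) for each PFS definition."""
    img, onc = first_imaging_progression, first_oncologist_progression

    # PFS-I-or-M: earliest progression documented by either source.
    either = min((d for d in (img, onc) if d is not None), default=None)

    # PFS-I-and-M (assumed rule): the date by which both sources have
    # documented progression, i.e. the later of the two first dates.
    both = max(img, onc) if img is not None and onc is not None else None

    def outcome(progression: Optional[date]) -> Tuple[int, bool]:
        # Progression or death is an event; otherwise censor at last follow-up.
        events = [d for d in (progression, death) if d is not None]
        if events:
            return (min(events) - start).days, True
        return (last_known_alive - start).days, False

    return {
        "PFS-I": outcome(img),
        "PFS-M": outcome(onc),
        "PFS-I-or-M": outcome(either),
        "PFS-I-and-M": outcome(both),
    }
```

Under this reading, PFS-I-and-M can never be shorter than the other three measures for a given patient, which is consistent with it having the longest median time to event in the results.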

Statistical Analysis

End Point Correlations
Analyses were conducted on January 5, 2021. Overall survival, TTD, TTNT, and PFS measures were estimated using the Kaplan-Meier method. Correlations and 95% CIs between OS and (1) TTD, (2) TTNT, or (3) each PRISSMM-derived PFS outcome were measured using normal scores rank correlation, calculated using the iterative multiple imputation approach for analysis of correlations between 2 partially censored failure times. 16 Among patients with NSCLC, correlations between OS and alternative end points were further explored by systemic therapy regimen category, including (1) all regimens, (2) cytotoxic chemotherapy only (with or without an anti-vascular endothelial growth factor agent), (3) checkpoint inhibitor immunotherapy only, or (4) oral targeted therapy. Among patients with CRC, only the analysis of all regimens was performed, because most regimens in that cohort were chemotherapy based.
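The rank-based idea behind the correlation measure can be illustrated on fully observed data. The following Python sketch computes a normal scores (van der Waerden type) rank correlation for two uncensored samples. It is a deliberate simplification and not the iterative multiple imputation procedure used in the study, which additionally handles censored observations. All function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def normal_scores(values: Sequence[float]) -> List[float]:
    """Map each rank r to the standard normal quantile at r / (n + 1)."""
    n = len(values)
    nd = NormalDist()
    return [nd.inv_cdf(r / (n + 1)) for r in average_ranks(values)]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; undefined if either input has zero variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def normal_scores_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of the normal scores of x and y (no censoring)."""
    return pearson(normal_scores(x), normal_scores(y))
```

Because only ranks enter the calculation, any strictly increasing transformation of either end point leaves the coefficient unchanged, which is one reason rank-based measures suit skewed time-to-event data.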
In sensitivity analyses, correlations were recalculated after restricting to patients who underwent genomic testing prior to starting first-line therapy. Because all patients in this cohort had genomic testing as an inclusion criterion, this procedure restricted calculations to the time during which patients were at risk for death, and it was performed in lieu of left truncation, which the statistical package used for correlation calculations could not incorporate. In a second set of sensitivity analyses, patients who died without experiencing disease progression or a treatment discontinuation or change event were excluded from the denominator to assess the extent to which correlations between candidate outcomes and OS were owing to mortality rather than progression or treatment discontinuation events. Calculations were performed using R, version 3.6.1 (R Group for Statistical Computing) and the SurvCorr R package, version 1.0. 17 All P values were from 2-sided tests and results were deemed statistically significant at P < .05.

Patient-Level End Point Correlations After First-Line Therapy

Non-Small Cell Lung Cancer
The median OS after initiation of any first-line therapy was 28.9 months (interquartile range [IQR], 12.0-66.6 months) (Table 2); in a sensitivity analysis restricted to 375 patients starting therapy after their genomic testing report, median OS was 22.8 months (IQR, 9.5-55.4 months) (eTable 1 in Supplement 1). Among all 1161 patients, TTD yielded the shortest median time to event, at 3.6 months (IQR, 1.6-8.5 months), while PFS-I-and-M yielded the longest median time to event at 9.6 months (IQR, 4.5-20.8 months) (Table 2).
In a second sensitivity analysis in which patients who died before experiencing another outcome event were excluded from correlation calculations, death in the absence of progression within this cohort was uncommon, even for PFS-I-and-M, which by definition has the longest time to event among the candidate PFS measures (ie, 23 of 1161 patients [2.0%]) and for TTNT, which could be impacted by high rates of death without receiving a subsequent line of therapy (25 of 1161 patients [2.2%]). Correlation coefficients were similar to the primary analysis (eTable 3 in Supplement 1).

Colorectal Cancer
The median OS after initiation of any first-line therapy for patients with CRC was 42.0 months (IQR, 22.8-83.8 months) (Table 2); in a sensitivity analysis restricted to 160 patients starting therapy after their genomic testing report to remove time when patients were not at risk for death, it was not ). Among all 1150 patients, TTD yielded the shortest median time to event, at 4.3 months (IQR, 2.3-6.5 months), while PFS-I-and-M yielded the longest median time to event at 14.4 months (IQR, 8.6-28.9 months) (Table 2).
Patterns of correlation between end points for CRC were similar to those in the NSCLC cohort.
In a second sensitivity analysis in which patients who died before experiencing another outcome event were excluded from correlation calculations, death in the absence of progression remained uncommon, even for PFS-I-and-M (ie, 8 of 1150 patients [0.7%]) and for TTNT (7 of 1150 patients [0.6%]). Correlation coefficients were similar to the primary analysis (eTable 3 in Supplement 1).

Discussion
In this multi-institutional analysis of candidate clinical end points among patients undergoing systemic therapy for advanced, genomically characterized NSCLC or CRC, TTD was poorly correlated with OS and TTNT was modestly correlated with OS. Progression-free survival estimated based on abstraction of both imaging and medical oncologist notes (PFS-I-and-M) was most consistently correlated with OS. The magnitude of these patient-level correlations between alternative end points and OS was similar to that observed in analyses of clinical trial data 4 and an observational cohort 8,18 of patients with NSCLC, although the examination of outcomes derived from specific components of the health record across cancer types was a novel feature of the present study. Surrogate outcomes may be useful in observational data sets even when OS data are available, particularly when researchers study a specific line of therapy in contexts in which survival through multiple lines of therapy is common. The consistency of the correlation between PFS-I-and-M and OS in this analysis implies that when a surrogate outcome in clinicogenomic analyses for patients with NSCLC or CRC is required, PFS-I-and-M may be an optimal way to define real-world PFS.
Although this analysis focused on a clinicogenomic data set, it has implications for analysis of real-world evidence in observational cancer research in general. Time to treatment discontinuation and TTNT are attractive end points because they can often be measured with minimal manual review of medical records using structured pharmacy records. In contrast, PFS-I and PFS-M require review of radiology reports and oncologist notes to identify evidence of progression. These results suggest that, in some contexts, a consistent framework for abstraction of such clinical end points, with attention to data provenance 13 (whether progression is documented on imaging results, by a clinician, or both), may be required for observational cancer research, rather than relying on TTD or TTNT. This is a substantial challenge because the careful manual abstraction of EHR data is very resource intensive. One solution is to develop validated natural language processing methods for extracting these end points from the EHR 19; the feasibility of such approaches for both imaging reports 20,21 and clinician notes 22,23 has been previously demonstrated.
Correlations between surrogate end points and OS likely varied by treatment modality owing to several clinical factors. For example, planned discontinuation of therapy after a given number of cycles, prior to clinical progression, could partially account for a poor correlation between TTD and OS among patients receiving chemotherapy. An extreme example of this dynamic would include therapy delivered once, without immediate plans for regular administration of the drug, as might occur when floxuridine is used as local therapy for liver metastases in patients with CRC. 24 In addition, discontinuation of treatment for toxic effects may further diminish the correlation between TTD and OS if toxic effects are less associated with mortality risk than is progressive disease. This dynamic may explain the modest correlation between TTNT and OS as well because treatment discontinued for toxic effects may be followed by initiation of other treatment prior to progression.
On the other hand, for patients with NSCLC, TTD and OS may have been correlated for patients receiving immunotherapy in this cohort if single-agent checkpoint inhibitor treatment had a low rate of severe adverse events 18,25 and therefore less early discontinuation. In this analysis, the patient-level correlation between TTD and OS among patients receiving oral targeted therapy for NSCLC was higher than that among patients receiving chemotherapy, again potentially owing to lower severe adverse event rates with targeted therapy, 26,27 but PFS-I-and-M remained numerically most correlated with OS after oral targeted therapy as well.

Strengths and Limitations
This study has some strengths, including its multi-institutional cohort of patients with 2 types of common solid tumors who had linked clinical and genomic data. In such data sets, thorough evaluation of clinical end points is particularly critical, because the data are rich enough to enable researchers to ask a wide variety of questions about the association between genomic markers, treatment exposures, and outcomes. This study also has some limitations, including forms of selection bias. Contributing institutions were 4 North American academic centers, which may not be representative of institutions where patients receive care in the community or in other parts of the world. These patients were also selected for tumor genomic profiling, such that, particularly for NSCLC, those whose tumors may be less likely to harbor targetable mutations may have been underrepresented.
Patients do not always receive all their care within 1 institution or health system. Data on outcome events ascertained outside the primary academic centers for this analysis could potentially have been less complete than data for internally ascertained outcomes. In addition, patients who are included in a clinicogenomic cohort, by definition, very rarely have tumor genomic profiling performed after death. This can introduce lead time between diagnosis and genomic sequencing during which patients included in the data set could not have died, inflating survival estimates.
Applying methods for handling left truncation (entering patients into the risk set of a time-to-event analysis only after they have undergone genomic profiling 28) is a solution to this problem, but measuring correlations between left-truncated time-to-event outcomes requires the development of novel statistical methods. This is especially a challenge if tumor genotyping is performed selectively at the point of clinical progression, which can result in informative left truncation, or temporal selection bias, in which patients selectively enter the cohort specifically because they are at increasing risk of a poor outcome. 29 Nevertheless, these results were similar in sensitivity analyses that were restricted to patients receiving treatments initiated after genomic testing, such that all patients were in the risk set at the index time point. In addition, in observational data sets, the frequency of clinical end point ascertainment is based on the judgment of a patient's clinician and may itself be associated with clinical risk over time; the association of this phenomenon with outcome estimates requires further study.
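To illustrate what left truncation does to a survival estimate (not to reproduce any analysis in this study), the following Python sketch implements a Kaplan-Meier estimator with optional delayed entry. The function name, argument names, and the convention that a patient is at risk at time t when entry < t <= follow-up time are assumptions of the sketch.

```python
from typing import List, Optional, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[bool],
                 entry: Optional[Sequence[float]] = None
                 ) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each distinct event time.

    times[i]  : follow-up time of patient i (event or censoring)
    events[i] : True if the follow-up ended with the event
    entry[i]  : time at which patient i joins the risk set, e.g. the
                date of genomic profiling; defaults to 0 for everyone
    """
    n = len(times)
    if entry is None:
        entry = [0.0] * n
    surv = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e}):
        # Risk set: entered before t and still under follow-up at t.
        at_risk = sum(1 for i in range(n) if entry[i] < t <= times[i])
        died = sum(1 for i in range(n)
                   if events[i] and times[i] == t and entry[i] < t)
        if at_risk > 0:
            surv *= 1 - died / at_risk
            curve.append((t, surv))
    return curve
```

With four patients who die at months 2, 4, 6, and 8, the naive estimate at month 2 is 0.75. If the patient who dies at month 4 only entered the cohort at month 3, the risk set at month 2 shrinks to three and the estimate drops to about 0.67, showing how ignoring delayed entry can inflate survival.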
Despite the relatively large size of the cohorts analyzed in this study, there were overlapping 95% CIs for some comparisons, particularly between PFS-I-and-M and PFS-M for both NSCLC and CRC, and by category of systemic therapy, in which each category had a more limited sample size. An even larger study would be needed to formally compare the correlations between these 2 particular end points and OS and evaluate the generalizability of all observed correlations to additional cancer types and categories of treatment, because the validity of surrogate end points may be specific to disease contexts 30 and clinical questions. 31 More broadly, substantial literature exists on approaches to evaluating the validity of surrogate end points in the context of clinical trials, 31,32 focused largely on measuring correlations between treatment effects. Many observational research questions, particularly in the clinicogenomic context, are focused less on treatment effect comparisons than on biomarker associations. More robust frameworks for evaluating surrogate end points in the observational context 33 are needed.

Conclusions
In this observational cohort of patients with genomically profiled advanced NSCLC or CRC, TTD was inconsistently correlated, and TTNT moderately correlated, with OS on a per-patient basis. Progression-free survival based on review of both imaging reports and medical oncologist notes was most correlated with OS among patients with either type of cancer. Although TTD and TTNT are straightforward to calculate owing to the availability of structured pharmacy data in the EHR, these results indicate that researchers should be cautious about applying such end points if a surrogate outcome consistently correlated with OS is required. In such contexts, PFS-I-and-M may be an optimal choice.

Figure 2. Kaplan-Meier Curves for Candidate Outcome Measures After First-Line Systemic Therapy for Recurrent or Metastatic Colorectal Cancer

least 1 systemic therapy regimen, or who initially received a diagnosis of stage I to stage III NSCLC or CRC and received at least 1 systemic therapy regimen that began at least 6 months after initial diagnosis (assumed to represent treatment for recurrent disease). Patients were followed up through August 31, 2020 (NSCLC), and October 31, 2020 (CRC). This study is reported according to the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guideline. 14

JAMA Network Open. 2021;4(7):e2117547. doi:10.1001/jamanetworkopen.2021.17547 (July 26, 2021)

Table 2. Correlations Between Clinical End Points and OS by Therapy Regimen Category
Abbreviations: PFS-I-or-M, PFS based on either imaging or medical oncologist ascertainment, whichever came first; PFS-M, PFS based on medical oncologist ascertainment only; TTD, time to treatment discontinuation (or death); TTNT, time to next treatment (or death).