Figure 1.
Observed 180-Day Survival From the Initiation of Palliative Chemotherapy

Data are stratified by decile of model-predicted mortality risk. Decile 1 denotes the highest predicted risk decile; decile 10, the lowest. Overall denotes overall mean survival among all patients, irrespective of model-predicted risk.

Figure 2.
One-Year Mortality After Chemotherapy Initiation

A-D, Mortality data from 4 randomized clinical trials40-43 compared with the machine learning model predictions for patients in the validity sample. The randomized clinical trials compared bevacizumab and oxaliplatin,40 pemetrexed and carboplatin,41 etoposide and carboplatin,42 and carboplatin and paclitaxel.43 E-H, Mortality estimates from the National Cancer Institute’s Surveillance, Epidemiology, and End Results (SEER) program compared with the machine learning model predictions. The 45° dotted lines denote equivalence of observed and estimated mortality. Orange lines and shaded 95% CIs show observed 1-year mortality against quintiles of model-predicted 30-day mortality risk. Blue lines and shaded 95% CIs show observed 1-year mortality against predictions from 1-year mortality and 95% CI for trial patients taking a given regimen and 1-year mortality estimates from the SEER program, by type, age, sex, and race/ethnicity.

Table 1.  
Patient Characteristics of Model Derivation and Validation Sets
Table 2.  
Model Performance in Selected Subgroups
Table 3.  
Selected Predictors by Risk Decile and Model Variance Explained
1. Emanuel EJ, Young-Xu Y, Levinsky NG, Gazelle G, Saynina O, Ash AS. Chemotherapy use among Medicare beneficiaries at the end of life. Ann Intern Med. 2003;138(8):639-643. doi:10.7326/0003-4819-138-8-200304150-00011
2. Earle CC, Neville BA, Landrum MB, Ayanian JZ, Block SD, Weeks JC. Trends in the aggressiveness of cancer care near the end of life. J Clin Oncol. 2004;22(2):315-321. doi:10.1200/JCO.2004.08.136
3. Earle CC, Landrum MB, Souza JM, Neville BA, Weeks JC, Ayanian JZ. Aggressiveness of cancer care near the end of life: is it a quality-of-care issue? J Clin Oncol. 2008;26(23):3860-3866. doi:10.1200/JCO.2007.15.8253
4. Saito AM, Landrum MB, Neville BA, Ayanian JZ, Earle CC. The effect on survival of continuing chemotherapy to near death. BMC Palliat Care. 2011;10:14. doi:10.1186/1472-684X-10-14
5. Prigerson HG, Bao Y, Shah MA, et al. Chemotherapy use, performance status, and quality of life at the end of life. JAMA Oncol. 2015;1(6):778-784. doi:10.1001/jamaoncol.2015.2378
6. Schnipper LE, Smith TJ, Raghavan D, et al. American Society of Clinical Oncology identifies five key opportunities to improve care and reduce costs: the top five list for oncology. J Clin Oncol. 2012;30(14):1715-1724. doi:10.1200/JCO.2012.42.8375
7. National Quality Forum. Cancer measures. https://www.qualityforum.org/News_And_Resources/Press_Releases/2012/NQF_Endorses_Cancer_Measures.aspx. August 10, 2012. Accessed July 1, 2017.
8. Greer JA, Pirl WF, Jackson VA, et al. Effect of early palliative care on chemotherapy use and end-of-life care in patients with metastatic non–small-cell lung cancer. J Clin Oncol. 2012;30(4):394-400. doi:10.1200/JCO.2011.35.7996
9. Satariano WA, Ragland DR. The effect of comorbidity on 3-year survival of women with primary breast cancer. Ann Intern Med. 1994;120(2):104-110. doi:10.7326/0003-4819-120-2-199401150-00002
10. Hall WH, Jani AB, Ryu JK, Narayan S, Vijayakumar S. The impact of age and comorbidity on survival outcomes and treatment patterns in prostate cancer. Prostate Cancer Prostatic Dis. 2005;8(1):22-30. doi:10.1038/sj.pcan.4500772
11. Lee L, Cheung WY, Atkinson E, Krzyzanowska MK. Impact of comorbidity on chemotherapy use and outcomes in solid tumors: a systematic review. J Clin Oncol. 2011;29(1):106-117. doi:10.1200/JCO.2010.31.3049
12. van Gestel YRBM, Lemmens VEPP, de Hingh IHJT, et al. Influence of comorbidity and age on 1-, 2-, and 3-month postoperative mortality rates in gastrointestinal cancer patients. Ann Surg Oncol. 2013;20(2):371-380. doi:10.1245/s10434-012-2663-1
13. Sarfati D, Koczwara B, Jackson C. The impact of comorbidity on cancer and its treatment. CA Cancer J Clin. 2016;66(4):337-350. doi:10.3322/caac.21342
14. de Glas NA, van de Water W, Engelhardt EG, et al. Validity of Adjuvant! online program in older patients with breast cancer: a population-based study. Lancet Oncol. 2014;15(7):722-729. doi:10.1016/S1470-2045(14)70200-1
15. Hoffmann TC, Del Mar C. Clinicians’ expectations of the benefits and harms of treatments, screening, and tests: a systematic review. JAMA Intern Med. 2017;177(3):407-419. doi:10.1001/jamainternmed.2016.8254
16. Weeks JC, Cook EF, O’Day SJ, et al. Relationship between cancer patients’ predictions of prognosis and their treatment preferences. JAMA. 1998;279(21):1709-1714. doi:10.1001/jama.279.21.1709
17. Rose JH, O’Toole EE, Dawson NV, et al. Perspectives, preferences, care practices, and outcomes among older and middle-aged patients with late-stage cancer. J Clin Oncol. 2004;22(24):4907-4917. doi:10.1200/JCO.2004.06.050
18. Weeks JC, Catalano PJ, Cronin A, et al. Patients’ expectations about effects of chemotherapy for advanced cancer. N Engl J Med. 2012;367(17):1616-1625. doi:10.1056/NEJMoa1204410
19. Temel JS, Greer JA, Admane S, et al. Longitudinal perceptions of prognosis and goals of therapy in patients with metastatic non–small-cell lung cancer: results of a randomized study of early palliative care. J Clin Oncol. 2011;29(17):2319-2326. doi:10.1200/JCO.2010.32.4459
20. Glare P, Virik K, Jones M, et al. A systematic review of physicians’ survival predictions in terminally ill cancer patients. BMJ. 2003;327(7408):195-198. doi:10.1136/bmj.327.7408.195
21. Stone PC, Lund S. Predicting prognosis in patients with advanced cancer. Ann Oncol. 2007;18(6):971-976. doi:10.1093/annonc/mdl343
22. Brundage MD, Davidson JR, Mackillop WJ. Trading treatment toxicity for survival in locally advanced non–small cell lung cancer. J Clin Oncol. 1997;15(1):330-340. doi:10.1200/JCO.1997.15.1.330
23. Silvestri G, Pritchard R, Welch HG. Preferences for chemotherapy in patients with advanced non–small cell lung cancer: descriptive study based on scripted interviews. BMJ. 1998;317(7161):771-775. doi:10.1136/bmj.317.7161.771
24. Hirose T, Yamaoka T, Ohnishi T, et al. Patient willingness to undergo chemotherapy and thoracic radiotherapy for locally advanced non–small cell lung cancer. Psychooncology. 2009;18(5):483-489. doi:10.1002/pon.1450
25. Keating NL, Landrum MB, Rogers SO Jr, et al. Physician factors associated with discussions about end-of-life care. Cancer. 2010;116(4):998-1006. doi:10.1002/cncr.24761
26. Keating NL, Beth Landrum M, Arora NK, et al. Cancer patients’ roles in treatment decisions: do characteristics of the decision influence roles? J Clin Oncol. 2010;28(28):4364-4370. doi:10.1200/JCO.2009.26.8870
27. Liu P-H, Landrum MB, Weeks JC, et al. Physicians’ propensity to discuss prognosis is associated with patients’ awareness of prognosis for metastatic cancers. J Palliat Med. 2014;17(6):673-682. doi:10.1089/jpm.2013.0460
28. National Cancer Institute Surveillance, Epidemiology, and End Results Program. Cancer statistics. https://seer.cancer.gov/statistics. Accessed July 1, 2017.
29. Gwilliam B, Keeley V, Todd C, et al. Development of prognosis in palliative care study (PiPS) predictor models to improve prognostication in advanced cancer: prospective cohort study. BMJ. 2011;343:d4920. doi:10.1136/bmj.d4920
30. Rumball-Smith J, Shekelle PG, Bates DW. Using the electronic health record to understand and minimize overuse. JAMA. 2017;317(3):257-258. doi:10.1001/jama.2016.18609
31. Healthcare Cost and Utilization Project (HCUP). Clinical Classifications Software for Services and Procedures. https://www.hcup-us.ahrq.gov/toolssoftware/ccs_svcsproc/ccssvcproc.jsp. Accessed January 19, 2017.
32. Cooper GS, Yuan Z, Stange KC, Amini SB, Dennis LK, Rimm AA. The utility of Medicare claims data for measuring cancer stage. Med Care. 1999;37(7):706-711. doi:10.1097/00005650-199907000-00010
33. Meurer WJ, Tolles J. Logistic regression diagnostics: understanding how well a model predicts outcomes. JAMA. 2017;317(10):1068-1069. doi:10.1001/jama.2016.20441
34. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837-845. doi:10.2307/2531595
35. Makar M, Ghassemi M, Cutler DM, Obermeyer Z. Short-term mortality prediction for elderly patients using Medicare claims data. Int J Mach Learn Comput. 2015;5(3):192-197. doi:10.7763/IJMLC.2015.V5.506
36. Goldman L, Weinberg M, Weisberg M, et al. A computer-derived protocol to aid in the diagnosis of emergency room patients with acute chest pain. N Engl J Med. 1982;307(10):588-596. doi:10.1056/NEJM198209023071004
37. Chen T, He T. xgboost: eXtreme Gradient Boosting. https://cran.r-project.org/web/packages/xgboost/vignettes/xgboost.pdf. May 15, 2018. Accessed July 1, 2017.
38. Shouval R, Labopin M, Bondi O, et al. Prediction of allogeneic hematopoietic stem-cell transplantation mortality 100 days after transplantation using a machine learning algorithm: a European Group for Blood and Marrow Transplantation Acute Leukemia Working Party Retrospective Data Mining Study. J Clin Oncol. 2015;33(28):3144-3151. doi:10.1200/JCO.2014.59.1339
39. Gagne JJ, Glynn RJ, Avorn J, Levin R, Schneeweiss S. A combined comorbidity score predicted mortality in elderly patients better than existing scores. J Clin Epidemiol. 2011;64(7):749-759. doi:10.1016/j.jclinepi.2010.10.004
40. Saltz LB, Clarke S, Díaz-Rubio E, et al. Bevacizumab in combination with oxaliplatin-based chemotherapy as first-line therapy in metastatic colorectal cancer: a randomized phase III study. J Clin Oncol. 2008;26(12):2013-2019. doi:10.1200/JCO.2007.14.9930
41. Zukin M, Barrios CH, Pereira JR, et al. Randomized phase III trial of single-agent pemetrexed versus carboplatin and pemetrexed in patients with advanced non–small-cell lung cancer and Eastern Cooperative Oncology Group performance status of 2. J Clin Oncol. 2013;31(23):2849-2853. doi:10.1200/JCO.2012.48.1911
42. Noda K, Nishiwaki Y, Kawahara M, et al; Japan Clinical Oncology Group. Irinotecan plus cisplatin compared with etoposide plus cisplatin for extensive small-cell lung cancer. N Engl J Med. 2002;346(2):85-91. doi:10.1056/NEJMoa003034
43. Clark JI, Hofmeister C, Choudhury A, et al. Phase II evaluation of paclitaxel in combination with carboplatin in advanced head and neck carcinoma. Cancer. 2001;92(9):2334-2340. doi:10.1002/1097-0142(20011101)92:9<2334::AID-CNCR1580>3.0.CO;2-3
44. Wallington M, Saxon EB, Bomb M, et al. 30-day mortality after systemic anticancer treatment for breast and lung cancer in England: a population-based, observational study. Lancet Oncol. 2016;17(9):1203-1216. doi:10.1016/S1470-2045(16)30383-7
45. Moreno-Torres JG, Raeder T, Alaiz-Rodríguez R, Chawla NV, Herrera F. A unifying view on dataset shift in classification. Pattern Recognit. 2012;45(1):521-530. doi:10.1016/j.patcog.2011.06.019
    Original Investigation
    Oncology
    July 27, 2018

    Development and Application of a Machine Learning Approach to Assess Short-term Mortality Risk Among Patients With Cancer Starting Chemotherapy

    Author Affiliations
    • 1Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts
    • 2Brigham and Women’s Hospital, Boston, Massachusetts
    • 3Harvard Medical School, Boston, Massachusetts
    • 4Division of Hematology and Oncology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania
    • 5Leonard Davis Institute of Health Economics, Philadelphia, Pennsylvania
    JAMA Network Open. 2018;1(3):e180926. doi:10.1001/jamanetworkopen.2018.0926
    Key Points

    Question  Can a machine learning algorithm applied to electronic health record data predict patients’ short-term risk of death at the time that they begin chemotherapy?

    Findings  In this cohort study of 26 946 patients with cancer starting 51 774 discrete chemotherapy regimens, those at high risk of 30-day mortality were accurately identified across palliative and curative chemotherapy regimens and many types and stages of cancer. The algorithm was more accurate than predictions based on randomized clinical trials or population-based registry data.

    Meaning  A machine learning algorithm accurately identified individuals at high risk of short-term mortality and may help to guide patient and physician decisions about chemotherapy initiation and advance care planning.

    Abstract

    Importance  Patients with cancer who die soon after starting chemotherapy incur costs of treatment without the benefits. Accurately predicting mortality risk before administering chemotherapy is important, but few patient data–driven tools exist.

    Objective  To create and validate a machine learning model that predicts mortality in a general oncology cohort starting new chemotherapy, using only data available before the first day of treatment.

    Design, Setting, and Participants  This retrospective cohort study of patients at a large academic cancer center from January 1, 2004, through December 31, 2014, determined date of death by linkage to Social Security data. The model was derived using data from 2004 through 2011, and performance was measured on nonoverlapping data from 2012 through 2014. The analysis was conducted from June 1 through August 1, 2017. Participants included 26 946 patients starting 51 774 new chemotherapy regimens.

    Main Outcomes and Measures  Thirty-day mortality from the first day of a new chemotherapy regimen. Secondary outcomes included model discrimination by predicted mortality risk decile among patients receiving palliative chemotherapy, and 180-day mortality from the first day of a new chemotherapy regimen.

    Results  Among the 26 946 patients included in the analysis, mean age was 58.7 years (95% CI, 58.5-58.9 years); 61.1% were female (95% CI, 60.4%-61.9%); and 86.9% were white (95% CI, 86.4%-87.4%). Thirty-day mortality from chemotherapy start was 2.1% (95% CI, 1.9%-2.4%). Among the 9114 patients in the validation set, the most common primary cancers were breast (21.1%; 95% CI, 20.2%-21.9%), colorectal (19.3%; 95% CI, 18.5%-20.2%), and lung (18.0%; 95% CI, 17.2%-18.8%). Model predictions were accurate for all patients (area under the curve [AUC], 0.940; 95% CI, 0.930-0.951). Predictions for patients starting palliative chemotherapy (46.6% of regimens; 95% CI, 45.8%-47.3%), for whom prognosis is particularly important, remained highly accurate (AUC, 0.924; 95% CI, 0.910-0.939). To illustrate model discrimination, patients initiating palliative chemotherapy were ranked by model-predicted mortality risk, and observed mortality was calculated by risk decile. Thirty-day mortality in the highest-risk decile was 22.6% (95% CI, 19.6%-25.6%); in the lowest-risk decile, no patients died. Predictions remained accurate across all primary cancers, stages, and chemotherapies, even for clinical trial regimens that first appeared in years after the model was trained (AUC, 0.942; 95% CI, 0.882-1.000). The same model also performed well for prediction of 180-day mortality (AUC for all patients, 0.870 [95% CI, 0.862-0.877]; highest- vs lowest-risk decile mortality, 74.8% [95% CI, 72.7%-77.0%] vs 0.2% [95% CI, 0.01%-0.4%]). Predictions were more accurate than estimates from randomized clinical trials of individual chemotherapies or the Surveillance, Epidemiology, and End Results data set.

    Conclusions and Relevance  A machine learning algorithm using electronic health record data accurately predicted short-term mortality among patients starting chemotherapy. Further research is necessary to determine the generalizability and feasibility of applying this algorithm in clinical settings.

    Introduction

    Chemotherapy lowers the risk of recurrence in early-stage cancers and can improve survival and symptoms in later-stage disease. Balancing these benefits against chemotherapy’s considerable risks is challenging. Increasing evidence suggests that chemotherapy is too often started too late in the cancer disease trajectory,1-4 and many patients die soon after initiating treatment. These patients experience burdensome symptoms without many of the potential benefits of chemotherapy.5 National organizations now track the proportion of patients who die within 2 weeks of receiving chemotherapy as a marker of poor quality of care,6,7 and this number has been increasing rapidly.1,8

    A key factor underlying these trends is the difficulty of accurately identifying the risk of serious adverse events, especially death, before initiating chemotherapy. Adverse effects of chemotherapy are variable, and the influence of comorbidities is complex; thus, the risk calculus of administering chemotherapy is challenging.9-13 Cognitive biases also lead to underestimation of the risk of death,14,15 particularly in patients with metastatic cancer,16,17 who often believe that their disease is curable.18,19 Physicians do not accurately estimate prognosis in patients with cancer,20,21 and overly optimistic estimates can influence patients’ chemotherapy decisions.22-27

    To estimate mortality before initiation of chemotherapy, physicians may reference randomized clinical trial (RCT) data for individual regimens or population-level data such as the Surveillance, Epidemiology, and End Results (SEER) data set to obtain mortality risk by age, sex, and primary cancer.14,28 Although informative, these tools provide mortality estimates for broad populations of patients and often do not accurately estimate a specific individual’s mortality. Individualized decision support tools exist29 but require a substantial investment of time and resources; these tools require clinicians to collect and enter data not readily available in existing records, which limits the number of variables that can be used and adds complexity to workflows.

    There is considerable enthusiasm for the role of advanced algorithms to improve prediction; just as modern electronic health records (EHRs) pull complex data for clinicians to use in real time, algorithms could pull and process these data in parallel, presenting accurate probability forecasts to clinicians and patients.30 However, little evidence suggests that such algorithms can provide meaningful inputs to clinical decision making in cancer or elsewhere.

    New chemotherapy is a critical event in the disease trajectory of cancer, and objective predictions of short-term mortality at this time could be useful to physicians and patients in several ways. Accurate forecasts of the risks of mortality and adverse events could inform discussions of risks and benefits of chemotherapy, particularly for patients undergoing palliative chemotherapy, and could help guide important decisions regarding advance care planning and palliative care consultation. In this study, we developed and applied a machine learning algorithm to predict near-term mortality risk in a large cohort of patients with cancer starting new chemotherapy regimens.

    Methods
    Study Population

    We obtained EHR data for all patients receiving chemotherapy at the Dana-Farber/Brigham and Women’s Cancer Center (DF/BWCC), Boston, Massachusetts, from January 1, 2004, through December 31, 2014. We determined date of death by linking to the Social Security Administration’s Death Master File. We classified patients by primary cancer and presence of distant-stage disease, determined using registry data (for patients diagnosed at DF/BWCC) and International Classification of Diseases, Ninth Revision (ICD-9) codes for metastases (for patients not diagnosed at DF/BWCC or who did not have registry data and to identify progression to distant-stage disease in those previously diagnosed at DF/BWCC).31 Although diagnosis codes have limitations for determination of cancer stage, they are generally believed to provide reliable identification of the presence or the absence of distant-stage disease.32 Our study followed the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) checklist for prediction model development and validation (eMethods in the Supplement). The institutional review boards of Dana-Farber Cancer Institute and Partners HealthCare, Boston, approved this study and granted a waiver of informed consent from study participants.

    Statistical Analysis
    Outcomes

    Data were analyzed from June 1 through August 1, 2017. Our primary outcome was death within 30 days of starting new systemic chemotherapy regimens. Secondary outcomes were 30-day mortality in prespecified subgroups of interest (described later) and overall 180-day mortality. We constructed our data set at the patient–chemotherapy regimen level, such that each regimen was a new observation.

    Model Performance

    Machine learning models have the potential to overfit or produce overly optimistic estimates of model performance based on spurious correlations in development data. We thus report results only in an independent validation set, which played no role in model development; as such, overfitting would only lead to poorer model performance in the validation set. Specifically, we used data from 2004 through 2011 for model derivation and data from 2012 through 2014 for model validation. Because our data set was constructed at the patient–chemotherapy regimen level, observations describing different chemotherapy regimens in the same patient are not independent. For patients whose observations appeared before and after January 1, 2012, we randomly assigned all observations from a given patient to the derivation or the validation set, so that no patient appeared in both sets.
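
The patient-level assignment described above can be illustrated with a minimal Python sketch. It is not the study's code: the function name `split_by_patient`, the input layout, and the 50/50 random choice for patients who straddle the cutoff are assumptions made for illustration (the text does not state the assignment probabilities).

```python
import random
from collections import defaultdict
from datetime import date

CUTOFF = date(2012, 1, 1)  # validation period starts here, per the text


def split_by_patient(regimens, seed=0):
    """regimens: list of (patient_id, start_date) tuples, one per regimen.
    Returns a dict mapping patient_id to 'derivation' or 'validation'."""
    rng = random.Random(seed)
    dates = defaultdict(list)
    for pid, start in regimens:
        dates[pid].append(start)

    assignment = {}
    for pid in sorted(dates):
        before = any(d < CUTOFF for d in dates[pid])
        after = any(d >= CUTOFF for d in dates[pid])
        if before and after:
            # Patient has regimens on both sides of the cutoff: send ALL of
            # them to one randomly chosen set (assumed equal odds).
            assignment[pid] = rng.choice(["derivation", "validation"])
        else:
            assignment[pid] = "derivation" if before else "validation"
    return assignment
```

Because the assignment is keyed on the patient rather than the regimen, no patient can contribute observations to both sets, which is the property the paragraph above relies on.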

    Statistical Tests

    Our primary measure of model performance was the area under the receiver operating characteristic curve (AUC),33 which we calculated by comparing the mortality probability estimate from the machine learning model with observed mortality. We calculated 95% CIs of the AUC following the method of DeLong et al.34 We report AUC overall and in subgroups of clinical interest, notably age, sex, race/ethnicity, distant-stage disease, individual primary cancers, chemotherapy lines and regimens, and chemotherapy intent (palliative vs curative, identified by the treating physician and recorded as an EHR flag). To benchmark against existing prognostic models, we obtained 1-year mortality estimates from large RCTs of specific chemotherapy regimens and from the SEER program for available subgroups of patients. To give more clinically relevant metrics of predictive accuracy, we also present mortality rates in given deciles of model-predicted risk, typically highest and lowest. When presenting variable summary statistics, we report CIs for means and proportions and the first and third quartiles for medians.
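
For readers unfamiliar with the AUC, the following Python sketch computes it directly from its pairwise definition. It is an illustration only, not the study's implementation: the function name `auc` is hypothetical, and the DeLong confidence intervals used in the study are not reproduced here.

```python
def auc(scores, labels):
    """Probability that a randomly chosen death (label 1) has a higher
    predicted risk than a randomly chosen survivor (label 0).
    Tied scores count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The double loop is quadratic in the number of observations, which is acceptable for a demonstration; a rank-based formulation gives the same value more efficiently on large samples.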

    Predictors

    To transform raw EHR data into variables usable in a prediction model, we first pulled all data from the 1-year period ending the day before chemotherapy initiation (we did not drop patients based on absence of data during this period). Raw data were aggregated into 23 641 potential predictors in the following categories: demographics, prescribed medications, comorbidities and other grouped ICD-9 diagnoses, procedures,31 use of health care resources, vital signs, laboratory results, and terms derived from physician notes using natural language processing.31 For each potential predictor, we created the statistical summary of related EHR entries for 1 month (recent) and for 2 to 12 months (baseline) before chemotherapy initiation. This strategy is outlined in more detail elsewhere.35 We also included a variable indexing how many lines of chemotherapy the patient had in total before the current regimen. No data on the current regimen itself (eg, agent, intent) were used in the predictive model. We dropped variables missing in more than 99% of the derivation sample, leaving 5390 predictors in the model.
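
The recent and baseline windows can be illustrated with a short Python sketch. Several details are assumptions rather than facts from the study: the function name `window_summaries`, the use of the mean as the summary statistic, and the 30-day and 365-day boundaries standing in for "1 month" and "12 months".

```python
from datetime import date
from statistics import mean


def window_summaries(entries, start_date):
    """entries: list of (entry_date, value) for one variable in one patient.
    start_date: first day of the chemotherapy regimen.
    Returns the mean over the recent window (1-30 days before start) and
    the baseline window (31-365 days before start); None marks a window
    with no data, so the patient is kept rather than dropped."""
    recent, baseline = [], []
    for entry_date, value in entries:
        days_before = (start_date - entry_date).days
        if 1 <= days_before <= 30:
            recent.append(value)
        elif 31 <= days_before <= 365:
            baseline.append(value)
        # Entries on or after start_date, or older than 1 year, are ignored.
    return {
        "recent_mean": mean(recent) if recent else None,
        "baseline_mean": mean(baseline) if baseline else None,
    }
```

Excluding the day of initiation itself mirrors the statement that only data available before the first day of treatment were used.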

    Algorithm

    We used gradient-boosted trees, a linear combination of decision trees similar to those used to derive many clinical decision rules, to handle large sets of correlated predictors (R package: xgboost).36,37 We used 4-fold cross-validation in the development sample to choose model tuning parameters (eg, number of trees, variables per tree). The model was configured to produce individual-level probabilities of 30-day mortality. More details are available in eMethods in the Supplement.

    Missing Values

    Each split of each tree in the model (eg, a split on sex) had a default, which is the value (eg, male or female) that occurred more frequently in the training data. Observations with missing values for a given variable were assigned to the default side of the split. This process was effectively a split-specific, probabilistic imputation function that allowed us to avoid excluding observations that were missing data.
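
The default-side rule described above can be sketched in Python for a single numeric split. This follows the description in the text only; the names `fit_default_side` and `route` are hypothetical, and the rule the xgboost package itself uses to choose a default direction may differ from this simplification.

```python
from collections import Counter


def fit_default_side(training_values, threshold):
    """Return the side of the split (value < threshold goes left) that
    received more non-missing training observations."""
    sides = Counter(
        "left" if v < threshold else "right"
        for v in training_values
        if v is not None
    )
    return "left" if sides["left"] >= sides["right"] else "right"


def route(value, threshold, default_side):
    """Send one observation down the split; a missing value (None)
    follows the default side instead of being excluded."""
    if value is None:
        return default_side
    return "left" if value < threshold else "right"
```

Because the default is chosen separately at every split, the same missing variable can be routed differently in different trees, which is why the text calls the process split-specific.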

    Model Variance

    We decomposed model predictions into the linear contributions of individual variables. We calculated the (linear) sum of squares for individual variables included in the machine learning model and interpreted the residual sum of squares as the contribution of nonlinear terms and interactions used by the model. Because our model used more than 5000 predictors, we chose to report on only a small selection, specifically (1) those that most explained model variance and (2) those identified as predictors of mortality in prior studies.29,38,39 Details on our calculation of the model variance explained by individual predictors are in eMethods in the Supplement.
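
One way to read "variance explained linearly" is the share of variance in the model's predictions captured by a least-squares line on a single variable. The Python sketch below computes that quantity; it is an assumption about the approach, since the actual procedure is given in the eMethods, which are not reproduced here. The function name is hypothetical.

```python
def linear_variance_explained(x, predictions):
    """Share of the variance in model predictions captured by a
    one-variable least-squares line, which equals the squared Pearson
    correlation between the variable and the predictions."""
    n = len(x)
    mean_x = sum(x) / n
    mean_p = sum(predictions) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    spp = sum((b - mean_p) ** 2 for b in predictions)
    sxp = sum((a - mean_x) * (b - mean_p) for a, b in zip(x, predictions))
    if sxx == 0 or spp == 0:
        return 0.0  # a constant variable or constant prediction explains nothing
    return (sxp ** 2) / (sxx * spp)
```

Under this reading, a variable that the trees use only through thresholds or interactions can have a value near zero even when it matters to the model, so the quantity understates nonlinear contributions.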

    Results
    Study Population

    We identified 26 946 patients who initiated 51 774 discrete chemotherapy regimens from 2004 through 2014; 59.4% had distant-stage disease. Table 1 shows patient characteristics at the time of chemotherapy initiation. Mean patient age was 58.7 years (95% CI, 58.5-58.9 years), 61.1% were female (95% CI, 60.4%-61.9%), and 86.9% were white (95% CI, 86.4%-87.4%). The most common chemotherapy regimens (derivation and validation sets) were carboplatin and paclitaxel (n = 4042), gemcitabine hydrochloride (n = 2185), and albumin-bound paclitaxel (n = 1985); 3.3% of chemotherapy regimens in the validation set (n = 523) were chemotherapy regimens that first appeared in 2012 or later and thus did not appear in the derivation set. Experimental agents not approved by the US Food and Drug Administration constituted 2.2% (n = 343) of all chemotherapy regimens in the validation set.

    Model Performance

    Among the 9114 patients in the validation set, overall 30-day mortality was 2.1% (95% CI, 1.9%-2.4%). The most common primary cancers were breast (21.1%; 95% CI, 20.2%-21.9%), colorectal (19.3%; 95% CI, 18.5%-20.2%), and lung (18.0%; 95% CI, 17.2%-18.8%). The model accurately predicted 30-day mortality for all patients, irrespective of chemotherapy intent (AUC, 0.940; 95% CI, 0.930-0.951). In the subset of patients receiving palliative chemotherapy (46.6% of regimens; 95% CI, 45.8%-47.3%), 30-day mortality was 3.1% (95% CI, 2.7%-3.5%). Prognostic estimates are likely to be particularly important for these patients, and the model also performed well for this situation, with an AUC of 0.924 (95% CI, 0.910-0.939). To illustrate the clinical implications of this accuracy, we used model predictions to individually rank patients by 30-day mortality risk, a commonly used way of stratifying risk groups.33 Thirty-day mortality in the highest decile of predicted risk for palliative-intent chemotherapy was 22.6% (95% CI, 19.6%-25.6%), whereas in the lowest-risk decile, no patients died.
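
The decile summary reported above can be illustrated with a small Python sketch: rank observations by predicted risk, cut them into 10 groups, and compute observed mortality in each. The function name `mortality_by_decile` and the handling of group boundaries are assumptions; the study's exact binning is not described in the text.

```python
def mortality_by_decile(risks, died, n_groups=10):
    """Rank observations by predicted risk (highest first), cut them into
    n_groups nearly equal groups, and return observed mortality per group.
    Index 0 is the highest-risk group, matching the convention in Figure 1
    where decile 1 denotes the highest predicted risk."""
    order = sorted(range(len(risks)), key=lambda i: risks[i], reverse=True)
    n = len(order)
    rates = []
    for g in range(n_groups):
        members = order[g * n // n_groups:(g + 1) * n // n_groups]
        deaths = sum(died[i] for i in members)
        rates.append(deaths / len(members) if members else None)
    return rates
```

Comparing the first and last entries of the returned list gives the highest- vs lowest-decile contrast quoted in the text.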

    Figure 1 shows observed survival during the 180 days after palliative chemotherapy initiation by decile of model predictions (patients were followed up to 180 days). Overall 180-day mortality among all patients was 18.4% (95% CI, 17.8%-19.0%); for those initiating palliative chemotherapy, 180-day mortality was 27.9% (95% CI, 26.9%-28.9%). Model predictions on 30-day mortality were also accurate predictors of 180-day mortality (AUC, 0.827; 95% CI, 0.817-0.838); in the highest-risk decile, 180-day mortality was 74.8% (95% CI, 72.7%-77.0%) vs 0.2% (95% CI, 0.01%-0.4%) in the lowest-risk decile. Predictions were even more accurate for all patients, irrespective of chemotherapy intent (AUC, 0.870; 95% CI, 0.862-0.877); 180-day survival among these patients is shown in the eFigure in the Supplement.

    Table 2 shows model performance for predicting 30-day mortality in additional patient subgroups of interest. The model performed equally well across many kinds of primary cancers, demographic groups, and chemotherapy regimens. In distant-stage disease (mean 30-day mortality, 2.9%; 95% CI, 2.5%-3.2%), 30-day mortality in the highest-risk decile was 22.7% (95% CI, 19.9%-25.6%) vs 0 in the lowest decile (AUC, 0.924; 95% CI, 0.910-0.939). Predictions were accurate even for experimental clinical trial regimens first used from 2012 to 2014 (AUC, 0.942; 95% CI, 0.882-1.000); the derivation model was not exposed to these novel regimens in the training process.

    A key question is whether model predictions are accurate enough to be useful across a range of primary cancers, stages of disease, or lines of chemotherapy, which constitute scenarios for which prognoses vary widely. Table 2 thus also presents measures of overall predictive accuracy for first-line chemotherapy (AUC for 30-day mortality, 0.941 [95% CI, 0.925-0.956]; AUC for 180-day mortality, 0.865 [95% CI, 0.854-0.875]) compared with later lines of chemotherapy (AUC for 30-day mortality, 0.938 [95% CI, 0.924-0.952]; AUC for 180-day mortality, 0.864 [95% CI, 0.854-0.874]). eTable 1 in the Supplement presents extended results on model performance for 30- and 180-day mortality across lung, colorectal, breast, and prostate cancers by stage and line of chemotherapy.

    Comparisons With Other Prognostic Estimates

    We compared model performance with 2 external sources of mortality estimates, focusing on patients with distant-stage disease. First, we obtained mortality data from 4 RCTs of treatments for colorectal adenocarcinoma, non–small cell lung adenocarcinoma, small cell lung carcinoma, and squamous cell carcinoma of the head and neck.40-43 Figure 2A-D shows observed mortality for patients in our validation sample who started specific chemotherapy regimens for which trial data are available. (We chose to show 1-year mortality because this is the only time window reported consistently in RCTs.) We compared observed mortality with 2 sources of predictions: (1) RCT data (ie, mean 1-year mortality for patients receiving the relevant chemotherapy regimen) and (2) 1-year mortality risk estimates from our model; to generate these, we calculated 1-year mortality in the derivation set for patients in each quintile of model-predicted risk (we could not use raw model predictions because these were designed to predict 30-day mortality). The overall AUC for RCT estimates was 0.555 (95% CI, 0.513-0.598) compared with 0.771 (95% CI, 0.735-0.808) for model-based estimates for these same patients.
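    The conversion of 30-day risk scores into 1-year mortality estimates described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the handling of scores that fall exactly on a cutoff, and the toy data are all hypothetical. Quintile boundaries and per-quintile 1-year mortality are learned from the derivation set only, and a new patient is then assigned the observed rate of the quintile into which the patient's score falls.

```python
from bisect import bisect_right
from typing import List, Sequence


def quintile_cutoffs(scores: Sequence[float], n_groups: int = 5) -> List[float]:
    """Score thresholds that split the derivation set into n_groups
    roughly equal-sized groups (lowest risk first)."""
    s = sorted(scores)
    n = len(s)
    return [s[(k * n) // n_groups] for k in range(1, n_groups)]


def fit_group_rates(scores: Sequence[float], died_1y: Sequence[int],
                    cutoffs: Sequence[float]) -> List[float]:
    """Observed 1-year mortality in the derivation set for each risk group."""
    totals = [0] * (len(cutoffs) + 1)
    deaths = [0] * (len(cutoffs) + 1)
    for s, d in zip(scores, died_1y):
        g = bisect_right(cutoffs, s)  # index of the group containing s
        totals[g] += 1
        deaths[g] += d
    return [deaths[g] / totals[g] if totals[g] else float("nan")
            for g in range(len(totals))]


def predict_1y(score: float, cutoffs: Sequence[float],
               rates: Sequence[float]) -> float:
    """1-year mortality estimate for a new patient, read off the group rate."""
    return rates[bisect_right(cutoffs, score)]


# Toy derivation set (hypothetical): 30-day risk scores and 1-year outcomes.
deriv_scores = [0.00, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09]
deriv_died_1y = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]

cuts = quintile_cutoffs(deriv_scores)
group_rates = fit_group_rates(deriv_scores, deriv_died_1y, cuts)
print(cuts)
print(group_rates)
print(predict_1y(0.085, cuts, group_rates))
```

    Because the mapping is a step function of the 30-day score, it preserves the model's risk ordering across quintiles while re-expressing risk on the 1-year scale needed for comparison with trial data.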

    We also compared our model predictions of mortality with age-, sex-, race-, and cancer-specific mortality estimates from SEER, restricted to patients with advanced-stage cancers of the lung and bronchus, colon and rectum, breast, and prostate to maximize comparability in populations. Figure 2E-H shows that our model predictions (AUC, 0.810; 95% CI, 0.799-0.822) outperformed SEER estimates (AUC, 0.600; 95% CI, 0.585-0.615) for 1-year mortality. Further details on construction of RCT and SEER estimates are available in the eMethods and eTable 2 in the Supplement, and more detailed comparisons for subgroups are available in eTable 3 in the Supplement.

    Key Predictors

    Table 3 shows the distribution of key predictor variables used in the prediction model across risk deciles, as well as the proportion of model variance explained linearly by each variable. In general, key predictors of mortality identified in the literature were markedly different in the highest vs lowest model-predicted risk deciles; these predictors included summed comorbidity score,39 age,38 failure to thrive, heart rate, and certain laboratory data (eg, C-reactive protein level, white blood cell count, and alkaline phosphatase level).29 Of importance, no single variable explained more than 2% of model predictions in linear fashion. Most of the variation in the predictions (86.4%) was not a linear function of any single predictor, indicating that the tree-based model relied heavily on complex nonlinear functional forms and interactions among variables.
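    One way to read "variance explained linearly by each variable" is as the R² from a univariate linear regression of the model's predictions on a single predictor, which equals the squared Pearson correlation between the two. The sketch below illustrates that reading only; the source does not spell out the exact computation, and the function name `linear_r2` and the toy data are hypothetical.

```python
from typing import Sequence


def linear_r2(x: Sequence[float], y_hat: Sequence[float]) -> float:
    """Share of variance in model predictions y_hat that is a linear
    function of one predictor x (squared Pearson correlation)."""
    n = len(x)
    if n != len(y_hat) or n < 2:
        raise ValueError("x and y_hat must have equal length of at least 2")
    mx = sum(x) / n
    my = sum(y_hat) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y_hat)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y_hat))
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series explains (or has) no variance
    return (sxy * sxy) / (sxx * syy)


# Predictions that are exactly linear in x are fully explained.
print(linear_r2([1, 2, 3, 4], [2, 4, 6, 8]))

# Predictions that depend on x only through x**2 (symmetric around 0)
# have no linear component, although x determines them completely.
print(linear_r2([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4]))
```

    The second case shows why a low linear share does not mean a variable is unimportant: a tree-based model can use a predictor through thresholds and interactions that a straight line cannot capture.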

    Discussion

    A machine learning model based on single-center EHR data accurately estimated individual mortality risk in a cohort of patients with cancer at the time of chemotherapy initiation. The model performed well across a range of cancer types, race, sex, and other demographic variables. Mortality estimates were accurate for chemotherapy regimens with palliative and curative intent, for patients with early- and distant-stage cancer, and for patients treated with clinical trial regimens introduced in years after the model was trained. Our model outperformed estimates from RCTs and SEER data, both of which are routinely used by clinicians for quantitative risk predictions.

    This model was able to predict mortality with considerable accuracy despite lacking genetic sequencing data, cancer-specific biomarkers, or any detailed information about cancers beyond EHR data. This accuracy underscores the fact that common clinical data elements contained within an EHR (eg, symptoms, comorbidities, prescribed medications, and diagnostic tests) contain surprising amounts of signal for predicting key outcomes in patients with cancer.

    One clinically useful advantage of our algorithm is that it would not require manual input from clinicians. Current validated prognostic algorithms require considerable, often difficult input on the part of clinicians. For example, the palliative prognostic score relies on 6 weighted variables; some of these data elements, such as Karnofsky performance status, are not routinely available in the EHR and thus require manual input and calculation.21

    In contrast, our prognostic algorithm could pull directly from the EHR without manual input. Most inputs to our model are standard data elements in structured format in EHRs, including ICD-9 and procedure codes and medications. Although our algorithm was developed using a single institution’s data, its inputs are available nearly everywhere with an EHR. In addition, no special infrastructure is required to pull these data from an institution’s data warehouse; in the same way that today’s EHR systems pull a rich set of data from a database to present it to clinicians, an algorithm could pull and process the same data in real time using the processing power of a desktop computer. Although machine learning algorithms require significant computing infrastructure to construct, once derived, they can be applied using the minimal computing power already available on any hospital computer running an EHR or even on a smartphone. This low computational cost facilitates potential integration into existing clinical systems. Thus, we would not anticipate major technical barriers to applying this or similar algorithms to any organization’s clinical data or to independently validating their predictive power in a local sample. To this end, code for our algorithm is publicly available (eResults in the Supplement and http://labsysmed.org/wp-content/uploads/2017/02/ChemoMortalityAnalysis.rtf).

    Algorithmic predictions such as ours could be useful at several points along the care continuum. They could provide accurate predictions of mortality risk to a clinician or foster shared decision making between the patient and clinician. Short-term estimates of mortality could help clinicians identify patients unlikely to benefit from chemotherapy beyond 30 days and those who may benefit from early palliative care referral, advance care planning, and prompting to get financial and family affairs in order. For patients receiving systemic chemotherapy, an estimate of 30-day mortality risk may be a useful quality indicator of avoidable treatment-associated harm.44

    Limitations

    This study has several limitations. First, our model was built on data from patients treated with chemotherapy and is thus unlikely to be accurate for untreated patients. Second, our treated sample reflects the particular decisions around chemotherapy made by physicians and patients in our training data set. Patients who were eligible for chemotherapy but for some reason did not start it were not included, which could have biased the sample. However, the likely direction of this bias follows from the fact that prevailing treatment decisions are generally aggressive. In our sample, 62.4% of patients with distant-stage disease received chemotherapy, suggesting that physician recommendations and patient acceptance of those recommendations generally lead to initiation of treatment. This finding fits with a large body of evidence suggesting that physicians in a wide range of settings overestimate survival and overuse chemotherapy. Thus, to the extent that our data set has bias, it leads to the inclusion, not exclusion, of patients who otherwise might not have received chemotherapy. As a result, we believe that this bias did not substantially distort validity. If such an algorithm were deployed in a real-world setting, periodic retraining of the model (eg, each year or quarter) would ensure that model predictions reflected contemporaneous chemotherapy decision making. This process would address changing selection into treatment over time and update the model to reflect broader changes in patient populations and chemotherapy technology.

    The 2004-2011 derivation set and the 2012-2014 validation set differed significantly in several respects, including age at initiation, race, primary cancer, and prior chemotherapy beyond the first-line treatment. Such differences between derivation and validation sets are expected and intentional: a validation set drawn from later years of data was chosen to reflect the constant evolution of cancer epidemiology and treatment. This choice made the prediction task more difficult because algorithms trained on past data cannot always perform well in the future.45 However, changes in referral patterns, chemotherapy, and diagnosis patterns are just some of the difficulties associated with algorithms in evolving real-world settings. We are reassured that performance was good despite these and other secular trends.

    Although we quantified predictive accuracy in an independent, recent validation set, the only way to truly validate such a model is prospectively. A model trained on pre-2012 data may lose accuracy as novel tumor diagnostics and therapies arise, although the accuracy of predictions for patients starting novel chemotherapies was encouraging in this regard. In addition, this study included data from a single institution. Further validation is required using cohorts from different institutions. Electronic health record data contain a multitude of biases introduced by physician behavior, institutional idiosyncrasies, and software platforms, among other limitations. These limitations can significantly affect the adaptability and relevance of our prediction model to different care settings.

    Conclusions

    Our machine learning model accurately predicted mortality risk in patients at the time of chemotherapy initiation. Although we are optimistic that accurate prognostic tools such as this could help to promote value-driven oncology care, the ideal next step would be an RCT of algorithmic estimates at the point of care. To be useful, predictive models must improve decision making in the real world. Thus, rigorous evaluation of predictions’ influence on outcomes is the criterion standard test but one that is often neglected in the literature, which focuses primarily on measuring predictive accuracy rather than real outcomes.

    Article Information

    Accepted for Publication: April 28, 2018.

    Published: July 27, 2018. doi:10.1001/jamanetworkopen.2018.0926

    Open Access: This is an open access article distributed under the terms of the CC-BY License. © 2018 Elfiky AA et al. JAMA Network Open.

    Corresponding Author: Ziad Obermeyer, MD, MPhil, Brigham and Women’s Hospital, 75 Francis St, Neville House, Boston, MA 02115 (zobermeyer@bwh.harvard.edu).

    Author Contributions: Drs Elfiky and Obermeyer had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.

    Concept and design: Elfiky, Pany, Obermeyer.

    Acquisition, analysis, or interpretation of data: All authors.

    Drafting of the manuscript: All authors.

    Critical revision of the manuscript for important intellectual content: All authors.

    Statistical analysis: All authors.

    Obtained funding: Elfiky, Obermeyer.

    Administrative, technical, or material support: Parikh.

    Supervision: Elfiky, Obermeyer.

    Conflict of Interest Disclosures: Dr Parikh reported personal fees from GNS Healthcare outside the submitted work. No other disclosures were reported.

    Funding/Support: This study was supported by grants DP5OD012161 from the Office of the Director and R56AG055728 from the National Institute on Aging (Dr Obermeyer), training grant T32 AG51108 from the National Institute on Aging (Mr Pany), and a grant from the Dana-Farber Cancer Institute (Dr Elfiky).

    Role of the Funder/Sponsor: The funding sources had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

    References
    1. Emanuel EJ, Young-Xu Y, Levinsky NG, Gazelle G, Saynina O, Ash AS. Chemotherapy use among Medicare beneficiaries at the end of life. Ann Intern Med. 2003;138(8):639-643. doi:10.7326/0003-4819-138-8-200304150-00011
    2. Earle CC, Neville BA, Landrum MB, Ayanian JZ, Block SD, Weeks JC. Trends in the aggressiveness of cancer care near the end of life. J Clin Oncol. 2004;22(2):315-321. doi:10.1200/JCO.2004.08.136
    3. Earle CC, Landrum MB, Souza JM, Neville BA, Weeks JC, Ayanian JZ. Aggressiveness of cancer care near the end of life: is it a quality-of-care issue? J Clin Oncol. 2008;26(23):3860-3866. doi:10.1200/JCO.2007.15.8253
    4. Saito AM, Landrum MB, Neville BA, Ayanian JZ, Earle CC. The effect on survival of continuing chemotherapy to near death. BMC Palliat Care. 2011;10:14. doi:10.1186/1472-684X-10-14
    5. Prigerson HG, Bao Y, Shah MA, et al. Chemotherapy use, performance status, and quality of life at the end of life. JAMA Oncol. 2015;1(6):778-784. doi:10.1001/jamaoncol.2015.2378
    6. Schnipper LE, Smith TJ, Raghavan D, et al. American Society of Clinical Oncology identifies five key opportunities to improve care and reduce costs: the top five list for oncology. J Clin Oncol. 2012;30(14):1715-1724. doi:10.1200/JCO.2012.42.8375
    7. National Quality Forum. Cancer measures. https://www.qualityforum.org/News_And_Resources/Press_Releases/2012/NQF_Endorses_Cancer_Measures.aspx. August 10, 2012. Accessed July 1, 2017.
    8. Greer JA, Pirl WF, Jackson VA, et al. Effect of early palliative care on chemotherapy use and end-of-life care in patients with metastatic non–small-cell lung cancer. J Clin Oncol. 2012;30(4):394-400. doi:10.1200/JCO.2011.35.7996
    9. Satariano WA, Ragland DR. The effect of comorbidity on 3-year survival of women with primary breast cancer. Ann Intern Med. 1994;120(2):104-110. doi:10.7326/0003-4819-120-2-199401150-00002
    10. Hall WH, Jani AB, Ryu JK, Narayan S, Vijayakumar S. The impact of age and comorbidity on survival outcomes and treatment patterns in prostate cancer. Prostate Cancer Prostatic Dis. 2005;8(1):22-30. doi:10.1038/sj.pcan.4500772
    11. Lee L, Cheung WY, Atkinson E, Krzyzanowska MK. Impact of comorbidity on chemotherapy use and outcomes in solid tumors: a systematic review. J Clin Oncol. 2011;29(1):106-117. doi:10.1200/JCO.2010.31.3049
    12. van Gestel YRBM, Lemmens VEPP, de Hingh IHJT, et al. Influence of comorbidity and age on 1-, 2-, and 3-month postoperative mortality rates in gastrointestinal cancer patients. Ann Surg Oncol. 2013;20(2):371-380. doi:10.1245/s10434-012-2663-1
    13. Sarfati D, Koczwara B, Jackson C. The impact of comorbidity on cancer and its treatment. CA Cancer J Clin. 2016;66(4):337-350. doi:10.3322/caac.21342
    14. de Glas NA, van de Water W, Engelhardt EG, et al. Validity of Adjuvant! online program in older patients with breast cancer: a population-based study. Lancet Oncol. 2014;15(7):722-729. doi:10.1016/S1470-2045(14)70200-1
    15. Hoffmann TC, Del Mar C. Clinicians’ expectations of the benefits and harms of treatments, screening, and tests: a systematic review. JAMA Intern Med. 2017;177(3):407-419. doi:10.1001/jamainternmed.2016.8254
    16. Weeks JC, Cook EF, O’Day SJ, et al. Relationship between cancer patients’ predictions of prognosis and their treatment preferences. JAMA. 1998;279(21):1709-1714. doi:10.1001/jama.279.21.1709
    17. Rose JH, O’Toole EE, Dawson NV, et al. Perspectives, preferences, care practices, and outcomes among older and middle-aged patients with late-stage cancer. J Clin Oncol. 2004;22(24):4907-4917. doi:10.1200/JCO.2004.06.050
    18. Weeks JC, Catalano PJ, Cronin A, et al. Patients’ expectations about effects of chemotherapy for advanced cancer. N Engl J Med. 2012;367(17):1616-1625. doi:10.1056/NEJMoa1204410
    19. Temel JS, Greer JA, Admane S, et al. Longitudinal perceptions of prognosis and goals of therapy in patients with metastatic non–small-cell lung cancer: results of a randomized study of early palliative care. J Clin Oncol. 2011;29(17):2319-2326. doi:10.1200/JCO.2010.32.4459
    20. Glare P, Virik K, Jones M, et al. A systematic review of physicians’ survival predictions in terminally ill cancer patients. BMJ. 2003;327(7408):195-198. doi:10.1136/bmj.327.7408.195
    21. Stone PC, Lund S. Predicting prognosis in patients with advanced cancer. Ann Oncol. 2007;18(6):971-976. doi:10.1093/annonc/mdl343
    22. Brundage MD, Davidson JR, Mackillop WJ. Trading treatment toxicity for survival in locally advanced non–small cell lung cancer. J Clin Oncol. 1997;15(1):330-340. doi:10.1200/JCO.1997.15.1.330
    23. Silvestri G, Pritchard R, Welch HG. Preferences for chemotherapy in patients with advanced non–small cell lung cancer: descriptive study based on scripted interviews. BMJ. 1998;317(7161):771-775. doi:10.1136/bmj.317.7161.771
    24. Hirose T, Yamaoka T, Ohnishi T, et al. Patient willingness to undergo chemotherapy and thoracic radiotherapy for locally advanced non–small cell lung cancer. Psychooncology. 2009;18(5):483-489. doi:10.1002/pon.1450
    25. Keating NL, Landrum MB, Rogers SO Jr, et al. Physician factors associated with discussions about end-of-life care. Cancer. 2010;116(4):998-1006. doi:10.1002/cncr.24761
    26. Keating NL, Beth Landrum M, Arora NK, et al. Cancer patients’ roles in treatment decisions: do characteristics of the decision influence roles? J Clin Oncol. 2010;28(28):4364-4370. doi:10.1200/JCO.2009.26.8870
    27. Liu P-H, Landrum MB, Weeks JC, et al. Physicians’ propensity to discuss prognosis is associated with patients’ awareness of prognosis for metastatic cancers. J Palliat Med. 2014;17(6):673-682. doi:10.1089/jpm.2013.0460
    28. National Cancer Institute Surveillance, Epidemiology, and End Results Program. Cancer statistics. https://seer.cancer.gov/statistics. Accessed July 1, 2017.
    29. Gwilliam B, Keeley V, Todd C, et al. Development of prognosis in palliative care study (PiPS) predictor models to improve prognostication in advanced cancer: prospective cohort study. BMJ. 2011;343:d4920. doi:10.1136/bmj.d4920
    30. Rumball-Smith J, Shekelle PG, Bates DW. Using the electronic health record to understand and minimize overuse. JAMA. 2017;317(3):257-258. doi:10.1001/jama.2016.18609
    31. Healthcare Cost and Utilization Project (HCUP). Clinical Classifications Software for Services and Procedures. https://www.hcup-us.ahrq.gov/toolssoftware/ccs_svcsproc/ccssvcproc.jsp. Accessed January 19, 2017.
    32. Cooper GS, Yuan Z, Stange KC, Amini SB, Dennis LK, Rimm AA. The utility of Medicare claims data for measuring cancer stage. Med Care. 1999;37(7):706-711. doi:10.1097/00005650-199907000-00010
    33. Meurer WJ, Tolles J. Logistic regression diagnostics: understanding how well a model predicts outcomes. JAMA. 2017;317(10):1068-1069. doi:10.1001/jama.2016.20441
    34. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837-845. doi:10.2307/2531595
    35. Makar M, Ghassemi M, Cutler DM, Obermeyer Z. Short-term mortality prediction for elderly patients using Medicare claims data. Int J Mach Learn Comput. 2015;5(3):192-197. doi:10.7763/IJMLC.2015.V5.506
    36. Goldman L, Weinberg M, Weisberg M, et al. A computer-derived protocol to aid in the diagnosis of emergency room patients with acute chest pain. N Engl J Med. 1982;307(10):588-596. doi:10.1056/NEJM198209023071004
    37. Chen T, He T. xgboost: eXtreme Gradient Boosting. https://cran.r-project.org/web/packages/xgboost/vignettes/xgboost.pdf. May 15, 2018. Accessed July 1, 2017.
    38. Shouval R, Labopin M, Bondi O, et al. Prediction of allogeneic hematopoietic stem-cell transplantation mortality 100 days after transplantation using a machine learning algorithm: a European Group for Blood and Marrow Transplantation Acute Leukemia Working Party Retrospective Data Mining Study. J Clin Oncol. 2015;33(28):3144-3151. doi:10.1200/JCO.2014.59.1339
    39. Gagne JJ, Glynn RJ, Avorn J, Levin R, Schneeweiss S. A combined comorbidity score predicted mortality in elderly patients better than existing scores. J Clin Epidemiol. 2011;64(7):749-759. doi:10.1016/j.jclinepi.2010.10.004
    40. Saltz LB, Clarke S, Díaz-Rubio E, et al. Bevacizumab in combination with oxaliplatin-based chemotherapy as first-line therapy in metastatic colorectal cancer: a randomized phase III study. J Clin Oncol. 2008;26(12):2013-2019. doi:10.1200/JCO.2007.14.9930
    41. Zukin M, Barrios CH, Pereira JR, et al. Randomized phase III trial of single-agent pemetrexed versus carboplatin and pemetrexed in patients with advanced non–small-cell lung cancer and Eastern Cooperative Oncology Group performance status of 2. J Clin Oncol. 2013;31(23):2849-2853. doi:10.1200/JCO.2012.48.1911
    42. Noda K, Nishiwaki Y, Kawahara M, et al; Japan Clinical Oncology Group. Irinotecan plus cisplatin compared with etoposide plus cisplatin for extensive small-cell lung cancer. N Engl J Med. 2002;346(2):85-91. doi:10.1056/NEJMoa003034
    43. Clark JI, Hofmeister C, Choudhury A, et al. Phase II evaluation of paclitaxel in combination with carboplatin in advanced head and neck carcinoma. Cancer. 2001;92(9):2334-2340. doi:10.1002/1097-0142(20011101)92:9<2334::AID-CNCR1580>3.0.CO;2-3
    44. Wallington M, Saxon EB, Bomb M, et al. 30-day mortality after systemic anticancer treatment for breast and lung cancer in England: a population-based, observational study. Lancet Oncol. 2016;17(9):1203-1216. doi:10.1016/S1470-2045(16)30383-7
    45. Moreno-Torres JG, Raeder T, Alaiz-Rodríguez R, Chawla NV, Herrera F. A unifying view on dataset shift in classification. Pattern Recognit. 2012;45(1):521-530. doi:10.1016/j.patcog.2011.06.019