Key Points
Question
Can a machine learning algorithm applied to electronic health record data predict patients’ short-term risk of death at the time that they begin chemotherapy?
Findings
In this cohort study of 26 946 patients with cancer starting 51 774 discrete chemotherapy regimens, those at high risk of 30-day mortality were accurately identified across palliative and curative chemotherapy regimens and many types and stages of cancer. The algorithm was more accurate than predictions based on randomized clinical trials or population-based registry data.
Meaning
A machine learning algorithm accurately identified individuals at high risk of short-term mortality and may help to guide patient and physician decisions about chemotherapy initiation and advance care planning.
Importance
Patients with cancer who die soon after starting chemotherapy incur costs of treatment without the benefits. Accurately predicting mortality risk before administering chemotherapy is important, but few patient data–driven tools exist.
Objective
To create and validate a machine learning model that predicts mortality in a general oncology cohort starting new chemotherapy, using only data available before the first day of treatment.
Design, Setting, and Participants
This retrospective cohort study of patients at a large academic cancer center from January 1, 2004, through December 31, 2014, determined date of death by linkage to Social Security data. The model was derived using data from 2004 through 2011, and performance was measured on nonoverlapping data from 2012 through 2014. The analysis was conducted from June 1 through August 1, 2017. Participants included 26 946 patients starting 51 774 new chemotherapy regimens.
Main Outcomes and Measures
Thirty-day mortality from the first day of a new chemotherapy regimen. Secondary outcomes included model discrimination by predicted mortality risk decile among patients receiving palliative chemotherapy, and 180-day mortality from the first day of a new chemotherapy regimen.
Results
Among the 26 946 patients included in the analysis, mean age was 58.7 years (95% CI, 58.5-58.9 years); 61.1% were female (95% CI, 60.4%-61.9%); and 86.9% were white (95% CI, 86.4%-87.4%). Thirty-day mortality from chemotherapy start was 2.1% (95% CI, 1.9%-2.4%). Among the 9114 patients in the validation set, the most common primary cancers were breast (21.1%; 95% CI, 20.2%-21.9%), colorectal (19.3%; 95% CI, 18.5%-20.2%), and lung (18.0%; 95% CI, 17.2%-18.8%). Model predictions were accurate for all patients (area under the curve [AUC], 0.940; 95% CI, 0.930-0.951). Predictions for patients starting palliative chemotherapy (46.6% of regimens; 95% CI, 45.8%-47.3%), for whom prognosis is particularly important, remained highly accurate (AUC, 0.924; 95% CI, 0.910-0.939). To illustrate model discrimination, patients initiating palliative chemotherapy were ranked by model-predicted mortality risk, and observed mortality was calculated by risk decile. Thirty-day mortality in the highest-risk decile was 22.6% (95% CI, 19.6%-25.6%); in the lowest-risk decile, no patients died. Predictions remained accurate across all primary cancers, stages, and chemotherapies, even for clinical trial regimens that first appeared in years after the model was trained (AUC, 0.942; 95% CI, 0.882-1.000). The same model also performed well for prediction of 180-day mortality (AUC for all patients, 0.870 [95% CI, 0.862-0.877]; highest- vs lowest-risk decile mortality, 74.8% [95% CI, 72.7%-77.0%] vs 0.2% [95% CI, 0.01%-0.4%]). Predictions were more accurate than estimates from randomized clinical trials of individual chemotherapies or the Surveillance, Epidemiology, and End Results data set.
Conclusions and Relevance
A machine learning algorithm using electronic health record data accurately predicted short-term mortality among patients starting chemotherapy. Further research is necessary to determine the generalizability and feasibility of applying this algorithm in clinical settings.
Chemotherapy lowers the risk of recurrence in early-stage cancers and can improve survival and symptoms in later-stage disease. Balancing these benefits against chemotherapy’s considerable risks is challenging. Increasing evidence suggests that chemotherapy is too often started too late in the cancer disease trajectory,1-4 and many patients die soon after initiating treatment. These patients experience burdensome symptoms without many of the potential benefits of chemotherapy.5 National organizations now track the proportion of patients who die within 2 weeks of receiving chemotherapy as a marker of poor quality of care,6,7 and this number has been increasing rapidly.1,8
A key factor underlying these trends is the difficulty of accurately identifying the risk of serious adverse events, especially death, before initiating chemotherapy. Adverse effects of chemotherapy are variable, and the influence of comorbidities is complex; thus, the risk calculus of administering chemotherapy is challenging.9-13 Cognitive biases also lead to underestimation of the risk of death,14,15 particularly in patients with metastatic cancer,16,17 who often believe that their disease is curable.18,19 Physicians do not accurately estimate prognosis in patients with cancer,20,21 and overly optimistic estimates can influence patients’ chemotherapy decisions.22-27
To estimate mortality before initiation of chemotherapy, physicians may reference randomized clinical trial (RCT) data for individual regimens or population-level data such as the Surveillance, Epidemiology, and End Results (SEER) data set to obtain mortality risk by age, sex, and primary cancer.14,28 Although informative, these tools provide mortality estimates for broad populations of patients and often do not accurately estimate a specific individual’s mortality. Individualized decision support tools exist29 but require a substantial investment of time and resources; these tools require clinicians to collect and enter data not readily available in existing records, which limits the number of variables that can be used and adds complexity to workflows.
There is considerable enthusiasm for the role of advanced algorithms to improve prediction; just as modern electronic health records (EHRs) pull complex data for clinicians to use in real time, algorithms could pull and process these data in parallel, presenting accurate probability forecasts to clinicians and patients.30 However, little evidence suggests that such algorithms can provide meaningful inputs to clinical decision making in cancer or elsewhere.
New chemotherapy is a critical event in the disease trajectory of cancer, and objective predictions of short-term mortality at this time could be useful to physicians and patients in several ways. Accurate forecasts of the risks of mortality and adverse events could inform discussions of risks and benefits of chemotherapy, particularly for patients undergoing palliative chemotherapy, and could help guide important decisions regarding advance care planning and palliative care consultation. In this study, we developed and applied a machine learning algorithm to predict near-term mortality risk in a large cohort of patients with cancer starting new chemotherapy regimens.
We obtained EHR data for all patients receiving chemotherapy at the Dana-Farber/Brigham and Women’s Cancer Center (DF/BWCC), Boston, Massachusetts, from January 1, 2004, through December 31, 2014. We determined date of death by linking to the Social Security Administration’s Death Master File. We classified patients by primary cancer and presence of distant-stage disease, determined using registry data (for patients diagnosed at DF/BWCC) and International Classification of Diseases, Ninth Revision (ICD-9) codes for metastases (for patients not diagnosed at DF/BWCC or who did not have registry data and to identify progression to distant-stage disease in those previously diagnosed at DF/BWCC).31 Although diagnosis codes have limitations for determination of cancer stage, they are generally believed to provide reliable identification of the presence or the absence of distant-stage disease.32 Our study followed the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) checklist for prediction model development and validation (eMethods in the Supplement). The institutional review boards of Dana-Farber Cancer Institute and Partners HealthCare, Boston, approved this study and granted a waiver of informed consent from study participants.
Data were analyzed from June 1 through August 1, 2017. Our primary outcome was death within 30 days of starting new systemic chemotherapy regimens. Secondary outcomes were 30-day mortality in prespecified subgroups of interest (described later) and overall 180-day mortality. We constructed our data set at the patient–chemotherapy regimen level, such that each regimen was a new observation.
Machine learning models have the potential to overfit or produce overly optimistic estimates of model performance based on spurious correlations in development data. We thus report results only in an independent validation set, which played no role in model development; as such, overfitting would only lead to poorer model performance in the validation set. Specifically, we used data from 2004 through 2011 for model derivation and data from 2012 through 2014 for model validation. Because our data set was constructed at the patient–chemotherapy regimen level, observations describing different chemotherapy regimens in the same patient are not independent. For patients whose observations appeared before and after January 1, 2012, we randomly assigned all observations from a given patient to the derivation or the validation set, so that no patient appeared in both sets.
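The patient-level assignment described above can be sketched in a few lines of Python. This is a minimal illustration, not the study code: the tuple layout, the function name split_by_patient, and the equal-probability random choice for patients who straddle the cutoff are assumptions made for the example.

```python
import random
from datetime import date

CUTOFF = date(2012, 1, 1)  # derivation before this date, validation on or after

def split_by_patient(regimens, seed=0):
    """Assign every regimen of a patient to exactly one set.

    regimens: list of (patient_id, start_date) tuples (hypothetical layout).
    Returns a dict mapping patient_id to "derivation" or "validation".
    """
    rng = random.Random(seed)
    starts = {}
    for patient_id, start in regimens:
        starts.setdefault(patient_id, []).append(start)

    assignment = {}
    for patient_id in sorted(starts):
        dates = starts[patient_id]
        if all(d < CUTOFF for d in dates):
            assignment[patient_id] = "derivation"
        elif all(d >= CUTOFF for d in dates):
            assignment[patient_id] = "validation"
        else:
            # Regimens on both sides of the cutoff: choose one set at random
            # (equal odds are an assumption) and keep all regimens together.
            assignment[patient_id] = rng.choice(["derivation", "validation"])
    return assignment

regimens = [
    ("A", date(2009, 3, 1)),
    ("B", date(2013, 6, 15)),
    ("C", date(2011, 11, 20)),
    ("C", date(2012, 4, 2)),
]
assignment = split_by_patient(regimens)
```

Because the assignment is keyed on the patient rather than the regimen, correlated observations from one patient can never leak across the derivation and validation sets.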
Our primary measure of model performance was the area under the receiver operating characteristic curve (AUC),33 which we calculated by comparing the mortality probability estimate from the machine learning model with observed mortality. We calculated 95% CIs of the AUC following the method of DeLong et al.34 We report AUC overall and in subgroups of clinical interest, notably age, sex, race/ethnicity, distant-stage disease, individual primary cancers, chemotherapy lines and regimens, and chemotherapy intent (palliative vs curative, identified by the treating physician and recorded as an EHR flag). To benchmark against existing prognostic models, we obtained 1-year mortality estimates from large RCTs of specific chemotherapy regimens and from the SEER program for available subgroups of patients. To give more clinically relevant metrics of predictive accuracy, we also present mortality rates in given deciles of model-predicted risk, typically highest and lowest. When presenting variable summary statistics, we report CIs for means and proportions and the first and third quartiles for medians.
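For readers less familiar with the AUC, it can be read as the probability that a randomly chosen patient who died received a higher predicted risk than a randomly chosen survivor. The following Python sketch computes it directly from that definition on made-up numbers; it does not implement the DeLong confidence interval, and the variable names are illustrative only.

```python
def auc(scores, died):
    """AUC as P(score of a death > score of a survivor); ties count one-half."""
    deaths = [s for s, d in zip(scores, died) if d]
    survivors = [s for s, d in zip(scores, died) if not d]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for p in deaths:
        for n in survivors:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(deaths) * len(survivors))

# Toy predicted risks and observed 30-day outcomes (1 = died); not study data.
scores = [0.02, 0.10, 0.35, 0.80, 0.05, 0.60]
died = [0, 0, 1, 1, 0, 0]
example_auc = auc(scores, died)  # 7 of 8 death-survivor pairs are ordered correctly
```

The pairwise loop is quadratic in sample size, which is fine for illustration; a rank-based formula gives the same value more efficiently on large cohorts.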
To transform raw EHR data into variables usable in a prediction model, we first pulled all data from the 1-year period ending the day before chemotherapy initiation (we did not drop patients based on absence of data during this period). Raw data were aggregated into 23 641 potential predictors in the following categories: demographics, prescribed medications, comorbidities and other grouped ICD-9 diagnoses, procedures,31 use of health care resources, vital signs, laboratory results, and terms derived from physician notes using natural language processing.31 For each potential predictor, we created the statistical summary of related EHR entries for 1 month (recent) and for 2 to 12 months (baseline) before chemotherapy initiation. This strategy is outlined in more detail elsewhere.35 We also included a variable indexing how many lines of chemotherapy the patient had in total before the current regimen. No data on the current regimen itself (eg, agent, intent) were used in the predictive model. We dropped variables missing in more than 99% of the derivation sample, leaving 5390 predictors in the model.
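The windowing and sparsity filter described above might look like the following Python sketch. Several details are assumptions made for illustration: the summary statistic is taken to be a mean, a month is treated as 30 days, and the function names window_means and drop_sparse are hypothetical.

```python
from datetime import date

def _mean(values):
    return sum(values) / len(values) if values else None

def window_means(entries, start_date):
    """Summarize one predictor into recent and baseline means.

    entries: list of (entry_date, value) for one patient and one predictor.
    Recent covers the 30 days ending the day before start_date; baseline
    covers days 31 to 365 before it. None marks a window with no data.
    """
    recent, baseline = [], []
    for entry_date, value in entries:
        days_before = (start_date - entry_date).days
        if 1 <= days_before <= 30:
            recent.append(value)
        elif 31 <= days_before <= 365:
            baseline.append(value)
    return _mean(recent), _mean(baseline)

def drop_sparse(columns, max_missing=0.99):
    """Keep predictors missing in at most max_missing of observations."""
    kept = {}
    for name, values in columns.items():
        missing = sum(v is None for v in values) / len(values)
        if missing <= max_missing:
            kept[name] = values
    return kept

# Toy heart-rate entries for one patient starting chemotherapy on 2013-05-01.
entries = [
    (date(2013, 4, 20), 90),
    (date(2013, 4, 10), 110),
    (date(2012, 12, 1), 80),
    (date(2011, 1, 1), 70),  # older than 1 year, ignored
]
summary = window_means(entries, date(2013, 5, 1))
kept = drop_sparse({"a": [None] * 100, "b": [1, 2] + [None] * 98})
```

Note that entries dated on or after the start date are excluded, consistent with using only data available before the first day of treatment.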
We used gradient-boosted trees, a linear combination of decision trees similar to those used to derive many clinical decision rules, to handle large sets of correlated predictors (R package: xgboost).36,37 We used 4-fold cross-validation in the development sample to choose model tuning parameters (eg, number of trees, variables per tree). The model was configured to produce individual-level probabilities of 30-day mortality. More details are available in eMethods in the Supplement.
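To make the idea of a boosted sum of trees concrete, the following Python sketch fits one-split trees (stumps) in sequence, each to the log-loss gradient left by the previous ones. It is a deliberately simplified stand-in and not the xgboost implementation used in the study: it omits regularization, second-order terms, column sampling, and cross-validation, and the defaults n_trees=20 and learning_rate=0.5 are arbitrary choices for the toy data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(rows, residuals):
    """Find the one-split tree (feature, threshold) that best fits the residuals."""
    best = None
    for j in range(len(rows[0])):
        for threshold in sorted({row[j] for row in rows}):
            left = [r for row, r in zip(rows, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(rows, residuals) if row[j] > threshold]
            if not left or not right:
                continue
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            sse = (sum((r - left_mean) ** 2 for r in left)
                   + sum((r - right_mean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, threshold, left_mean, right_mean)
    return None if best is None else best[1:]

def fit_boosted(rows, died, n_trees=20, learning_rate=0.5):
    """Boost stumps on the log-loss gradient (needs both outcomes present)."""
    rate = sum(died) / len(died)
    base = math.log(rate / (1.0 - rate))  # starting log-odds of death
    scores = [base] * len(rows)
    stumps = []
    for _ in range(n_trees):
        # Negative gradient of log loss: observed outcome minus current probability.
        residuals = [y - sigmoid(s) for y, s in zip(died, scores)]
        stump = fit_stump(rows, residuals)
        if stump is None:
            break
        j, threshold, left_value, right_value = stump
        stumps.append(stump)
        scores = [s + learning_rate * (left_value if row[j] <= threshold else right_value)
                  for row, s in zip(rows, scores)]
    return base, learning_rate, stumps

def predict_probability(model, row):
    """Sum the scaled stump outputs on the log-odds scale, then convert to a probability."""
    base, learning_rate, stumps = model
    score = base
    for j, threshold, left_value, right_value in stumps:
        score += learning_rate * (left_value if row[j] <= threshold else right_value)
    return sigmoid(score)

# Toy data: one predictor, deaths occur only at high values.
rows = [[1.0], [2.0], [3.0], [7.0], [8.0], [9.0]]
died = [0, 0, 0, 1, 1, 1]
model = fit_boosted(rows, died)
```

The final prediction is a sum of tree outputs on the log-odds scale, which is what "linear combination of decision trees" refers to; the tuning parameters chosen by cross-validation in the study play the role of n_trees and related settings here.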
Each split of each tree in the model (eg, a split on sex) had a default, which is the value (eg, male or female) that occurred more frequently in the training data. Observations with missing values for a given variable were assigned to the default side of the split. This process was effectively a split-specific, probabilistic imputation function that allowed us to avoid excluding observations that were missing data.
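A small Python sketch of this routing rule follows. It mirrors the description in the text rather than the internals of any particular library, and the function names majority_default and route are hypothetical.

```python
def majority_default(training_values, threshold):
    """Return the side ('left' or 'right') taken by most observed training values."""
    observed = [v for v in training_values if v is not None]
    left = sum(v <= threshold for v in observed)
    return "left" if left >= len(observed) - left else "right"

def route(value, threshold, default_side):
    """Send an observation down a split; missing values follow the default side."""
    if value is None:
        return default_side
    return "left" if value <= threshold else "right"

# Toy training values for one predictor with one missing entry.
training_values = [1, 2, 3, 10, None]
default_side = majority_default(training_values, threshold=5)
```

Because the default is stored per split, the same missing predictor can be routed differently at different points in the model, which is why the text calls the scheme split-specific.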
We decomposed model predictions into the linear contributions of individual variables. We calculated the (linear) sum of squares for individual variables included in the machine learning model and interpreted the residual sum of squares as the contribution of nonlinear terms and interactions used by the model. Because our model used more than 5000 predictors, we chose to report on only a small selection, specifically (1) those that most explained model variance and (2) those identified as predictors of mortality in prior studies.29,38,39 Details on our calculation of the model variance explained by individual predictors are in eMethods in the Supplement.
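One simple way to picture a linear contribution is the share of variance in the model's predictions that a straight-line fit on a single variable can reproduce (the squared correlation). The Python sketch below computes that quantity; it is an assumption-laden illustration, and the exact calculation used in the study is the one described in the eMethods.

```python
def linear_share(x, predictions):
    """Fraction of prediction variance explained by a straight-line fit on x (R squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_p = sum(predictions) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    spp = sum((p - mean_p) ** 2 for p in predictions)
    sxp = sum((a - mean_x) * (p - mean_p) for a, p in zip(x, predictions))
    if sxx == 0 or spp == 0:
        return 0.0  # a constant variable or constant predictions explain nothing
    return (sxp * sxp) / (sxx * spp)

# Predictions that rise in a straight line with x are fully explained.
straight = linear_share([1, 2, 3, 4], [0.1, 0.2, 0.3, 0.4])
# A U-shaped dependence has no linear component even though x drives it entirely.
u_shaped = linear_share([-1, 0, 1], [1, 0, 1])
```

The second case shows why a variable can matter a great deal to a tree-based model while explaining little variance linearly: nonlinear effects and interactions fall into the residual.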
We identified 26 946 patients who initiated 51 774 discrete chemotherapy regimens from 2004 through 2014; 59.4% had distant-stage disease. Table 1 shows patient characteristics at the time of chemotherapy initiation. Mean patient age was 58.7 years (95% CI, 58.5-58.9 years), 61.1% were female (95% CI, 60.4%-61.9%), and 86.9% were white (95% CI, 86.4%-87.4%). The most common chemotherapy regimens (derivation and validation sets) were carboplatin and paclitaxel (n = 4042), gemcitabine hydrochloride (n = 2185), and albumin-bound paclitaxel (n = 1985); 3.3% of chemotherapy regimens in the validation set (n = 523) were chemotherapy regimens that first appeared in 2012 or later and thus did not appear in the derivation set. Experimental agents not approved by the US Food and Drug Administration constituted 2.2% (n = 343) of all chemotherapy regimens in the validation set.
Among the 9114 patients in the validation set, overall 30-day mortality was 2.1% (95% CI, 1.9%-2.4%). The most common primary cancers were breast (21.1%; 95% CI, 20.2%-21.9%), colorectal (19.3%; 95% CI, 18.5%-20.2%), and lung (18.0%; 95% CI, 17.2%-18.8%). The model accurately predicted 30-day mortality for all patients, irrespective of chemotherapy intent (AUC, 0.940; 95% CI, 0.930-0.951). In the subset of patients receiving palliative chemotherapy (46.6% of regimens; 95% CI, 45.8%-47.3%), 30-day mortality was 3.1% (95% CI, 2.7%-3.5%). Prognostic estimates are likely to be particularly important for these patients, and the model also performed well for this situation, with an AUC of 0.924 (95% CI, 0.910-0.939). To illustrate the clinical implications of this accuracy, we used model predictions to individually rank patients by 30-day mortality risk, a commonly used way of stratifying risk groups.33 Thirty-day mortality in the highest decile of predicted risk for palliative-intent chemotherapy was 22.6% (95% CI, 19.6%-25.6%), whereas in the lowest-risk decile, no patients died.
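The decile ranking used above can be sketched in Python as follows. The data are invented for illustration, and splitting the ranked list into near-equal groups by position is one reasonable convention among several.

```python
def mortality_by_decile(scores, died, n_groups=10):
    """Observed mortality per risk group, ordered from lowest to highest predicted risk."""
    n = len(scores)
    if n < n_groups:
        raise ValueError("need at least one observation per group")
    order = sorted(range(n), key=lambda i: scores[i])
    rates = []
    for g in range(n_groups):
        members = order[g * n // n_groups:(g + 1) * n // n_groups]
        rates.append(sum(died[i] for i in members) / len(members))
    return rates

# Twenty toy patients with increasing predicted risk; the three riskiest died.
scores = [i / 20 for i in range(20)]
died = [0] * 17 + [1] * 3
rates = mortality_by_decile(scores, died)
```

A well-discriminating model concentrates deaths in the top groups, as in the 22.6% vs 0 contrast reported for palliative-intent regimens.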
Figure 1 shows observed survival during the 180 days after palliative chemotherapy initiation by decile of model predictions (patients were followed up to 180 days). Overall 180-day mortality among all patients was 18.4% (95% CI, 17.8%-19.0%); for those initiating palliative chemotherapy, 180-day mortality was 27.9% (95% CI, 26.9%-28.9%). Model predictions on 30-day mortality were also accurate predictors of 180-day mortality (AUC, 0.827; 95% CI, 0.817-0.838); in the highest-risk decile, 180-day mortality was 74.8% (95% CI, 72.7%-77.0%) vs 0.2% (95% CI, 0.01%-0.4%) in the lowest-risk decile. Predictions were even more accurate for all patients, irrespective of chemotherapy intent (AUC, 0.870; 95% CI, 0.862-0.877); 180-day survival among these patients is shown in the eFigure in the Supplement.
Table 2 shows model performance for predicting 30-day mortality in additional patient subgroups of interest. The model performed equally well across many kinds of primary cancers, demographic groups, and chemotherapy regimens. In distant-stage disease (mean 30-day mortality, 2.9%; 95% CI, 2.5%-3.2%), 30-day mortality in the highest-risk decile was 22.7% (95% CI, 19.9%-25.6%) vs 0 in the lowest decile (AUC, 0.924; 95% CI, 0.910-0.939). Predictions were accurate even for experimental clinical trial regimens first used from 2012 to 2014 (AUC, 0.942; 95% CI, 0.882-1.000); the derivation model was not exposed to these novel regimens in the training process.
A key question is whether model predictions are accurate enough to be useful across a range of primary cancers, stages of disease, or lines of chemotherapy, which constitute scenarios for which prognoses vary widely. Table 2 thus also presents measures of overall predictive accuracy for first-line chemotherapy (AUC for 30-day mortality, 0.941 [95% CI, 0.925-0.956]; AUC for 180-day mortality, 0.865 [95% CI, 0.854-0.875]) compared with later lines of chemotherapy (AUC for 30-day mortality, 0.938 [95% CI, 0.924-0.952]; AUC for 180-day mortality, 0.864 [95% CI, 0.854-0.874]). eTable 1 in the Supplement presents extended results on model performance for 30- and 180-day mortality across lung, colorectal, breast, and prostate cancers by stage and line of chemotherapy.
Comparisons With Other Prognostic Estimates
We compared model performance with 2 external sources of mortality estimates, focusing on patients with distant-stage disease. First, we obtained mortality data from 4 RCTs of treatments for colorectal adenocarcinoma, non–small cell lung adenocarcinoma, small cell lung carcinoma, and squamous cell carcinoma of the head and neck.40-43 Figure 2A-D shows observed mortality for patients in our validation sample who started specific chemotherapy regimens for which trial data are available. (We chose to show 1-year mortality because this is the only time window reported consistently in RCTs.) We compared observed mortality with 2 sources of predictions: (1) RCT data (ie, mean 1-year mortality for patients receiving the relevant chemotherapy regimen) and (2) 1-year mortality risk estimates from our model; to generate these, we calculated 1-year mortality in the derivation set for patients in each quintile of model-predicted risk (we could not use raw model predictions because these were designed to predict 30-day mortality). The overall AUC for RCT estimates was 0.555 (95% CI, 0.513-0.598) compared with 0.771 (95% CI, 0.735-0.808) for model-based estimates for these same patients.
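The quintile-based conversion from a 30-day risk score to a 1-year mortality estimate can be illustrated with the Python sketch below. The cut-point convention, function names, and toy numbers are assumptions for the example; the study's construction is detailed in the eMethods.

```python
import bisect

def quintile_cutpoints(scores):
    """Four cut points that split the derivation scores into five near-equal groups."""
    ordered = sorted(scores)
    n = len(ordered)
    return [ordered[k * n // 5] for k in range(1, 5)]

def quintile_of(score, cutpoints):
    """Quintile index from 0 (lowest risk) to 4 (highest risk)."""
    return bisect.bisect_right(cutpoints, score)

def one_year_rates(derivation_scores, derivation_died_1y):
    """Observed 1-year mortality in each quintile of the derivation set."""
    cutpoints = quintile_cutpoints(derivation_scores)
    deaths = [0] * 5
    counts = [0] * 5
    for score, outcome in zip(derivation_scores, derivation_died_1y):
        q = quintile_of(score, cutpoints)
        counts[q] += 1
        deaths[q] += outcome
    rates = [deaths[q] / counts[q] if counts[q] else None for q in range(5)]
    return cutpoints, rates

def estimate_one_year(score, cutpoints, rates):
    """1-year estimate for a new patient: the derivation rate of their quintile."""
    return rates[quintile_of(score, cutpoints)]

# Toy derivation set: 30-day risk scores and observed 1-year deaths (1 = died).
derivation_scores = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10]
derivation_died_1y = [0, 0, 0, 0, 0, 1, 1, 0, 1, 1]
cutpoints, rates = one_year_rates(derivation_scores, derivation_died_1y)
high_risk_estimate = estimate_one_year(0.20, cutpoints, rates)
```

Both the cut points and the quintile rates come from the derivation set only, so a validation patient's estimate never depends on validation outcomes.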
We also compared our model predictions of mortality with age-, sex-, race-, and cancer-specific mortality estimates from SEER, restricted to patients with advanced-stage cancers of the lung and bronchus, colon and rectum, breast, and prostate to maximize comparability in populations. Figure 2E-H shows that our model predictions (AUC, 0.810; 95% CI, 0.799-0.822) outperformed SEER estimates (AUC, 0.600; 95% CI, 0.585-0.615) for 1-year mortality. Further details on construction of RCT and SEER estimates are available in the eMethods and eTable 2 in the Supplement, and more detailed comparisons for subgroups are available in eTable 3 in the Supplement.
Table 3 shows the distribution of key predictor variables used in the prediction model across risk deciles, as well as the proportion of model variance explained linearly by each variable. In general, key predictors of mortality identified in the literature were markedly different in the highest vs lowest model-predicted risk deciles; these predictors included summed comorbidity score,39 age,38 failure to thrive, heart rate, and certain laboratory data (eg, C-reactive protein level, white blood cell count, and alkaline phosphatase level).29 Of importance, no single variable explained more than 2% of model predictions in linear fashion. Most of the variation in the predictions (86.4%) was not a linear function of any single predictor, indicating that the tree-based model relied heavily on complex nonlinear functional forms and interactions among variables.
A machine learning model based on single-center EHR data accurately estimated individual mortality risk in a cohort of patients with cancer at the time of chemotherapy initiation. The model performed well across a range of cancer types, race, sex, and other demographic variables. Mortality estimates were accurate for chemotherapy regimens with palliative and curative intent, for patients with early- and distant-stage cancer, and for patients treated with clinical trial regimens introduced in years after the model was trained. Our model outperformed estimates from RCTs and SEER data, both of which are routinely used by clinicians for quantitative risk predictions.
This model was able to predict mortality with considerable accuracy despite lacking genetic sequencing data, cancer-specific biomarkers, or any detailed information about cancers beyond EHR data. This accuracy underscores the fact that common clinical data elements contained within an EHR (eg, symptoms, comorbidities, prescribed medications, and diagnostic tests) contain surprising amounts of signal for predicting key outcomes in patients with cancer.
One clinically useful advantage of our algorithm is that it would not require manual input from clinicians. Current validated prognostic algorithms require considerable, often difficult input on the part of clinicians. For example, the palliative prognostic score relies on 6 weighted variables; some of these data elements, such as Karnofsky performance status, are not routinely available in the EHR and thus require manual input and calculation.21
In contrast, our prognostic algorithm could pull directly from the EHR without manual input. Most inputs to our model are standard data elements in structured format in EHRs, including ICD-9 and procedure codes and medications. Although our algorithm was developed using a single institution’s data, its inputs are available nearly everywhere with an EHR. In addition, no special infrastructure is required to pull these data from an institution’s data warehouse; in the same way that today’s EHR systems pull a rich set of data from a database to present it to clinicians, an algorithm could pull and process the same data in real time using the processing power on a desktop computer. Although machine learning algorithms require significant computing infrastructure to construct, once derived, they can be applied using minimal computing power already available in any hospital computers running an EHR or even on a smartphone. This application facilitates potential integration into existing clinical systems. Thus, we would not anticipate major technical barriers to implementing this or similar algorithms in any organization’s clinical data to independently validate predictive power from a sample. To this end, code for our algorithm is publicly available (eResults in the Supplement and http://labsysmed.org/wp-content/uploads/2017/02/ChemoMortalityAnalysis.rtf).
Algorithmic predictions such as ours could be useful at several points along the care continuum. They could provide accurate predictions of mortality risk to a clinician or foster shared decision making between the patient and clinician. Short-term estimates of mortality could help clinicians identify patients unlikely to benefit from chemotherapy beyond 30 days and those who may benefit from early palliative care referral, advance care planning, and prompting to get financial and family affairs in order. For patients receiving systemic chemotherapy, an estimate of 30-day mortality risk may be a useful quality indicator of avoidable treatment-associated harm.44
This study has several limitations. First, our model was built on data from patients treated with chemotherapy and is thus unlikely to be accurate for untreated patients. Second, our treated sample reflects the particular decisions around chemotherapy made by physicians and patients in our training data set. Patients who were eligible for chemotherapy but for some reason did not start it were not included, which could have biased the sample. However, prevailing treatment decisions are generally aggressive, which indicates the likely direction of this bias. In our sample, 62.4% of patients with distant-stage disease received chemotherapy, suggesting that physician recommendations and patient acceptance of those decisions generally lead to initiation of treatment. This finding fits with a large body of evidence suggesting that physicians in a wide range of settings overestimate survival and overuse chemotherapy. Thus, to the extent that our data set has bias, it leads to the inclusion, not exclusion, of patients who otherwise might not have received chemotherapy. As a result, we believe that this bias did not substantially distort validity. If such an algorithm were deployed in a real-world setting, periodic retraining of the model (eg, each year or quarter) would ensure that model predictions reflected contemporaneous chemotherapy decision making. This process would address changing selection into treatment over time and update the model to reflect broader changes in patient populations and chemotherapy technology.
Several significant differences exist between the 2004-2011 derivation set and the 2012-2014 validation set, including age at initiation, race, primary cancer, and prior chemotherapy beyond the first-line treatment. Such differences between derivation and validation sets are expected and intentional: a validation set drawn from later years of data was chosen to reflect the constant evolution of cancer epidemiology and treatment. This process made the prediction task more difficult because algorithms trained on past data cannot always perform well in the future.45 However, changes in referral patterns, chemotherapy, and diagnosis patterns are just some of the difficulties associated with algorithms in evolving real-world settings. We are reassured that performance was good despite these and other secular trends.
Although we quantified predictive accuracy in an independent, recent validation set, the only way to truly validate such a model is prospectively. A model trained on pre-2012 data may lose accuracy as novel tumor diagnostics and therapies arise, although the accuracy of predictions for patients starting novel chemotherapies was encouraging in this regard. In addition, this study included data from a single institution. Further validation is required using cohorts from different institutions. Electronic health record data contain a multitude of biases introduced by physician behavior, institutional idiosyncrasies, and software platforms, among other limitations. These limitations can significantly affect the adaptability and relevance of our prediction model to different care settings.
Our machine learning model accurately predicted mortality risk in patients at the time of chemotherapy initiation. Although we are optimistic that accurate prognostic tools such as this could help to promote value-driven oncology care, the ideal next step would be an RCT of algorithmic estimates at the point of care. To be useful, predictive models must improve decision making in the real world. Thus, rigorous evaluation of predictions’ influence on outcomes is the criterion standard test but one that is often neglected in the literature, which focuses primarily on measuring predictive accuracy rather than real outcomes.
Accepted for Publication: April 28, 2018.
Published: July 27, 2018. doi:10.1001/jamanetworkopen.2018.0926
Open Access: This is an open access article distributed under the terms of the CC-BY License. © 2018 Elfiky AA et al. JAMA Network Open.
Corresponding Author: Ziad Obermeyer, MD, MPhil, Brigham and Women’s Hospital, 75 Francis St, Neville House, Boston, MA 02115.
Author Contributions: Drs Elfiky and Obermeyer had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: Elfiky, Pany, Obermeyer.
Acquisition, analysis, or interpretation of data: All authors.
Drafting of the manuscript: All authors.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: All authors.
Obtained funding: Elfiky, Obermeyer.
Administrative, technical, or material support: Parikh.
Supervision: Elfiky, Obermeyer.
Conflict of Interest Disclosures: Dr Parikh reported personal fees from GNS Healthcare outside the submitted work. No other disclosures were reported.
Funding/Support: This study was supported by grants DP5OD012161 from the Office of the Director and R56AG055728 from the National Institute on Aging (Dr Obermeyer), training grant T32 AG51108 from the National Institute on Aging (Mr Pany), and a grant from the Dana-Farber Cancer Institute (Dr Elfiky).
Role of the Funder/Sponsor: The funding sources had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
et al. American Society of Clinical Oncology identifies five key opportunities to improve care and reduce costs: the top five list for oncology. J Clin Oncol. 2012;30(14):1715-1724. doi:10.1200/JCO.2012.42.8375
et al. Longitudinal perceptions of prognosis and goals of therapy in patients with metastatic non–small-cell lung cancer: results of a randomized study of early palliative care. J Clin Oncol. 2011;29(17):2319-2326. doi:10.1200/JCO.2010.32.4459
et al. Patient willingness to undergo chemotherapy and thoracic radiotherapy for locally advanced non–small cell lung cancer. Psychooncology. 2009;18(5):483-489. doi:10.1002/pon.1450
et al. Development of prognosis in palliative care study (PiPS) predictor models to improve prognostication in advanced cancer: prospective cohort study. BMJ. 2011;343:d4920. doi:10.1136/bmj.d4920
DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837-845. doi:10.2307/2531595
et al. Prediction of allogeneic hematopoietic stem-cell transplantation mortality 100 days after transplantation using a machine learning algorithm: a European Group for Blood and Marrow Transplantation Acute Leukemia Working Party Retrospective Data Mining Study. J Clin Oncol. 2015;33(28):3144-3151. doi:10.1200/JCO.2014.59.1339
et al. Bevacizumab in combination with oxaliplatin-based chemotherapy as first-line therapy in metastatic colorectal cancer: a randomized phase III study. J Clin Oncol. 2008;26(12):2013-2019. doi:10.1200/JCO.2007.14.9930
et al. Randomized phase III trial of single-agent pemetrexed versus carboplatin and pemetrexed in patients with advanced non–small-cell lung cancer and Eastern Cooperative Oncology Group performance status of 2. J Clin Oncol. 2013;31(23):2849-2853. doi:10.1200/JCO.2012.48.1911
et al; Japan Clinical Oncology Group. Irinotecan plus cisplatin compared with etoposide plus cisplatin for extensive small-cell lung cancer. N Engl J Med. 2002;346(2):85-91. doi:10.1056/NEJMoa003034