Development, Validation, and Evaluation of a Simple Machine Learning Model to Predict Cirrhosis Mortality | Hepatobiliary Disease | JAMA Network Open | JAMA Network
Figure 1. Annual and Cumulative Incidence of All-Cause Mortality

Includes 107 939 participants at inception. Annual all-cause mortality was 8.8% at 1 year, 15.3% at 2 years, 12.8% at 3 years, 11.5% at 4 years, 10.9% at 5 years, 10.4% at 6 years, 9.8% at 7 years, and 10.3% at 8 years. Cumulative all-cause mortality was 8.8% at 1 year, 22.8% at 2 years, 32.7% at 3 years, 40.4% at 4 years, 46.2% at 5 years, 50.2% at 6 years, 52.6% at 7 years, and 53.4% at 8 years.

Figure 2. Associations Between Clinical Factors Included in Cirrhosis Mortality Model and Time to Death

ALT indicates alanine aminotransferase; AST, aspartate aminotransferase; CirCom, cirrhosis-specific comorbidity score; and RR, relative risk.

Table 1. Baseline Characteristics of 107 939 Patients With Cirrhosis
Table 2. Discrimination and Calibration of 3 Modeling Approaches
Table 3. Comparison of Discrimination Between the CiMM and MELD-Na Score
References
1. Lip GY, Nieuwlaat R, Pisters R, Lane DA, Crijns HJ. Refining clinical risk stratification for predicting stroke and thromboembolism in atrial fibrillation using a novel risk factor-based approach: the Euro Heart Survey on Atrial Fibrillation. Chest. 2010;137(2):263-272. doi:10.1378/chest.09-1584
2. Friberg L, Rosenqvist M, Lip GY. Evaluation of risk stratification schemes for ischaemic stroke and bleeding in 182 678 patients with atrial fibrillation: the Swedish Atrial Fibrillation cohort study. Eur Heart J. 2012;33(12):1500-1510. doi:10.1093/eurheartj/ehr488
3. American Heart Association. Heart risk calculator. Published 2013. Accessed June 2, 2020. http://www.cvriskcalculator.com/
4. Capuzzo M, Valpondi V, Sgarbi A, et al. Validation of severity scoring systems SAPS II and APACHE II in a single-center population. Intensive Care Med. 2000;26(12):1779-1785. doi:10.1007/s001340000715
5. Rudin C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence. 2019;1(5):206-215. doi:10.1038/s42256-019-0048-x
6. Kamath PS, Wiesner RH, Malinchoc M, et al. A model to predict survival in patients with end-stage liver disease. Hepatology. 2001;33(2):464-470. doi:10.1053/jhep.2001.22172
7. Kaplan DE, Dai F, Aytaman A, et al. Development and performance of an algorithm to estimate the Child-Turcotte-Pugh Score from a national electronic healthcare database. Clin Gastroenterol Hepatol. 2015;13(13):2333-2341.e6. doi:10.1016/j.cgh.2015.07.010
8. Sarmast N, Ogola GO, Kouznetsova M, et al. Model for end-stage liver disease-lactate and prediction of inpatient mortality in patients with chronic liver disease. Hepatology. 2020. doi:10.1002/hep.31199
9. Mahmud N, Hubbard RA, Kaplan DE, Taddei TH, Goldberg DS. Risk prediction scores for acute on chronic liver failure development and mortality. Liver Int. 2020;40(5):1159-1167. doi:10.1111/liv.14328
10. Koola JD, Ho S, Chen G, et al. Development of a national Department of Veterans Affairs mortality risk prediction model among patients with cirrhosis. BMJ Open Gastroenterol. 2019;6(1):e000342. doi:10.1136/bmjgast-2019-000342
11. Buchanan PM, Kramer JR, El-Serag HB, et al. The quality of care provided to patients with varices in the department of Veterans Affairs. Am J Gastroenterol. 2014;109(7):934-940. doi:10.1038/ajg.2013.487
12. Kanwal F, Kramer JR, Buchanan P, et al. The quality of care provided to patients with cirrhosis and ascites in the Department of Veterans Affairs. Gastroenterology. 2012;143(1):70-77. doi:10.1053/j.gastro.2012.03.038
13. Sohn MW, Arnold N, Maynard C, Hynes DM. Accuracy and completeness of mortality data in the Department of Veterans Affairs. Popul Health Metr. 2006;4:2. doi:10.1186/1478-7954-4-2
14. Wang L, Porter B, Maynard C, et al. Predicting risk of hospitalization or death among patients receiving primary care in the Veterans Health Administration. Med Care. 2013;51(4):368-373. doi:10.1097/MLR.0b013e31827da95a
15. Kanwal F, Kramer JR, Ilyas J, Duan Z, El-Serag HB. HCV genotype 3 is associated with an increased risk of cirrhosis and hepatocellular cancer in a national sample of US veterans with HCV. Hepatology. 2014;60(1):98-105. doi:10.1002/hep.27095
16. Kruse RL, Kramer JR, Tyson GL, et al. Clinical outcomes of hepatitis B virus coinfection in a United States cohort of hepatitis C virus–infected patients. Hepatology. 2014;60(6):1871-1878. doi:10.1002/hep.27337
17. El-Serag HB, Kanwal F, Richardson P, Kramer J. Risk of hepatocellular carcinoma after sustained virological response in veterans with hepatitis C virus infection. Hepatology. 2016;64(1):130-137. doi:10.1002/hep.28535
18. Beste LA, Leipertz SL, Green PK, Dominitz JA, Ross D, Ioannou GN. Trends in burden of cirrhosis and hepatocellular carcinoma by underlying liver disease in US veterans, 2001-2013. Gastroenterology. 2015;149(6):1471-1482.e5. doi:10.1053/j.gastro.2015.07.056
19. Volk ML, Tocco RS, Bazick J, Rakoski MO, Lok AS. Hospital readmissions among patients with decompensated cirrhosis. Am J Gastroenterol. 2012;107(2):247-252. doi:10.1038/ajg.2011.314
20. Bruno S, Saibeni S, Bagnardi V, et al; AISF (Italian Association for the Study of the Liver)–EPA-SCO Collaborative Study Group. Mortality risk according to different clinical characteristics of first episode of liver decompensation in cirrhotic patients: a nationwide, prospective, 3-year follow-up study in Italy. Am J Gastroenterol. 2013;108(7):1112-1122. doi:10.1038/ajg.2013.110
21. Jepsen P, Vilstrup H, Lash TL. Development and validation of a comorbidity scoring system for patients with cirrhosis. Gastroenterology. 2014;146(1):147-156. doi:10.1053/j.gastro.2013.09.019
22. Singal AG, Rahimi RS, Clark C, et al. An automated model using electronic medical record data identifies patients with cirrhosis at high risk for readmission. Clin Gastroenterol Hepatol. 2013;11(10):1335-1341.e1. doi:10.1016/j.cgh.2013.03.022
23. Stekhoven DJ, Bühlmann P. MissForest–non-parametric missing value imputation for mixed-type data. Bioinformatics. 2012;28(1):112-118. doi:10.1093/bioinformatics/btr597
24. Shah AD, Bartlett JW, Carpenter J, Nicholas O, Hemingway H. Comparison of random forest and parametric imputation models for imputing missing data using MICE: a CALIBER study. Am J Epidemiol. 2014;179(6):764-774. doi:10.1093/aje/kwt312
25. Eleuteri A, Aung MSH, Taktak AFG, Damato B, Lisboa PJG. Continuous and Discrete Time Survival Analysis: Neural Network Approaches. Paper presented at: 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society; October 22, 2007; Lyon, France.
26. Kretowska M. Oblique survival trees in discrete event time analysis. IEEE J Biomed Health Inform. 2020;24(1):247-258. doi:10.1109/JBHI.2019.2908773
27. Singer JD, Willett JB. It’s about time: using discrete-time survival analysis to study duration and the timing of events. J Educ Stat. 1993;18(2):155-195.
28. Alexander BM, Schoenfeld JD, Trippa L. Hazards of hazard ratios: deviations from model assumptions in immunotherapy. N Engl J Med. 2018;378(12):1158-1159. doi:10.1056/NEJMc1716612
29. Hernán MA. The hazards of hazard ratios. Epidemiology. 2010;21(1):13-15. doi:10.1097/EDE.0b013e3181c1ea43
30. Natekin A, Knoll A. Gradient boosting machines, a tutorial. Front Neurorobot. 2013;7(21):21.
31. Chen T, Guestrin C. XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; August 13, 2016; San Francisco, CA.
32. Muthukrishnan R, Rohini R. LASSO: a feature selection technique in predictive modeling for machine learning. Paper presented at: 2016 IEEE International Conference on Advances in Computer Applications (ICACA); October 24, 2016; Coimbatore, India.
33. Miller GA. The magical number seven plus or minus two: some limits on our capacity for processing information. Psychol Rev. 1956;63(2):81-97. doi:10.1037/h0043158
34. Etzioni R, Pepe M, Longton G, Hu C, Goodman G. Incorporating the time dimension in receiver operating characteristic curves: a case study of prostate cancer. Med Decis Making. 1999;19(3):242-251. doi:10.1177/0272989X9901900303
35. Zheng Y, Heagerty PJ. Semiparametric estimation of time-dependent ROC curves for longitudinal marker data. Biostatistics. 2004;5(4):615-632. doi:10.1093/biostatistics/kxh013
36. Walsh CG, Sharman K, Hripcsak G. Beyond discrimination: a comparison of calibration methods and clinical usefulness of predictive models of readmission risk. J Biomed Inform. 2017;76:9-18. doi:10.1016/j.jbi.2017.10.008
37. Rufibach K. Use of Brier score to assess binary predictions. J Clin Epidemiol. 2010;63(8):938-939. doi:10.1016/j.jclinepi.2009.11.009
38. Zhang J, Yu KF. What’s the relative risk? a method of correcting the odds ratio in cohort studies of common outcomes. JAMA. 1998;280(19):1690-1691. doi:10.1001/jama.280.19.1690
39. Temel JS, Greer JA, Admane S, et al. Longitudinal perceptions of prognosis and goals of therapy in patients with metastatic non–small-cell lung cancer: results of a randomized study of early palliative care. J Clin Oncol. 2011;29(17):2319-2326. doi:10.1200/JCO.2010.32.4459
40. Chen JH, Asch SM. Machine learning and prediction in medicine: beyond the peak of inflated expectations. N Engl J Med. 2017;376(26):2507-2509. doi:10.1056/NEJMp1702071
41. Naik AD, Arney J, Clark JA, et al. Integrated model for patient-centered advanced liver disease care. Clin Gastroenterol Hepatol. 2020;18(5):1015-1024. doi:10.1016/j.cgh.2019.07.043
42. Kanwal F, Tapper EB, Ho C, et al. Development of quality measures in cirrhosis by the Practice Metrics Committee of the American Association for the Study of Liver Diseases. Hepatology. 2019;69(4):1787-1797. doi:10.1002/hep.30489
    Original Investigation
    Gastroenterology and Hepatology
    November 3, 2020

    Development, Validation, and Evaluation of a Simple Machine Learning Model to Predict Cirrhosis Mortality

    Author Affiliations
    • 1Section of Gastroenterology and Hepatology, Department of Medicine, Baylor College of Medicine, Houston, Texas
    • 2Health Services Research, Department of Medicine, Baylor College of Medicine, Houston, Texas
    • 3Veterans Affairs (VA) Health Services Research and Development Service Center for Innovations in Quality, Effectiveness, and Safety, Houston, Texas
    • 4Michael E. DeBakey VA Medical Center, Houston, Texas
    • 5Center for Innovation to Implementation, VA Palo Alto Health Care System, Palo Alto, California
    • 6Department of Medicine, VA Boston Healthcare System, Boston University, Boston, Massachusetts
    • 7Department of Health Law, Policy, and Management, VA Boston Healthcare System, Boston University, Boston, Massachusetts
    • 8Section of Geriatrics and Palliative Medicine, Department of Medicine, Baylor College of Medicine, Houston, Texas
    • 9Division of Primary Care and Population Health, Department of Medicine, Stanford University, Stanford, California
    JAMA Netw Open. 2020;3(11):e2023780. doi:10.1001/jamanetworkopen.2020.23780
    Key Points

    Question  Can a blended approach that uses clinical variables selected from machine learning to develop traditional prognostic models improve the accuracy of prediction while addressing challenges related to interpretability?

    Findings  In a prognostic study including a cohort of 107 939 patients with cirrhosis, simple machine learning techniques performed as well as the more advanced ensemble gradient boosting techniques. Using the clinical variables identified from simple machine learning in a cirrhosis mortality model produced a new score more predictive than the traditional Model for End Stage Liver Disease with sodium.

    Meaning  These findings suggest that this blended approach can improve data-driven risk prognostication through the development of new scores that are both more transparent and actionable than machine learning and more predictive than traditional risk scores.

    Abstract

    Importance  Machine-learning algorithms offer better predictive accuracy than traditional prognostic models but are too complex and opaque for clinical use.

    Objective  To compare different machine learning methods in predicting overall mortality in cirrhosis and to use machine learning to select easily scored clinical variables for a novel cirrhosis prognostic model.

    Design, Setting, and Participants  This prognostic study used a retrospective cohort of adult patients with cirrhosis or its complications seen in 130 hospitals and affiliated ambulatory clinics in the integrated, national Veterans Affairs health care system from October 1, 2011, to September 30, 2015. Patients were followed up through December 31, 2018. Data were analyzed from October 1, 2017, to May 31, 2020.

    Exposures  Potential predictors included demographic characteristics; liver disease etiology, severity, and complications; use of health care resources; comorbid conditions; and comprehensive laboratory and medication data. Patients were randomly selected for model development (66.7%) and validation (33.3%). Three different statistical and machine learning methods were evaluated: gradient descent boosting, logistic regression with least absolute shrinkage and selection operator (LASSO) regularization, and logistic regression with LASSO constrained to select no more than 10 predictors (partial path model). Predictor inclusion and model performance were evaluated in a 5-fold cross-validation. Last, the predictors identified in the most parsimonious (partial path) model were refit using maximum-likelihood estimation (Cirrhosis Mortality Model [CiMM]), and its predictive performance was compared with that of the widely used Model for End Stage Liver Disease with sodium (MELD-Na) score.

    Main Outcomes and Measures  All-cause mortality.

    Results  Of the 107 939 patients with cirrhosis (mean [SD] age, 62.7 [9.6] years; 96.6% male; 66.3% white, 18.4% African American), the annual mortality rate ranged from 8.8% to 15.3%. In total, 32.7% of patients died within 3 years, and 46.2% died within 5 years after the index date. Models predicting 1-year mortality had good discrimination for the gradient descent boosting (area under the receiver operating characteristics curve [AUC], 0.81; 95% CI, 0.80-0.82), logistic regression with LASSO regularization (AUC, 0.78; 95% CI, 0.77-0.79), and the partial path logistic model (AUC, 0.78; 95% CI, 0.76-0.78). All models showed good calibration. The final CiMM model with machine learning–derived clinical variables offered significantly better discrimination than the MELD-Na score, with AUCs of 0.78 (95% CI, 0.77-0.79) vs 0.67 (95% CI, 0.66-0.68) for 1-year mortality, respectively (DeLong z = 17.00; P < .001).

    Conclusions and Relevance  In this study, simple machine learning techniques performed as well as the more advanced ensemble gradient boosting. Using the clinical variables identified from simple machine learning in a cirrhosis mortality model produced a new score more transparent than machine learning and more predictive than the MELD-Na score.

    Introduction

    Risk stratification is at the core of medical practice. Risk prediction scores now routinely guide treatment across a range of medical decisions from anticoagulation1,2 to lowering of cholesterol levels3 to life-sustaining intensive care.4 The most widely used scores include a limited number of easily measured variables, allowing for transparent calculation and interpretability but constraining their prognostic performance.

    Machine learning techniques have the potential to improve prognostication. These techniques incorporate a large array of predictors in a nonlinear pattern and use multiple interactions to enhance accuracy. However, their many variables and the complexity of scoring rules hinder their implementation in all but the most advanced informatics settings. Even when informatics infrastructure supports them, the “black box” nature of the algorithms means they are inherently unexplainable to clinicians and patients.5 A blended strategy that builds on the strengths of machine learning to develop simpler, clinically explainable risk scores may optimize the trade-off between accuracy vs interpretability and also facilitate subsequent implementation.

    Cirrhosis is a common, high-risk condition with a progressive clinical course. The most widely used cirrhosis prognostic models, such as the Model for End Stage Liver Disease with sodium (MELD-Na) or Child-Turcotte-Pugh,6,7 are disease-specific scores but may have modest discriminative ability for overall mortality. Other prognostic scores in cirrhosis are mostly applicable in restricted settings, such as scores predicting short-term risk of dying during or after hospitalization.8,9 None of these scores account for a wide range of clinical and psychosocial factors that are likely to be associated with mortality in cirrhosis. Machine learning techniques have been used to help fill these gaps for cirrhosis but have not seen widespread use.10

    We performed a prognostic study using data from a retrospective cohort of patients with cirrhosis seen at 130 US Department of Veterans Affairs (VA) hospitals. We first developed and compared 3 machine learning algorithms with varying levels of complexity and range of variables that predicted risk of mortality in cirrhosis. To achieve a balance among accuracy, interpretability, and feasibility, we then developed and validated a blended model (Cirrhosis Mortality Model [CiMM]) that used the variables selected from machine learning algorithms and implemented them in an accessible platform. Last, we compared the CiMM with the MELD-Na score to determine whether it meaningfully improved mortality risk predictions in cirrhosis over this widely used model.

    Methods
    Data Sources

    We used data from the national VA Corporate Data Warehouse that includes all laboratory test results, pharmacy, inpatient and outpatient procedures, and diagnosis codes for patients using the VA for health care. We also used the VA Purchased Care database of services paid by but rendered outside the VA. We obtained date of death from the VA Vital Status file. This study was approved by the institutional review board of Baylor College of Medicine, Houston, Texas, which waived the need for informed consent, and followed the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) reporting guideline.

    Study Cohort

    Our cohort included patients with cirrhosis who were seen in ambulatory clinics at 130 VA hospitals from October 1, 2011, to September 30, 2015. We included patients if they had at least 2 instances of cirrhosis or cirrhosis complications codes or at least 1 code for cirrhosis or complications with at least 1 filled prescription of spironolactone (≥100 mg for ascites), rifaximin, or lactulose (for encephalopathy) after a cirrhosis diagnosis. These case ascertainment strategies were found to have high positive predictive value (86%-93%) for the presence of cirrhosis in the patients’ medical records.7,11,12

    We selected the first clinic visit at or after meeting cohort entry criteria as the index date for follow-up (eMethods 1 in the Supplement). We excluded patients younger than 18 or older than 90 years or who received a liver transplant before the index date. We acquired data through December 31, 2018, to ascertain end points.

    Variable Selection

    For the dependent variable, we obtained all-cause mortality data from the VA Vital Status File, which combines information from the VA Death File, VA Compensation and Pension Benefits, Medicare, and Social Security and has a sensitivity of 98.3% and specificity of 99.8% relative to the National Death Index.13 We based the selection of predictor variables on a priori hypotheses guided by literature and clinical knowledge as well as easy availability in the electronic health record. Sociodemographic variables included age, sex, race/ethnicity, rural status, marital status, means test (financial assessment), and enrollment priority level.14 eTable 1 in the Supplement gives the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM), Current Procedural Terminology, and drug class codes used to define predictor variables.

    We defined hepatitis C virus (HCV) based on any evidence of positive HCV ribonucleic acid test result,15 hepatitis B virus (HBV) based on any positive finding for hepatitis B surface antigen,16 and alcohol-related liver disease based on at least 1 ICD-9-CM code for alcohol use disorders at any time or positive 3-Item Alcohol Use Disorders Identification Test scores (≥4 in men and ≥3 in women) within 1 year before the index date. For HCV, we also determined whether patients had achieved sustained virologic response.17 We identified nonalcoholic steatohepatitis as the possible etiology of cirrhosis for patients without any other cause who had type 2 diabetes or body mass index (calculated as weight in kilograms divided by height in meters squared) of greater than 30 before diagnosis of cirrhosis.18

    We extracted data for serum levels of bilirubin, sodium, and creatinine and international normalized ratio performed within 1 year before and closest to the index date. We also combined them to derive MELD-Na scores.12 Other liver disease factors included serum levels of albumin, hemoglobin, alanine aminotransferase (ALT), and aspartate aminotransferase (AST), AST:ALT ratio, and platelet counts within 1 year before the index date. We defined type and number of cirrhosis complications and infections.19,20
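
    The text does not state which MELD-Na formulation was used. As a minimal sketch, the block below assumes the commonly cited OPTN/UNOS (2016) variant, in which laboratory values below 1.0 are set to 1.0, creatinine is capped at 4.0 mg/dL, and the sodium adjustment (sodium bounded to 125-137 mmol/L) is applied only when the initial MELD exceeds 11. The function name and argument names are illustrative, not taken from the study.

```python
import math


def meld_na(bilirubin_mg_dl, inr, creatinine_mg_dl, sodium_mmol_l):
    """MELD-Na under the ASSUMED OPTN/UNOS (2016) formulation."""
    # Bound the inputs: values below 1.0 become 1.0; creatinine is capped at 4.0.
    bili = max(bilirubin_mg_dl, 1.0)
    inr_bounded = max(inr, 1.0)
    creat = min(max(creatinine_mg_dl, 1.0), 4.0)

    # Initial MELD on the log scale, then scaled by 10 and rounded.
    raw = (0.957 * math.log(creat)
           + 0.378 * math.log(bili)
           + 1.120 * math.log(inr_bounded)
           + 0.643)
    meld = round(10 * raw)

    # The sodium adjustment applies only above an initial MELD of 11.
    if meld <= 11:
        return meld
    sodium = min(max(sodium_mmol_l, 125.0), 137.0)
    return round(meld + 1.32 * (137 - sodium) - 0.033 * meld * (137 - sodium))
```

    Under these assumptions, a patient with all laboratory values at or below their lower bounds receives the minimum score of 6, and lower sodium raises the score only for patients whose initial MELD is above 11.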

    At least 40% of patients with cirrhosis may have comorbidities that increase mortality.21,22 We defined medical conditions using the cirrhosis-specific comorbidity (CirCom) score,21 which includes chronic obstructive pulmonary disease, history of myocardial infarction, peripheral artery disease, epilepsy, drug abuse, heart failure, nonmetastatic or hematological cancer, metastatic cancer, and chronic kidney disease. Other health conditions included diabetes, history of infection, depression, anxiety, and alcohol use. Table 1 and eTable 1 in the Supplement describe the codes used for CirCom scores and other conditions.

    We used outpatient prescription files to identify medication classes selected based on frequency of use among our cohort or known associations with cirrhosis outcomes. We also extracted information on other therapies for cirrhosis complications, such as endoscopic variceal ligation, paracentesis, and transjugular intrahepatic portosystemic shunts in the year before the index date.

    We included the most current values of pulse, blood pressure, respiratory rate, and body mass index recorded within 1 year before and closest to the index date.14 We also included data on history of smoking and the most current levels of total, low-density lipoprotein, and high-density lipoprotein cholesterol because they are associated with all-cause mortality in the general population. Last, we included history of hospitalization (both liver and all-cause hospitalization), number of outpatient visits, and whether the patient sought emergency care in the year before or any time before the index visit.

    Statistical Analysis

    Data were analyzed from October 1, 2017, to May 31, 2020. Few laboratory values were missing in more than 5% of patients. We imputed data using a machine learning–based imputation, MissForest (eMethods 2 in the Supplement).23,24 We structured the data to evaluate risk of mortality at yearly time horizons using discrete time-to-event methods.25-27 Discrete time-to-event methods avoid issues with the proportional hazards assumption28,29 and use person-periods to allow for nonproportional changes in risk of mortality across time within each patient.27 We used this approach because several candidate predictors violated the proportional hazards assumption. Patients could have 1 to 8 years of follow-up before death or censoring.
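
    The person-period structure described above can be sketched as follows. This is an illustrative layout only: the function names, the dictionary keys, and the two toy patients are assumptions, not the study's data model. Each patient contributes 1 row per year at risk, and the event flag is 1 only in the year of death.

```python
def to_person_periods(patient_id, years_followed, died, max_years=8):
    """Expand one patient into yearly rows (hypothetical layout)."""
    years = min(years_followed, max_years)
    # A death after the observation window counts as censoring at max_years.
    death_observed = died and years_followed <= max_years
    return [
        {
            "patient_id": patient_id,
            "year": year,
            "event": 1 if (death_observed and year == years) else 0,
        }
        for year in range(1, years + 1)
    ]


def yearly_hazards(rows):
    """Deaths in each year divided by the number of patients at risk that year."""
    at_risk, deaths = {}, {}
    for row in rows:
        at_risk[row["year"]] = at_risk.get(row["year"], 0) + 1
        deaths[row["year"]] = deaths.get(row["year"], 0) + row["event"]
    return {year: deaths[year] / at_risk[year] for year in sorted(at_risk)}


# Toy example: patient A dies in year 3; patient B is censored after year 2.
rows = to_person_periods("A", 3, True) + to_person_periods("B", 2, False)
hazards = yearly_hazards(rows)
```

    Because each row is a separate binary observation, an ordinary logistic model fit to these rows with year as a covariate allows the risk of death to differ by year without a proportional hazards assumption.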

    Prediction Models

    We developed and compared 3 models. First, we used extreme gradient descent boosting,30,31 which accounts for higher-order, nonlinear interactions in a variety of data types, including binary and continuous variables, and selects variables while training. Second, we used logistic regression with least absolute shrinkage and selection operator (LASSO) regularization, which is a technique that alters the model fitting to select only a subset of predictors based on gradient descent instead of relying on P values (eMethods 3 in the Supplement).32 The first 2 methods used the full set of predictors (ie, full models). Third, we constrained the logistic regression model to the top feature set, including a maximum of 10 individual factors (partial path model),5 because most people can interpret, at most, 7 to 10 entities at a time.33
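
    One way to picture the partial path idea is to fit L1-penalized logistic regression over a sequence of penalties, from strongest to weakest, and keep the last fit that uses no more than the allowed number of predictors. The sketch below is an assumption-laden illustration, not the study's implementation (which is described in eMethods 3): it uses plain proximal gradient descent, and the function names, step size, penalty grid, and toy data are all hypothetical.

```python
import math


def soft_threshold(value, threshold):
    """Shrink a coefficient toward zero; small coefficients become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso_logistic(X, y, lam, step=0.1, iters=2000):
    """L1-penalized logistic regression via proximal gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b  # the intercept is not penalized
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def partial_path(X, y, max_predictors, lambdas):
    """Return (lam, selected, w, b) for the weakest penalty within the cap."""
    best = None
    for lam in sorted(lambdas, reverse=True):
        w, b = fit_lasso_logistic(X, y, lam)
        selected = [j for j, wj in enumerate(w) if wj != 0.0]
        if len(selected) > max_predictors:
            break
        best = (lam, selected, w, b)
    return best


# Toy data (hypothetical): column 0 tracks the outcome; columns 1 and 2 do not.
x0 = [-1, -1, -1, 1, 1, 1, 1, -1]
x1 = [1, -1, 1, -1, 1, -1, 1, -1]
x2 = [1, 1, -1, -1, 1, 1, -1, -1]
y = [0, 0, 0, 0, 1, 1, 1, 1]
X = [list(row) for row in zip(x0, x1, x2)]
result = partial_path(X, y, max_predictors=1, lambdas=[0.3, 0.1, 0.01])
```

    In this toy example only the first column is retained. With real data the cap would be 10, as in the text, and the penalty grid would be much finer.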

    Derivation and Validation

    We split the cohort into a derivation (66.7% of the data) and a holdout validation set (33.3% of the data); the same holdout set was used across all models fit. The derivation set was divided randomly into 5 equal subsets, preserving the same event rate in each subset. We combined 4 of the subsets for derivation and reserved the remaining subset as internal validation.
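
    A minimal sketch of the fold assignment described above, assuming a simple shuffle-and-deal scheme within each outcome class so that every fold keeps a similar event rate. The function name, seed, and toy outcome vector are illustrative; the study does not report its exact procedure here.

```python
import random


def stratified_folds(outcomes, n_folds=5, seed=0):
    """Deal event and non-event indices across folds in round-robin order."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for label in (0, 1):
        indices = [i for i, y in enumerate(outcomes) if y == label]
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


# Toy derivation set (hypothetical): 10 deaths among 50 patients.
outcomes = [1] * 10 + [0] * 40
folds = stratified_folds(outcomes)

# Each round trains on 4 folds and validates on the remaining fold.
for held_out in range(len(folds)):
    train = [i for k, fold in enumerate(folds) if k != held_out for i in fold]
    validate = folds[held_out]
```

    With the toy data, every fold holds 2 deaths and 8 survivors, so the 20% event rate is preserved in each subset.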

    Subsequent to training the models, we estimated discrimination via the area under the receiver operating characteristics curve (AUC) in the validation set. We computed AUCs specific to each time horizon (discrimination).34,35 We evaluated calibration36,37 with Brier scores and with visual assessments of calibration curves for each time horizon. Brier scores can be interpreted as how far the prediction is from the observed estimate37; lower scores indicate better calibration.
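
    For intuition, the two metrics can be computed as below. This is a simplified sketch: it treats 1 time horizon as a plain binary outcome, whereas the cited methods34,35 handle time-dependent ROC curves more carefully. The function names and the 4 toy predictions are illustrative assumptions.

```python
def auc(scores, labels):
    """Chance that a random patient who died scored higher than one who survived.

    Ties count as half a win.
    """
    died = [s for s, y in zip(scores, labels) if y == 1]
    survived = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if d > s else 0.5 if d == s else 0.0
        for d in died
        for s in survived
    )
    return wins / (len(died) * len(survived))


def brier(scores, labels):
    """Mean squared gap between predicted risk and the observed 0/1 outcome."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(labels)


# Toy predictions (hypothetical) for 4 patients at a single horizon.
scores = [0.10, 0.40, 0.35, 0.80]
labels = [0, 0, 1, 1]
toy_auc = auc(scores, labels)      # 3 of 4 died/survived pairs are ranked correctly
toy_brier = brier(scores, labels)
```

    The AUC depends only on the ranking of predicted risks, whereas the Brier score also penalizes predictions that are ranked correctly but sit far from the observed outcome, which is why both are reported.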

    To allow easy applicability, we refitted the predictors included in the best-performing model using maximum-likelihood discrete time-to-event logistic regression estimation (CiMM). We converted the odds ratios to relative risk (RR) ratios (2-sided P < .05 indicated statistical significance).38 In the last step, we compared the performance of the CiMM with that of the MELD-Na score.
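
    The cited conversion38 is usually written as RR = OR / [(1 − P0) + (P0 × OR)], where P0 is the incidence of the outcome in the reference group. A minimal sketch follows; the function name and the example inputs are illustrative and are not values from the study.

```python
def odds_ratio_to_relative_risk(odds_ratio, baseline_risk):
    """Zhang-Yu correction: RR = OR / ((1 - P0) + P0 * OR).

    baseline_risk (P0) is the outcome incidence in the reference group.
    """
    if not 0.0 <= baseline_risk < 1.0:
        raise ValueError("baseline_risk must be in [0, 1)")
    return odds_ratio / ((1.0 - baseline_risk) + baseline_risk * odds_ratio)


# Illustrative inputs only: an odds ratio of 2.0 with a 10% baseline risk.
example_rr = odds_ratio_to_relative_risk(2.0, 0.10)
```

    The correction matters most when the outcome is common: as P0 approaches 0 the relative risk approaches the odds ratio, and as P0 grows the odds ratio increasingly overstates the relative risk.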

    Risk Scoring and Risk Stratification

    To show how the CiMM may be applied for risk stratification, we selected a range of clinically plausible values on all predictors and applied the CiMM to these plausible patient profiles. We retained the predicted risk on a probability scale between no risk (0.00) and near complete risk (>0.99).
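
    Applying a discrete-time logistic model to a patient profile can be sketched as follows. The coefficients, intercepts, and profile values below are hypothetical placeholders chosen for illustration; they are not the fitted CiMM values, which are not listed in this section. The cumulative risk uses the standard discrete-time identity of 1 minus the product of yearly survival probabilities.

```python
import math


def yearly_hazard(intercept, coefficients, profile):
    """Predicted probability of death in 1 year, given survival to that year."""
    z = intercept + sum(coefficients[name] * profile[name]
                        for name in coefficients)
    return 1.0 / (1.0 + math.exp(-z))


def cumulative_risk(hazards):
    """Risk of death by the end of the last year: 1 - product of (1 - hazard)."""
    surviving = 1.0
    for hazard in hazards:
        surviving *= 1.0 - hazard
    return 1.0 - surviving


# HYPOTHETICAL coefficients and year-specific intercepts, for illustration only.
coefficients = {"age_years": 0.04, "albumin_g_dl": -0.50, "ascites": 0.30}
intercepts_by_year = {1: -3.5, 2: -3.2, 3: -3.3}
profile = {"age_years": 65, "albumin_g_dl": 3.0, "ascites": 1}

hazards = [yearly_hazard(intercepts_by_year[year], coefficients, profile)
           for year in sorted(intercepts_by_year)]
three_year_risk = cumulative_risk(hazards)
```

    Each yearly hazard stays on the 0-to-1 probability scale, and so does the cumulative risk, which matches the way predicted risk is reported in the text.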

    Results
    Demographic Characteristics

    We identified 107 939 patients with cirrhosis (Table 1). The mean (SD) age of patients was 62.7 (9.6) years (96.6% male and 3.4% female); 66.3% were White; 18.4% were Black; and 42.4% were married. Most patients (68.9%) had HCV- or alcohol-related cirrhosis; 26.9% had nonalcoholic steatohepatitis cirrhosis. Approximately one-third (33.1%) had a MELD-Na score of 10 or higher; 20.2% had ascites, and 20.0% had hepatic encephalopathy. Participants had high rates of history of drug or alcohol abuse (41.4%), chronic obstructive pulmonary disease (16.1%), and heart failure (10.5%) in the past year. In total, 40.9% of the cohort was hospitalized for any cause in the year before the index date. eTable 2 in the Supplement shows the complete set of variables used in the full models.

    Figure 1 displays the annual and cumulative incidence of all-cause mortality. The annual mortality rate ranged from 8.8% to 15.3%. In total, 32.7% of patients died within 3 years, and 46.2% died within 5 years after the index date.

    Prediction Model Performance

    Model discrimination was high across the 3 approaches (Table 2). For the full gradient boosting model, the AUC for predicting 1-year mortality was 0.81 (95% CI, 0.80-0.82). For the full discrete time-to-event logistic model with LASSO, the AUC for 1-year mortality was slightly lower at 0.78 (95% CI, 0.77-0.79). Constraining the logistic regression model to include the subset of important variables (partial path logistic) resulted in AUCs that were similar (0.78; 95% CI, 0.76-0.78) to the full logistic model with LASSO. Overall, mortality predictions for year 2 onward showed similar trends, although the overall discrimination fell as time elapsed (years 2-8 AUCs: full gradient boosting range, 0.78-0.72; full discrete logistic regression range, 0.76-0.69; partial discrete logistic regression range, 0.76-0.67).

    The Brier scores ranged from 0.07 (95% CI, 0.07-0.07) to 0.11 (95% CI, 0.11-0.11) using gradient boosting; the full and partial path discrete time-to-event logistic regression models likewise indicated good calibration (Table 2). eFigures 1 and 2 in the Supplement show the discrimination and calibration slopes for the gradient boosting, full logistic, and partial path models.

    Prediction Model Interpretation

    Given similar performance characteristics across different models and based on a priori considerations, we retained the partial path discrete time-to-event logistic regression model because it can be interpreted and feasibly implemented in different clinical settings. We modeled the selected predictors using maximum-likelihood discrete time-to-event logistic regression to develop the CiMM.

    Figure 2 shows the visual range of coefficients in the CiMM. Older age (RR ratio per 1-year increase, 1.04; 95% CI, 1.03-1.04), higher bilirubin level (RR ratio per additional unit, 1.05; 95% CI, 1.04-1.05), an AST:ALT ratio greater than 2 (RR ratio, 1.26; 95% CI, 1.24-1.29), hepatic encephalopathy (RR ratio, 1.20; 95% CI, 1.18-1.22), ascites (RR ratio, 1.30; 95% CI, 1.28-1.33), and hepatocellular carcinoma (RR ratio, 2.18; 95% CI, 2.13-2.23) were significantly associated with higher risk of mortality. Compared with patients with a CirCom score of 0 (no comorbidity), the RR ratio ranged from 1.27 (95% CI, 1.24-1.30) for patients with CirCom score of 1 + 0 (≥1 of the following: chronic obstructive pulmonary disease, peripheral artery disease, epilepsy, drug abuse, or heart failure) to 2.84 (95% CI, 2.74-2.94) for patients with CirCom score of 5 + 1 (active metastatic cancer and 1 more comorbidity).

    Black participants had lower risk (RR ratio, 0.87; 95% CI, 0.85-0.89) than patients in other racial groups. Greater connection to health care (RR ratio, 0.77; 95% CI, 0.75-0.79) was associated with lower mortality. Higher albumin level (RR ratio per additional unit, 0.63; 95% CI, 0.62-0.64), higher hemoglobin level (RR ratio per additional unit, 0.94; 95% CI, 0.94-0.94), higher sodium level (RR ratio per additional unit, 0.98; 95% CI, 0.98-0.99), and higher platelet count (RR ratio per additional 50 units, 0.95; 95% CI, 0.94-0.95) decreased risk for mortality.
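
Per-unit ratios such as those above are easy to misread when the clinical difference of interest spans several units. As a rough guide, and assuming the per-unit ratio compounds multiplicatively (as it would for a coefficient on the log scale, which is an assumption here rather than something the text states), a 10-year age difference at 1.04 per year corresponds to a ratio of about 1.48. The helper name `scaled_ratio` is hypothetical.

```python
def scaled_ratio(per_unit_ratio, units):
    """Ratio implied for a difference of `units`, assuming the per-unit ratio
    compounds multiplicatively (the effect is linear on the log scale)."""
    return per_unit_ratio ** units


# Per-unit ratios quoted in the text above.
age_10_years = scaled_ratio(1.04, 10)   # age, per 1-year increase
platelets_100 = scaled_ratio(0.95, 2)   # platelets, per additional 50 units
print(round(age_10_years, 2), round(platelets_100, 2))
```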

    Comparative Analysis of CiMM and MELD-Na

    Table 3 compares the CiMM’s performance with that of the MELD-Na score. The AUCs with the CiMM were 0.78 (95% CI, 0.77-0.79) for 1-year mortality, 0.76 (95% CI, 0.75-0.77) for 2-year mortality, and 0.72 (95% CI, 0.71-0.73) for 3-year mortality. The corresponding AUCs for MELD-Na were 0.67 (95% CI, 0.66-0.68) for 1-year mortality, 0.65 (95% CI, 0.64-0.66) for 2-year mortality, and 0.61 (95% CI, 0.60-0.62) for 3-year mortality. The difference was significant for each discrete year (P < .001) and remained so after false discovery rate adjustment (DeLong z = 17.00). The Brier scores for the MELD-Na predictions ranged from 0.08 (95% CI, 0.07-0.08) to 0.13 (95% CI, 0.12-0.13). Across all ranges of false-positive rates (AUC), the mean sensitivity of the CiMM was 10% to 11% higher than that for the MELD-Na score for all years.

    Risk Scoring and Risk Stratification

    eTable 3 in the Supplement shows the predicted risk of death across different patient profiles. For example, a 55-year-old patient with a serum bilirubin level of 2 mg/dL (to convert to μmol/L, multiply by 17.104), platelet count of 150 × 10³/μL, albumin and hemoglobin values within the reference range, ascites but no other cirrhosis complications, and a history of myocardial infarction (CirCom 1) has a 12% risk of death within 1 year. In contrast, a 65-year-old patient with a bilirubin level of 4 mg/dL, platelet count of 150 × 10³/μL, albumin level of 2.5 g/dL (to convert to g/L, multiply by 10.0), hemoglobin level of 10.5 g/dL (to convert to g/L, multiply by 10.0), ascites, chronic kidney disease, and a hematological cancer (CirCom 3 + 1) has a 66% risk of death within 1 year. eTable 4 in the Supplement shows the scoring intercepts and β coefficients that can be used to predict mortality in different patients with cirrhosis. An interactive CiMM scoring application (http://cimm.herokuapp.com/main) is available.
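
Applying intercepts and β coefficients of the kind described for eTable 4 could look roughly like the sketch below. It assumes a logistic form with one intercept per follow-up year, which is an assumption about the model layout. Every number in it is a hypothetical placeholder, not a value from eTable 4, so the printed risks have no clinical meaning; the function and variable names are likewise illustrative.

```python
import math


def annual_risk(intercept, coefficients, values):
    """Inverse-logit of intercept + sum(beta * x): the predicted probability
    of death in one year, conditional on being alive at its start."""
    linear = intercept + sum(coefficients[name] * values[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-linear))


def cumulative_risk(annual_risks):
    """Combine conditional annual risks into risk of death by the final year."""
    alive = 1.0
    for risk in annual_risks:
        alive *= 1.0 - risk
    return 1.0 - alive


# HYPOTHETICAL numbers for illustration only; they are NOT the eTable 4 values.
intercepts = {1: -3.0, 2: -2.6}           # one intercept per follow-up year
betas = {"age": 0.03, "ascites": 0.30}    # coefficients on the log-odds scale
patient = {"age": 55, "ascites": 1}

risks = [annual_risk(intercepts[year], betas, patient) for year in (1, 2)]
print([round(r, 3) for r in risks], round(cumulative_risk(risks), 3))
```

Keeping the coefficients in a plain mapping mirrors how a published coefficient table would be transcribed, which is part of what makes a score of this kind straightforward to audit.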

    Discussion

    Better understanding of prognosis can frame patients’ preferences, help prioritize goals of care,39 and inform decision-making across many medical conditions. Machine learning models have greatly enhanced the accuracy of such predictions, but their black box analytics have limited their usefulness. The most useful models would combine high predictive accuracy with the transparency and easy measurability of more traditional risk scores. Using cirrhosis as a test case, we found that a simple machine learning method enabled us to select electronic health record variables for a new blended CiMM. The CiMM had accuracy similar to that of less interpretable machine learning algorithms yet higher accuracy than the traditional MELD-Na severity score. The CiMM was more transparent than the machine learning models that helped select the variables and could be more easily applied in a variety of point-of-care clinical informatics infrastructures. Although we tested this approach in cirrhosis, it holds promise for improving prognostication across other medical conditions.

    Multiple risk scores allow clinicians to estimate the risk of mortality in cirrhosis. The MELD-Na score is one of the most commonly used. Although originally developed to predict 90-day mortality, both primary care clinicians and specialists use it for general prognostication as well. The MELD-Na score had modest discriminative ability (AUC, 0.67) in our analysis. Our data show that relying on the MELD-Na score for prognostication beyond 90 days may not be ideal. Indeed, the 90-day window may be too short for most patients with cirrhosis, except for those on the liver transplant waiting list. For a risk prediction model to be clinically meaningful, it should predict events early enough to influence decisions and outcomes.40 The CiMM was able to accurately identify individuals at high risk of mortality at 1, 2, and 3 years (AUCs, 0.78, 0.76, and 0.71, respectively).

    The variables selected by simple machine learning were consistent with those in the published literature, providing convergent validity to our results. However, some of these variables might not have been prioritized in alternative model-building strategies. One such example is the comorbidity (CirCom) score, which emerged as a strong predictor of mortality in this cohort of patients with advanced liver disease. These variables, including comorbidity, are easily available in electronic health records, rendering it feasible to implement the model at the patient and population levels. The CiMM can be incorporated into electronic health records from EPIC, Cerner Corporation, and others to easily automate and display prediction scores at the individual patient level. The CiMM can also be incorporated into population dashboards as part of quality improvement strategies. Within these population management systems, CiMM could identify high-risk patients for linkage to care, vigilant surveillance, and proactive care coordination. Matching the interventions with patients’ risk of mortality may allow more tailored approaches rather than 1-size-fits-all strategies,41 enhancing the overall effectiveness of quality improvement initiatives.

    Limitations

    Our study has several limitations. We could not adjust for some factors, such as patients’ compliance with treatment, because these variables are difficult to operationalize from electronic health records. Some data were missing, but the rate of missingness was low, and missing values were imputed with the predicted values from a series of random-forest tree ensembles.24 Our study is limited to patients seen in the VA, most of whom were older men. Inclusion of a large racially and geographically diverse patient population may enhance generalizability, although future studies are needed to examine the external validity of CiMM in women and nonveterans before widespread deployment. The CiMM estimates prognosis in patients with cirrhosis and not the prognosis of cirrhosis per se. However, all-cause mortality is important in and of itself and represents an outcome that is most meaningful to patients with cirrhosis.42

    Conclusions

    The promise of machine learning–based medical prognostication has been limited by implementation and interpretation challenges. We found that machine learning can help select important variables for more transparent risk scores while maintaining high rates of accuracy. The resultant blended CiMM performed better than the widely used MELD-Na score. If confirmed in other conditions, this blended approach could improve data-driven risk prognostication through the development of new scores that are more transparent and more actionable than machine learning and more predictive than traditional risk scores.

    Article Information

    Accepted for Publication: September 1, 2020.

    Published: November 3, 2020. doi:10.1001/jamanetworkopen.2020.23780

    Open Access: This is an open access article distributed under the terms of the CC-BY License. © 2020 Kanwal F et al. JAMA Network Open.

    Corresponding Author: Fasiha Kanwal, MD, MSHS, Department of Medicine, 7200 Cambridge St, BCM 901, McNair Building, Ste A10-198, Houston, TX 77030 (kanwal@bcm.edu).

    Author Contributions: Dr Kanwal had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

    Concept and design: Kanwal, Taylor, Kramer, El-Serag, Naik, Asch.

    Acquisition, analysis, or interpretation of data: Kanwal, Taylor, Kramer, Cao, Smith, Gifford, El-Serag, Asch.

    Drafting of the manuscript: Kanwal, Taylor, Cao.

    Critical revision of the manuscript for important intellectual content: Kanwal, Taylor, Kramer, Smith, Gifford, El-Serag, Naik, Asch.

    Statistical analysis: Taylor, Cao, Asch.

    Obtained funding: Kanwal, Taylor, Smith, Gifford, Naik.

    Administrative, technical, or material support: Kanwal, Taylor, Smith, Gifford.

    Supervision: Kanwal, Kramer, Naik, Asch.

    Conflict of Interest Disclosures: Dr Kanwal reported receiving grants from Veterans Health Administration during the conduct of the study. Dr Kramer reported receiving grants from US Department of Veterans Affairs (VA) during the conduct of the study. Dr Naik reported receiving grants from the VA during the conduct of the study. No other disclosures were reported.

    Funding/Support: This material is based on work supported by Investigator Initiated Research Award I01 HX002204-01 from the US VA Health Services Research and Development Service. The study was also supported in part by contract CIN 13-413 from the Veterans Administration Center for Innovations in Quality, Effectiveness and Safety.

    Role of the Funder/Sponsor: The sponsors had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

    References
    1.
    Lip GY, Nieuwlaat R, Pisters R, Lane DA, Crijns HJ. Refining clinical risk stratification for predicting stroke and thromboembolism in atrial fibrillation using a novel risk factor-based approach: the Euro Heart Survey on Atrial Fibrillation. Chest. 2010;137(2):263-272. doi:10.1378/chest.09-1584
    2.
    Friberg L, Rosenqvist M, Lip GY. Evaluation of risk stratification schemes for ischaemic stroke and bleeding in 182 678 patients with atrial fibrillation: the Swedish Atrial Fibrillation cohort study. Eur Heart J. 2012;33(12):1500-1510. doi:10.1093/eurheartj/ehr488
    3.
    American Heart Association. Heart risk calculator. Published 2013. Accessed June 2, 2020. http://www.cvriskcalculator.com/
    4.
    Capuzzo M, Valpondi V, Sgarbi A, et al. Validation of severity scoring systems SAPS II and APACHE II in a single-center population. Intensive Care Med. 2000;26(12):1779-1785. doi:10.1007/s001340000715
    5.
    Rudin C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence. 2019;1(5):206-215. doi:10.1038/s42256-019-0048-x
    6.
    Kamath PS, Wiesner RH, Malinchoc M, et al. A model to predict survival in patients with end-stage liver disease. Hepatology. 2001;33(2):464-470. doi:10.1053/jhep.2001.22172
    7.
    Kaplan DE, Dai F, Aytaman A, et al. Development and performance of an algorithm to estimate the Child-Turcotte-Pugh Score from a national electronic healthcare database. Clin Gastroenterol Hepatol. 2015;13(13):2333-2341.e6. doi:10.1016/j.cgh.2015.07.010
    8.
    Sarmast N, Ogola GO, Kouznetsova M, et al. Model for end-stage liver disease-lactate and prediction of inpatient mortality in patients with chronic liver disease. Hepatology. 2020. doi:10.1002/hep.31199
    9.
    Mahmud N, Hubbard RA, Kaplan DE, Taddei TH, Goldberg DS. Risk prediction scores for acute on chronic liver failure development and mortality. Liver Int. 2020;40(5):1159-1167. doi:10.1111/liv.14328
    10.
    Koola JD, Ho S, Chen G, et al. Development of a national Department of Veterans Affairs mortality risk prediction model among patients with cirrhosis. BMJ Open Gastroenterol. 2019;6(1):e000342. doi:10.1136/bmjgast-2019-000342
    11.
    Buchanan PM, Kramer JR, El-Serag HB, et al. The quality of care provided to patients with varices in the department of Veterans Affairs. Am J Gastroenterol. 2014;109(7):934-940. doi:10.1038/ajg.2013.487
    12.
    Kanwal F, Kramer JR, Buchanan P, et al. The quality of care provided to patients with cirrhosis and ascites in the Department of Veterans Affairs. Gastroenterology. 2012;143(1):70-77. doi:10.1053/j.gastro.2012.03.038
    13.
    Sohn MW, Arnold N, Maynard C, Hynes DM. Accuracy and completeness of mortality data in the Department of Veterans Affairs. Popul Health Metr. 2006;4:2. doi:10.1186/1478-7954-4-2
    14.
    Wang L, Porter B, Maynard C, et al. Predicting risk of hospitalization or death among patients receiving primary care in the Veterans Health Administration. Med Care. 2013;51(4):368-373. doi:10.1097/MLR.0b013e31827da95a
    15.
    Kanwal F, Kramer JR, Ilyas J, Duan Z, El-Serag HB. HCV genotype 3 is associated with an increased risk of cirrhosis and hepatocellular cancer in a national sample of US veterans with HCV. Hepatology. 2014;60(1):98-105. doi:10.1002/hep.27095
    16.
    Kruse RL, Kramer JR, Tyson GL, et al. Clinical outcomes of hepatitis B virus coinfection in a United States cohort of hepatitis C virus–infected patients. Hepatology. 2014;60(6):1871-1878. doi:10.1002/hep.27337
    17.
    El-Serag HB, Kanwal F, Richardson P, Kramer J. Risk of hepatocellular carcinoma after sustained virological response in veterans with hepatitis C virus infection. Hepatology. 2016;64(1):130-137. doi:10.1002/hep.28535
    18.
    Beste LA, Leipertz SL, Green PK, Dominitz JA, Ross D, Ioannou GN. Trends in burden of cirrhosis and hepatocellular carcinoma by underlying liver disease in US veterans, 2001-2013. Gastroenterology. 2015;149(6):1471-1482.e5. doi:10.1053/j.gastro.2015.07.056
    19.
    Volk ML, Tocco RS, Bazick J, Rakoski MO, Lok AS. Hospital readmissions among patients with decompensated cirrhosis. Am J Gastroenterol. 2012;107(2):247-252. doi:10.1038/ajg.2011.314
    20.
    Bruno S, Saibeni S, Bagnardi V, et al; AISF (Italian Association for the Study of the Liver)–EPA-SCO Collaborative Study Group. Mortality risk according to different clinical characteristics of first episode of liver decompensation in cirrhotic patients: a nationwide, prospective, 3-year follow-up study in Italy. Am J Gastroenterol. 2013;108(7):1112-1122. doi:10.1038/ajg.2013.110
    21.
    Jepsen P, Vilstrup H, Lash TL. Development and validation of a comorbidity scoring system for patients with cirrhosis. Gastroenterology. 2014;146(1):147-156. doi:10.1053/j.gastro.2013.09.019
    22.
    Singal AG, Rahimi RS, Clark C, et al. An automated model using electronic medical record data identifies patients with cirrhosis at high risk for readmission. Clin Gastroenterol Hepatol. 2013;11(10):1335-1341.e1. doi:10.1016/j.cgh.2013.03.022
    23.
    Stekhoven DJ, Bühlmann P. MissForest–non-parametric missing value imputation for mixed-type data. Bioinformatics. 2012;28(1):112-118. doi:10.1093/bioinformatics/btr597
    24.
    Shah AD, Bartlett JW, Carpenter J, Nicholas O, Hemingway H. Comparison of random forest and parametric imputation models for imputing missing data using MICE: a CALIBER study. Am J Epidemiol. 2014;179(6):764-774. doi:10.1093/aje/kwt312
    25.
    Eleuteri A, Aung MSH, Taktak AFG, Damato B, Lisboa PJG. Continuous and Discrete Time Survival Analysis: Neural Network Approaches. Paper presented at: 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society; October 22, 2007; Lyon, France.
    26.
    Kretowska M. Oblique survival trees in discrete event time analysis. IEEE J Biomed Health Inform. 2020;24(1):247-258. doi:10.1109/JBHI.2019.2908773
    27.
    Singer JD, Willett JB. It’s about time: using discrete-time survival analysis to study duration and the timing of events. J Educ Stat. 1993;18(2):155-195.
    28.
    Alexander BM, Schoenfeld JD, Trippa L. Hazards of hazard ratios: deviations from model assumptions in immunotherapy. N Engl J Med. 2018;378(12):1158-1159. doi:10.1056/NEJMc1716612
    29.
    Hernán MA. The hazards of hazard ratios. Epidemiology. 2010;21(1):13-15. doi:10.1097/EDE.0b013e3181c1ea43
    30.
    Natekin A, Knoll A. Gradient boosting machines, a tutorial. Front Neurorobot. 2013;7(21):21.
    31.
    Chen T, Guestrin C. XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; August 13, 2016; San Francisco, CA.
    32.
    Muthukrishnan R, Rohini R. LASSO: a feature selection technique in predictive modeling for machine learning. Paper presented at: 2016 IEEE International Conference on Advances in Computer Applications (ICACA); October 24, 2016; Coimbatore, India.
    33.
    Miller GA. The magical number seven plus or minus two: some limits on our capacity for processing information. Psychol Rev. 1956;63(2):81-97. doi:10.1037/h0043158
    34.
    Etzioni R, Pepe M, Longton G, Hu C, Goodman G. Incorporating the time dimension in receiver operating characteristic curves: a case study of prostate cancer. Med Decis Making. 1999;19(3):242-251. doi:10.1177/0272989X9901900303
    35.
    Zheng Y, Heagerty PJ. Semiparametric estimation of time-dependent ROC curves for longitudinal marker data. Biostatistics. 2004;5(4):615-632. doi:10.1093/biostatistics/kxh013
    36.
    Walsh CG, Sharman K, Hripcsak G. Beyond discrimination: a comparison of calibration methods and clinical usefulness of predictive models of readmission risk. J Biomed Inform. 2017;76:9-18. doi:10.1016/j.jbi.2017.10.008
    37.
    Rufibach K. Use of Brier score to assess binary predictions. J Clin Epidemiol. 2010;63(8):938-939. doi:10.1016/j.jclinepi.2009.11.009
    38.
    Zhang J, Yu KF. What’s the relative risk? a method of correcting the odds ratio in cohort studies of common outcomes. JAMA. 1998;280(19):1690-1691. doi:10.1001/jama.280.19.1690
    39.
    Temel JS, Greer JA, Admane S, et al. Longitudinal perceptions of prognosis and goals of therapy in patients with metastatic non–small-cell lung cancer: results of a randomized study of early palliative care. J Clin Oncol. 2011;29(17):2319-2326. doi:10.1200/JCO.2010.32.4459
    40.
    Chen JH, Asch SM. Machine learning and prediction in medicine: beyond the peak of inflated expectations. N Engl J Med. 2017;376(26):2507-2509. doi:10.1056/NEJMp1702071
    41.
    Naik AD, Arney J, Clark JA, et al. Integrated model for patient-centered advanced liver disease care. Clin Gastroenterol Hepatol. 2020;18(5):1015-1024. doi:10.1016/j.cgh.2019.07.043
    42.
    Kanwal F, Tapper EB, Ho C, et al. Development of quality measures in cirrhosis by the Practice Metrics Committee of the American Association for the Study of Liver Diseases. Hepatology. 2019;69(4):1787-1797. doi:10.1002/hep.30489