Figure.  Comparison of the Performance of 3 Modeling Approaches Using Prehospitalization, Preoperative, and Perioperative Data for Acute Kidney Injury

Logistic regression with elastic net selection (A), random forest (B), and gradient boosting machine (C) methods used for modeling. The cyan line is the model containing prehospitalization variables. The orange line is the model using preoperative variables (including prehospitalization variables). The navy line is the model using perioperative data (including preoperative and prehospitalization variables). Receiver operating characteristic curves for each model using prehospitalization, preoperative, and perioperative variable groups are shown for the test set. The area under the curve (AUC), or C statistic, is calculated along with 95% CIs. The DeLong et al28 test indicates a significant difference between model AUCs (P < .001).

Table 1.  Patient Characteristics in the Model Derivation, Validation, and Test Sets
Table 2.  Clinical Outcomes in the Model Derivation, Validation, and Test Sets
Table 3.  Acute Kidney Injury Risk as Predicted by Models That Add and Do Not Add Intraoperative Data in Test Data Set
Table 4.  Acute Kidney Injury Risk Stratification in Test Data Set and Rates of Clinical Outcomes by Variable Group
1. Grams ME, Sang Y, Coresh J, et al. Acute kidney injury after major surgery: a retrospective analysis of Veterans Health Administration data. Am J Kidney Dis. 2016;67(6):872-880. doi:10.1053/j.ajkd.2015.07.022
2. Cho E, Kim SC, Kim MG, Jo SK, Cho WY, Kim HK. The incidence and risk factors of acute kidney injury after hepatobiliary surgery: a prospective observational study. BMC Nephrol. 2014;15:169. doi:10.1186/1471-2369-15-169
3. Saran R, Robinson B, Abbott KC, et al. US Renal Data System 2017 annual data report: epidemiology of kidney disease in the United States. Am J Kidney Dis. 2018;71(3)(suppl 1):A7. doi:10.1053/j.ajkd.2018.01.002
4. Biteker M, Dayan A, Tekkeşin AI, et al. Incidence, risk factors, and outcomes of perioperative acute kidney injury in noncardiac and nonvascular surgery. Am J Surg. 2014;207(1):53-59. doi:10.1016/j.amjsurg.2013.04.006
5. National Hospital Discharge Survey: 2010 table, Procedures by selected patient characteristics. https://www.cdc.gov/nchs/data/nhds/4procedures/2010pro4_numberprocedureage.pdf. Published 2010. Accessed October 1, 2018.
6. Hosmer DW, Lemeshow S. Applied Logistic Regression. 2nd ed. New York, NY: Wiley; 2000:160-164. doi:10.1002/0471722146
7. Kate RJ, Perez RM, Mazumdar D, Pasupathy KS, Nilakantan V. Prediction and detection models for acute kidney injury in hospitalized older adults. BMC Med Inform Decis Mak. 2016;16:39. doi:10.1186/s12911-016-0277-4
8. Huang C, Murugiah K, Mahajan S, et al. Enhancing the prediction of acute kidney injury risk after percutaneous coronary intervention using machine learning techniques: a retrospective cohort study. PLoS Med. 2018;15(11):e1002703. doi:10.1371/journal.pmed.1002703
9. Koyner JL, Carey KA, Edelson DP, Churpek MM. The development of a machine learning inpatient acute kidney injury prediction model. Crit Care Med. 2018;46(7):1070-1077. doi:10.1097/CCM.0000000000003123
10. Wu L, Hu Y, Liu X, et al. Feature ranking in predictive models for hospital-acquired acute kidney injury. Sci Rep. 2018;8(1):17298. doi:10.1038/s41598-018-35487-0
11. Kheterpal S, Tremper KK, Englesbe MJ, et al. Predictors of postoperative acute renal failure after noncardiac surgery in patients with previously normal renal function. Anesthesiology. 2007;107(6):892-902. doi:10.1097/01.anes.0000290588.29668.38
12. Kheterpal S, Tremper KK, Heung M, et al. Development and validation of an acute kidney injury risk index for patients undergoing general surgery: results from a national data set. Anesthesiology. 2009;110(3):505-515. doi:10.1097/ALN.0b013e3181979440
13. Lee CK, Hofer I, Gabel E, Baldi P, Cannesson M. Development and validation of a deep neural network model for prediction of postoperative in-hospital mortality. Anesthesiology. 2018;129(4):649-662. doi:10.1097/ALN.0000000000002186
14. Freundlich RE, Kheterpal S. Perioperative effectiveness research using large databases. Best Pract Res Clin Anaesthesiol. 2011;25(4):489-498. doi:10.1016/j.bpa.2011.08.008
15. Hastie T, Tibshirani R, Friedman JH. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York, NY: Springer; 2009. doi:10.1007/978-0-387-84858-7
16. Collins GS, Reitsma JB, Altman DG, Moons KGM. Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD): the TRIPOD statement. Ann Intern Med. 2015;162(1):55-63. doi:10.7326/M14-0697
17. American Medical Association. Current Procedural Terminology. 2018. https://www.ama-assn.org/practice-management/cpt-current-procedural-terminology. Accessed October 1, 2018.
18. Surgery Flags Software for ICD-9-CM. https://www.hcup-us.ahrq.gov/toolssoftware/surgflags/surgeryflags.jsp. Updated August 7, 2019. Accessed November 11, 2018.
19. Khwaja A. KDIGO clinical practice guidelines for acute kidney injury. Nephron Clin Pract. 2012;120(4):c179-c184. doi:10.1159/000339789
20. Md Ralib A, Pickering JW, Shaw GM, Endre ZH. The urine output definition of acute kidney injury is too liberal. Crit Care. 2013;17(3):R112. doi:10.1186/cc12784
21. Adhikari L, Ozrazgat-Baslanti T, Ruppert M, et al. Improved predictive models for acute kidney injury with IDEA: Intraoperative Data Embedded Analytics. PLoS One. 2019;14(4):e0214904. doi:10.1371/journal.pone.0214904
22. James MT, Pannu N, Hemmelgarn BR, et al. Derivation and external validation of prediction models for advanced chronic kidney disease following acute kidney injury. JAMA. 2017;318(18):1787-1797. doi:10.1001/jama.2017.16326
23. Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005;43(11):1130-1139. doi:10.1097/01.mlr.0000182534.19832.83
24. Keats AS. The ASA classification of physical status—a recapitulation. Anesthesiology. 1978;49(4):233-236. doi:10.1097/00000542-197810000-00001
25. Healthcare Cost and Utilization Project. Clinical classifications software for services and procedures. 2017. https://www.hcup-us.ahrq.gov/toolssoftware/ccs_svcsproc/ccssvcproc.jsp. Accessed October 1, 2018.
26. Jones MP. Indicator and stratification methods for missing explanatory variables in multiple linear regression. J Am Stat Assoc. 1996;91(433):222-230. doi:10.1080/01621459.1996.10476680
27. Meurer WJ, Tolles J. Logistic regression diagnostics: understanding how well a model predicts outcomes. JAMA. 2017;317(10):1068-1069. doi:10.1001/jama.2016.20441
28. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837-845. doi:10.2307/2531595
29. Halbesma N, Jansen DF, Heymans MW, Stolk RP, de Jong PE, Gansevoort RT; PREVEND Study Group. Development and validation of a general population renal risk score. Clin J Am Soc Nephrol. 2011;6(7):1731-1738. doi:10.2215/CJN.08590910
30. van der Laan MJ, Polley EC, Hubbard AE. Super learner. Stat Appl Genet Mol Biol. 2007;6:e25. doi:10.2202/1544-6115.1309
31. Bellomo R, Ronco C, Kellum JA, Mehta RL, Palevsky P; Acute Dialysis Quality Initiative workgroup. Acute renal failure—definition, outcome measures, animal models, fluid therapy and information technology needs: the Second International Consensus Conference of the Acute Dialysis Quality Initiative (ADQI) Group. Crit Care. 2004;8(4):R204-R212. doi:10.1186/cc2872
32. Mehta RL, Kellum JA, Shah SV, et al; Acute Kidney Injury Network. Acute Kidney Injury Network: report of an initiative to improve outcomes in acute kidney injury. Crit Care. 2007;11(2):R31. doi:10.1186/cc5713
Original Investigation
Nephrology
December 6, 2019

Risk Stratification for Postoperative Acute Kidney Injury in Major Noncardiac Surgery Using Preoperative and Intraoperative Data

Author Affiliations
  • 1Department of Medical Ethics and Health Policy, University of Pennsylvania Perelman School of Medicine, Philadelphia
  • 2Leonard Davis Institute of Health Economics, University of Pennsylvania Perelman School of Medicine, Philadelphia
  • 3Predictive Healthcare, University of Pennsylvania Health System, Philadelphia
  • 4University of Pennsylvania, Philadelphia
  • 5Department of Anesthesiology and Critical Care, University of Pennsylvania Health System, Philadelphia
  • 6Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia
  • 7The Wharton School, University of Pennsylvania, Philadelphia
  • 8Corporal Michael J. Cresencz Veterans Affairs Medical Center, Department of Veterans Affairs, Philadelphia, Pennsylvania
JAMA Netw Open. 2019;2(12):e1916921. doi:10.1001/jamanetworkopen.2019.16921
Key Points

Question  Is adding preoperative and intraoperative data associated with improved risk stratification of patients undergoing noncardiac surgery for postoperative acute kidney injury?

Findings  In this prognostic study of 42 615 patients who underwent noncardiac surgery, the addition of preoperative to prehospitalization data improved model performance (area under the curve increased from 0.71 to 0.80) as did adding preoperative plus intraoperative data (area under the curve further increased to 0.82).

Meaning  Although electronic health record data may be used to accurately stratify patients at risk of postoperative acute kidney injury, there appears to be only modest improvement in performance when adding intraoperative data to risk stratification models.

Abstract

Importance  Acute kidney injury (AKI) is one of the most common complications after noncardiac surgery. Yet current postoperative AKI risk stratification models have substantial limitations, such as limited use of perioperative data.

Objective  To examine whether adding preoperative and intraoperative data is associated with improved prediction of noncardiac postoperative AKI.

Design, Setting, and Participants  A prognostic study using logistic regression with elastic net selection, gradient boosting machine (GBM), and random forest approaches was conducted at 4 tertiary academic hospitals in the United States. A total of 42 615 hospitalized adults with serum creatinine measurements who underwent major noncardiac surgery between January 1, 2014, and April 30, 2018, were included in the study. Serum creatinine measurements from 365 days before and 7 days after surgery were used in this study.

Main Outcomes and Measures  Postoperative AKI (defined by the Kidney Disease Improving Global Outcomes criteria, within 7 days after surgery) was the primary outcome. The area under the receiver operating characteristic curve (AUC) was used to assess discrimination.

Results  Among 42 615 patients who underwent noncardiac surgery, the mean (SD) age was 57.9 (15.7) years, 23 943 (56.2%) were women, 27 857 (65.4%) were white, and the most frequent surgery types were orthopedic (15 718 [36.9%]), general (8808 [20.7%]), and neurologic (6564 [15.4%]). The rate of postoperative AKI was 10.1% (n = 4318). The progressive addition of clinical data improved model performance across all modeling approaches, with GBM providing the highest discrimination by AUC. In GBM models, the AUC increased from 0.712 (95% CI, 0.694-0.731) using prehospitalization variables to 0.804 (95% CI, 0.788-0.819) using preoperative variables (inclusive of prehospitalization variables) (P < .001 for AUC comparison). The AUC further increased to 0.817 (95% CI, 0.802-0.832) when adding intraoperative variables (P < .001 for comparison vs model using preoperative variables). However, the statistically significant improvements in discrimination did not appear to be clinically significant. In particular, the AKI rate among patients classified as high risk improved from 29.1% to 30.0%, a net of 15 patients were appropriately reclassified as high risk, and an additional 15 patients were appropriately reclassified as low risk.

Conclusions and Relevance  The findings of the study suggest that electronic health record data may be used to accurately stratify patients at risk of perioperative AKI, but the modest improvements from adding intraoperative data should be weighed against challenges in using intraoperative data.

Introduction

Acute kidney injury (AKI) is a common postoperative complication, occurring in 12% of patients undergoing surgical procedures,1 that has been associated with poor clinical outcomes, including the development of chronic kidney disease, increased health care use, and death.2,3 Because of evidence describing the association of AKI with mortality,4 there has been heightened interest in improved risk stratification for postoperative AKI among the 40 million patients undergoing noncardiac surgery in the United States annually.5 To our knowledge, no consensus risk stratification algorithms or tools exist either before or after surgery. Improving risk stratification may be helpful for preoperative and perioperative management in the setting of noncardiac surgery.

Existing models to predict AKI provide moderate6 levels of accuracy,7-10 although they have not used consistent definitions of the AKI outcome, have used a mix of statistical and machine learning approaches, and have not uniformly focused on noncardiac surgery. For example, large studies of AKI after general or other noncardiac surgery demonstrated moderate predictive accuracy (eg, area under the receiver operating characteristic curve [AUC], 0.73-0.80), but predated current consensus standards on AKI definition.11,12 The lack of common definitions and methods underscores the need to compare performance across these various approaches. Furthermore, while some studies have used data from the electronic health record (EHR), they have not incorporated detailed physiological and clinical data (eg, vital signs, dosages of vasopressor medications, blood loss) collected intraoperatively. Because adding such data improves risk stratification for other postoperative complications,13 these data may also yield improvements in risk stratification for AKI.

In this study, we examined whether adding intraoperative data was associated with improved prediction of noncardiac postoperative AKI compared with models using administrative and preoperative clinical information alone. Furthermore, we compared performance across multiple statistical and machine learning approaches and definitions of AKI.

Methods
Study Data

Electronic health record data were collected on adult patients undergoing noncardiac surgery during an inpatient admission between January 1, 2014, and April 30, 2018, at the University of Pennsylvania Health System. We used code developed by the Multicenter Perioperative Outcomes Group that was run on University of Pennsylvania Health System Epic Clarity databases to standardize intraoperative and postoperative data and combined the data with administrative and preoperative data.14 Cohort data were randomly split by patient into derivation (60%), validation (20%), and test (20%) sets.15
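The patient-level split described above can be sketched as follows. This is a minimal illustration, not the study code: the function name `split_by_patient`, the seed, and the toy patient identifiers are assumptions. The point it shows is that the random assignment is made per patient, so every encounter of a given patient lands in the same set.

```python
import random

def split_by_patient(patient_ids, seed=0, fractions=(0.6, 0.2, 0.2)):
    """Assign each unique patient to the derivation, validation, or test set."""
    unique_ids = sorted(set(patient_ids))
    rng = random.Random(seed)          # fixed seed only for reproducibility
    rng.shuffle(unique_ids)
    n = len(unique_ids)
    n_deriv = int(round(fractions[0] * n))
    n_valid = int(round(fractions[1] * n))
    assignment = {}
    for i, pid in enumerate(unique_ids):
        if i < n_deriv:
            assignment[pid] = "derivation"
        elif i < n_deriv + n_valid:
            assignment[pid] = "validation"
        else:
            assignment[pid] = "test"
    return assignment

# Hypothetical data: 150 encounters from 100 patients (some patients repeat).
encounter_patients = [f"P{i % 100}" for i in range(150)]
assignment = split_by_patient(encounter_patients)
encounter_sets = [assignment[p] for p in encounter_patients]
```

Splitting on the patient identifier rather than on the encounter avoids having the same patient contribute to both model fitting and model evaluation.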

The University of Pennsylvania Institutional Review Board approved the study design and granted a waiver of informed consent from study participants for secondary use of electronic health records. This study follows the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) reporting guideline.16

Study Population

Patients 18 years or older across 4 academic medical centers in the University of Pennsylvania Health System during the study period were included if they underwent major noncardiac surgery. We identified noncardiac surgery using primary Current Procedural Terminology codes (10021-32999, 34001-69990)17 and restricted to major therapeutic procedures using the Agency for Healthcare Research and Quality Healthcare Cost and Utilization Project Surgery Flag Software.18 We focused on noncardiac surgery because the association between preoperative and intraoperative variables and AKI likely differs for cardiac surgery owing to the use of cardiopulmonary bypass.
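As a rough illustration of the code-range step only, the filter on primary Current Procedural Terminology codes might look like the sketch below. The helper name `is_noncardiac_cpt` and the sample codes are hypothetical, and the Surgery Flag restriction to major therapeutic procedures is not reproduced here.

```python
# CPT ranges quoted in the text; everything else is an illustrative assumption.
NONCARDIAC_CPT_RANGES = ((10021, 32999), (34001, 69990))

def is_noncardiac_cpt(code: str) -> bool:
    """Return True when a 5-digit numeric CPT code falls in either range."""
    if len(code) != 5 or not code.isdigit():
        return False  # alphanumeric codes are treated as out of scope here
    value = int(code)
    return any(low <= value <= high for low, high in NONCARDIAC_CPT_RANGES)

# Sample codes chosen only to exercise the boundaries and the excluded gap.
codes = ["27447", "33533", "10021", "69990", "0042T"]
kept = [c for c in codes if is_noncardiac_cpt(c)]
```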

Patients who underwent multiple major surgical procedures during the same visit were excluded (4249 [5.4%] of surgical cases) to avoid overlap between preoperative and postoperative periods. In addition, patients were excluded if they did not have at least 1 preoperative and postoperative serum creatinine measurement (27 704 [35.5%] of surgical cases), had end-stage renal disease and underwent dialysis within the past year, had an elevated baseline serum creatinine level greater than or equal to 4.5 mg/dL (to convert to micromoles per liter, multiply by 88.4),9 or if they met criteria for AKI within the 7 days before surgery (additional details and billing codes in eMethods in the Supplement).

Outcomes

Our primary outcome was the incidence of AKI within 7 days after surgery. For our primary analyses, we used the Kidney Disease Improving Global Outcomes guidelines for stage 1 AKI, defined as an increase in serum creatinine level to at least 1.5 times baseline or an increase of at least 0.3 mg/dL within a 48-hour period.19 We excluded the urine output criteria owing to concerns about poor specificity for AKI classification20 and the lack of reliable data in our data set. If discharge occurred earlier than 7 days after surgery and there was no evidence of AKI to date, an outcome of no AKI was assigned. Secondary outcomes included use of inpatient dialysis, a postsurgical length of stay of 7 or more days (to reflect a prolonged postsurgical stay), and in-hospital mortality (eMethods in the Supplement).
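The creatinine-based outcome definition can be made concrete with a short sketch. The function name `has_stage1_aki` and the input layout are hypothetical, and one detail is an assumption rather than something the text specifies: the 0.3 mg/dL rise is checked here between pairs of postoperative measurements taken no more than 48 hours apart.

```python
HOURS_7_DAYS = 7 * 24

def has_stage1_aki(baseline, postop, eps=1e-9):
    """postop: list of (hours_after_surgery, creatinine_mg_dl) tuples."""
    # Keep only measurements inside the 7-day postoperative window.
    window = sorted((t, v) for t, v in postop if 0 <= t <= HOURS_7_DAYS)
    # Relative criterion: any value at least 1.5 times baseline.
    if any(v >= 1.5 * baseline - eps for _, v in window):
        return True
    # Absolute criterion (assumed pairing rule): a rise of at least
    # 0.3 mg/dL between two measurements no more than 48 hours apart.
    for i, (t_early, v_early) in enumerate(window):
        for t_late, v_late in window[i + 1:]:
            if t_late - t_early <= 48 and v_late - v_early >= 0.3 - eps:
                return True
    return False

relative_rise = has_stage1_aki(1.0, [(24, 1.6)])
absolute_rise = has_stage1_aki(1.0, [(12, 1.0), (40, 1.35)])
slow_rise = has_stage1_aki(1.0, [(12, 1.0), (100, 1.35)])
late_rise = has_stage1_aki(1.0, [(200, 1.6)])
```

A patient with no qualifying measurement in the window is returned as not having AKI, which mirrors the rule above for patients discharged before day 7.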

Baseline Kidney Function Assessment

Baseline values were defined first as the lowest serum creatinine measurement value and estimated glomerular filtration rate value within 7 days before the start of surgery21 or, if no values were present, the most recent value up to 365 days before the surgery.22
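The two-step baseline rule can be sketched for serum creatinine as below. The function name `baseline_creatinine` and the tuple layout are assumptions for illustration; the study's handling of estimated glomerular filtration rate is not reproduced.

```python
def baseline_creatinine(preop_values):
    """preop_values: list of (days_before_surgery, creatinine_mg_dl) tuples.

    Returns the lowest value from the 7 days before surgery; otherwise the
    most recent value from up to 365 days before; otherwise None.
    """
    recent = [v for d, v in preop_values if 0 <= d <= 7]
    if recent:
        return min(recent)
    older = [(d, v) for d, v in preop_values if 7 < d <= 365]
    if older:
        # Smallest days-before-surgery means the most recent measurement.
        return min(older, key=lambda pair: pair[0])[1]
    return None

within_week = baseline_creatinine([(3, 1.1), (5, 0.9), (40, 0.7)])
fallback = baseline_creatinine([(40, 0.7), (200, 1.2)])
no_baseline = baseline_creatinine([(400, 1.0)])
```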

Variables

The unit of observation was an inpatient hospitalization for noncardiac surgery. Variables were split into 3 groups reflecting increasing inclusiveness of data: prehospitalization, preoperative, and perioperative variables. Prehospitalization variables included age, sex, race, and insurance type. Historical comorbidities were also included, derived from International Classification of Diseases, Ninth Revision, Clinical Modification, and International Classification of Diseases, 10th Revision, Clinical Modification, diagnostic codes.23 Preoperative variables combined the prehospitalization variables with clinical information related to the patient’s admission but before surgery, such as laboratory measurements, American Society of Anesthesiologists physical status,24 and surgical procedure type. To categorize operations, we used the Agency for Healthcare Research and Quality Healthcare Cost and Utilization Project Clinical Classification Software to map each primary Current Procedural Terminology code to 1 of 244 unique procedure groups.25 Data for these variables were collected from the start of the admission up until the start of the surgical procedure. Perioperative variables added intraoperative data to preoperative variables. Intraoperative data included vital signs, such as heart rate and blood pressure; fluid status, such as total fluid administration and estimated blood loss; and drug use, such as vasopressors and intraoperative rescue medications (eg, calcium chloride). Data for this category were collected between the start and end of the surgical procedure using timestamps in the EHR (full list of variables reported in the eAppendix in the Supplement).

Missing Data on Variables

Because some variables contain data artifacts and extreme values, we set variables with values below the first percentile to the first percentile value and values greater than the 99th percentile to the 99th percentile value. After data cleaning, rates of missing data within observations ranged from 0.10% (ie, intraoperative heart rate) to 98.6% (ie, N-terminal pro b-type natriuretic peptide laboratory measurement) (eTable 1 in the Supplement). To avoid excluding observations that were missing data on predictor variables, we added dichotomous variables for each covariate that indicated whether an observation had a missing value. For observations with a missing indicator equal to 1, the missing covariate data were replaced with a fixed value.26 This approach allowed us to use a larger study sample while preserving information about present vs missing values. This approach is more flexible than general mean imputation and less stringent than the common missing-at-random assumption required in multiple imputation.
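A minimal sketch of the two preprocessing steps above, percentile clipping followed by a missing-value indicator, is shown below. The helper names, the linear-interpolation percentile, and the fill value of 0.0 are assumptions; the text states only that a fixed value was used.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)

def clip_and_flag(values, fill_value=0.0):
    """Clip observed values to the 1st-99th percentile range and return
    (cleaned_values, missing_indicator); None marks a missing value."""
    observed = sorted(v for v in values if v is not None)
    p1, p99 = percentile(observed, 1), percentile(observed, 99)
    cleaned, missing = [], []
    for v in values:
        if v is None:
            cleaned.append(fill_value)   # assumed fixed replacement value
            missing.append(1)
        else:
            cleaned.append(min(max(v, p1), p99))
            missing.append(0)
    return cleaned, missing

# Hypothetical heart rates with one missing value and two artifacts.
heart_rates = list(range(60, 160)) + [None, 400, 5]
cleaned, missing = clip_and_flag(heart_rates)
```

Both outputs would enter the model as predictors, so a tree-based learner can split on the indicator and treat the fill value separately from real measurements.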

Statistical Analysis

To examine improvements in predictive accuracy and risk stratification when adding more variables throughout the surgery encounter, we implemented models for each variable group (prehospitalization, preoperative, and perioperative) separately. We used 3 modeling approaches: logistic regression with elastic net selection, random forest, and gradient boosting machines (GBMs), which we applied to each definition of AKI. For random forest and GBM models, we used a randomized grid search with 3-fold cross-validation across 30 iterations on our derivation data set to select optimal model parameters. For GBMs, we used decision trees as the weak learner with a logistic loss function. Validation sets were used to evaluate, verify, and finalize our model parameters. Final model results are reported for the test sets of data only.
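The tuning procedure for the GBM can be sketched with Scikit-learn as follows. The synthetic data, the parameter grid, and the use of AUC as the search criterion are assumptions for illustration; the text specifies only a randomized search with 3 folds and 30 iterations on the derivation set. The separate validation set is omitted to keep the sketch short.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic stand-in for the cohort, with roughly 10% positive outcomes.
X, y = make_classification(n_samples=600, n_features=12, n_informative=5,
                           weights=[0.9, 0.1], random_state=0)
X_deriv, X_test, y_deriv, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Hypothetical search space (54 combinations, so 30 draws are distinct).
param_distributions = {
    "n_estimators": [20, 40, 60],
    "max_depth": [2, 3, 4],
    "learning_rate": [0.05, 0.1, 0.2],
    "subsample": [0.7, 1.0],
}

search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_distributions,
    n_iter=30,          # 30 sampled parameter settings
    cv=3,               # 3-fold cross-validation on the derivation set
    scoring="roc_auc",  # assumed selection criterion
    random_state=0,
)
search.fit(X_deriv, y_deriv)

# Discrimination of the refitted best model on held-out data.
test_auc = roc_auc_score(y_test, search.predict_proba(X_test)[:, 1])
```

Swapping `GradientBoostingClassifier` for `RandomForestClassifier`, with a grid suited to that estimator, would give the corresponding random forest search.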

Model Performance

We compared differences between the derivation, validation, and test data sets and reported results of model performance using the test data sets (20% of sample). Categorical variables were compared using χ2 tests and continuous variables were compared using Mann-Whitney tests. Model performance was assessed using the AUC,27 which we calculated by comparing the AKI risk estimated by the models with observed AKI. We calculated 95% CIs using the method of DeLong et al28 with 1000 bootstrapping samples to test for significance between models. We compared model performance within each of the 3 modeling approaches for each of the 3 groups of variables (reflecting the progressive addition of data), as well as across the 3 modeling approaches when using the same group of data elements.
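To make the discrimination metric concrete, the sketch below computes the AUC directly from its rank interpretation and attaches a percentile bootstrap interval. It is a simplified stand-in, not the DeLong procedure used in the study, and the function names and simulated scores are assumptions.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a randomly chosen case receives a higher
    score than a randomly chosen non-case (ties count as one-half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (resampling patients)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both outcome classes
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Simulated predictions: 20 cases and 180 non-cases.
rng = random.Random(1)
labels = [1] * 20 + [0] * 180
scores = [0.3 * y + rng.random() for y in labels]
point_estimate = auc(labels, scores)
lower, upper = bootstrap_auc_ci(labels, scores)
```

Comparing two correlated AUCs, as in the model comparisons reported here, requires a paired approach such as the DeLong test; this sketch covers only a single model's interval.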

Risk Stratification

To illustrate implications for clinical utility, we stratified patients as high and low risk for Kidney Disease Improving Global Outcomes AKI and compared incidence rates of our primary and secondary outcomes associated with AKI. Patients were stratified into a high-risk category if their predicted risk for AKI was in the top 20% of the test data set population (n = 8494),29 with the remaining 80% of patients stratified into a low-risk category. Risk stratification was conducted on prehospitalization, preoperative, and perioperative data sets, examined for primary and secondary outcomes, and examined by patient encounters with and without events.
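The top-20% rule can be sketched as a ranking step. The function name `stratify_top_fraction` and the placeholder risk values are hypothetical; only the test set size and the 20% threshold come from the text.

```python
def stratify_top_fraction(predicted_risks, fraction=0.20):
    """Label the highest-risk `fraction` of patients 'high', the rest 'low'."""
    n_high = int(round(fraction * len(predicted_risks)))
    order = sorted(range(len(predicted_risks)),
                   key=lambda i: predicted_risks[i], reverse=True)
    high = set(order[:n_high])
    return ["high" if i in high else "low"
            for i in range(len(predicted_risks))]

def event_rate(outcomes, groups, group):
    """Observed outcome rate among patients assigned to `group`."""
    selected = [o for o, g in zip(outcomes, groups) if g == group]
    return sum(selected) / len(selected)

# Placeholder risks for a test set of the size reported in the text.
risks = [i / 8494 for i in range(8494)]
groups = stratify_top_fraction(risks)
n_high, n_low = groups.count("high"), groups.count("low")
```

Because the threshold is a rank cutoff, the high-risk group has the same size for every model, and differences between models show up only in which patients fill it.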

Sensitivity Analyses

We tested the sensitivity of our results to several data and modeling decisions, including using a super learner algorithm, classifying outlier data values as missing, stratifying by surgical type (eg, orthopedic, general, and neurologic), and using alternative definitions of AKI (eMethods in the Supplement).30-32 Given the lack of an evidence-based definition of a high-risk probability value for AKI, the top 20% threshold was selected arbitrarily, so we examined sensitivity to the cutoff by using the top 10% and top 30%.

Logistic regression with elastic net selection (PROC GLMSELECT) was implemented using SAS software, version 9.4 (SAS Institute Inc). The super learner was implemented using the SuperLearner package in R, version 3.4.3 (R Foundation). All other code and predictive models (RandomForestClassifier, GradientBoostingClassifier) were implemented in Python, version 3.6 (Python Software Foundation), with the Pandas 0.23.3 and Scikit-learn 0.19.1 libraries. Two-tailed tests were considered statistically significant at P < .05.

Results
Study Population

Of the 77 975 patients who underwent major noncardiac surgery, we identified 42 615 noncardiac surgical patient encounters that met study criteria (Table 1). Mean (SD) patient age was 57.9 (15.7) years, 23 943 (56.2%) patients were women, 27 857 (65.4%) patients were white, and 19 470 patients (45.7%) had commercial insurance. The most common surgery types were orthopedic (15 718 [36.9%]), general (8808 [20.7%]), and neurologic (6564 [15.4%]). Most patients were classified as American Society of Anesthesiologists physical status 3 (severe systemic disease) or 2 (mild systemic disease) before surgery.24 A total of 3859 patients (9.1%) had multiple operations during the study period. Of the study sample, 4318 patients (10.1%) experienced AKI (Table 2), which was similar across definitions (eTable 2 in the Supplement). In addition, 103 patients (0.2%) underwent inpatient dialysis, 8335 patients (19.6%) experienced a postoperative length of stay of 7 or more days, and 255 patients (0.6%) died in the hospital. Patient characteristics, rates of AKI, and other clinical outcomes did not exhibit substantial differences between derivation, validation, and test sets (Table 1 and Table 2).

Model Performance

Among the 8494 patients in the test set, 845 patients (9.9%) experienced Kidney Disease Improving Global Outcomes AKI (Table 2). Use of logistic regression with elastic net selection resulted in increasing AUCs as clinical variables were added (Figure): the AUC was 0.700 (95% CI, 0.681-0.719) with prehospitalization variables, 0.782 (95% CI, 0.765-0.799) with preoperative variables that included prehospitalization variables (P < .001 for AUC comparison vs model using prehospitalization variables only), and 0.790 (95% CI, 0.773-0.807) with perioperative variables that included intraoperative variables (P = .02 for AUC comparison vs model using preoperative variables only). The random forest models resulted in an AUC of 0.710 (95% CI, 0.690-0.728) with prehospitalization variables, a higher AUC of 0.787 (95% CI, 0.770-0.803) with preoperative variables (P < .001 for AUC comparison vs model using prehospitalization variables only), and the highest AUC of 0.808 (95% CI, 0.790-0.823) using perioperative variables (P < .001 for AUC comparison vs model using preoperative variables only). The GBM models generated the highest AUCs across all modeling approaches, with an AUC of 0.712 (95% CI, 0.694-0.731) using the prehospitalization variables, a higher AUC of 0.804 (95% CI, 0.788-0.819) with preoperative variables (P < .001 for AUC comparison vs model using prehospitalization variables only), and the highest AUC of 0.817 (95% CI, 0.802-0.832) when using perioperative variables (P < .001 for AUC comparison vs model using preoperative variables only). Full model performance across data sets, calibration curves, and variable coefficients and importance can be found in eTables 3-7 and the eFigure in the Supplement.

Risk Stratification

A total of 1699 of the 8494 patients (20.0%) were classified as high risk and 6795 patients (80.0%) were classified as low risk, using the GBM model (Table 3 and Table 4). We applied this risk stratification to each group of variables separately (reflecting progressive addition of clinical variables) and compared classification. Although the improvement in discrimination was statistically significant when adding perioperative data, the improvement did not appear to be clinically significant. In particular, the AKI rate among patients classified as high risk improved from 29.1% to 30.0%; however, only a net of 15 patients were appropriately reclassified as high risk (ie, 67 patients were reclassified appropriately as high risk, but 52 patients were reclassified inappropriately as low risk) and an additional net of 15 patients were appropriately reclassified as low risk (ie, 329 patients were appropriately reclassified as low risk but 314 patients were inappropriately reclassified as high risk) (Table 3).
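The reclassification counts above can be checked with simple arithmetic. The sketch below uses only the numbers quoted in the text; the variable names are illustrative, and reading "appropriately high risk" as patients who went on to have AKI is an interpretation of the text rather than something it states outright.

```python
# Counts quoted in the text for the GBM model on the test set.
test_set_size = 8494
up_appropriate, down_inappropriate = 67, 52    # read here as patients with AKI
down_appropriate, up_inappropriate = 329, 314  # read here as patients without AKI

# Net gains from adding intraoperative data.
net_high = up_appropriate - down_inappropriate
net_low = down_appropriate - up_inappropriate
net_total = net_high + net_low
share_of_test_set = 100 * net_total / test_set_size  # percentage

# With a fixed top-20% cutoff, moves into and out of the high-risk
# group must balance.
moved_up = up_appropriate + up_inappropriate
moved_down = down_appropriate + down_inappropriate
```

The two flows balance at 381 patients each, which is what a fixed-size high-risk group requires, and the net gain amounts to well under 1% of the test set.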

The small improvements were concordant across primary and secondary outcomes (Table 4). Rates of Kidney Disease Improving Global Outcomes AKI in the high-risk groups increased as more data were added (prehospitalization, 22.3%; preoperative, 29.1%; perioperative, 30.0%). Rates of secondary outcomes increased similarly: inpatient dialysis (prehospitalization, 1.3%; preoperative, 1.7%; perioperative, 1.8%), postoperative length of stay greater than or equal to 7 days (prehospitalization, 33.4%; preoperative, 43.4%; perioperative, 45.6%), and in-hospital death (prehospitalization, 2.0%; preoperative, 2.4%; perioperative, 3.0%). The largest increases were observed after adding preoperative data, while smaller increases were observed after adding intraoperative data.

Sensitivity Analyses

The results of several sensitivity analyses were consistent with our main results (eTables 8-12 in the Supplement).

Discussion

The findings of this study suggest that clinical EHR data can be used to develop reasonably accurate predictive models for risk-stratifying adults undergoing major noncardiac surgery for postoperative AKI. Model performance increased as more clinical information was incorporated, with the largest performance gains noted when preoperative data were added. This finding was robust to different modeling techniques and definitions of AKI.

However, the gains in accuracy from adding intraoperative data to preoperative data were modest at best, showing only marginal gains in the AUC, and did not seem to be clinically meaningful. These results were similarly reflected in risk stratification. For example, of the entire test set population of 8494 patients, a net of only 30 were appropriately reclassified as high or low risk when adding perioperative data. This finding suggests that adding intraoperative data to risk stratification models for AKI may not yield substantial benefits relative to the complexity of implementation. This is further highlighted by the contrast with results for models of other postoperative complications, such as in-hospital mortality, for which the addition of intraoperative data yields substantial improvements in risk stratification.13

Although our models did not demonstrate substantially higher discrimination on average across the entire study population, there may be subgroups of patients for whom addition of intraoperative data improves risk stratification in a clinically meaningful fashion. Additional research exploring subgroups is underway as part of a broader effort to implement such algorithms into practice. One feature of the models we used is that they are suited to implementation in electronic systems that receive or pull data from the EHR.

Another contribution of this study was the use of multiple statistical and machine learning methods as well as multiple definitions of AKI as the primary outcome. This approach suggests that our results may reflect the accuracy of risk stratification models for AKI and highlights that variability in modeling approach and AKI outcome definitions is unlikely to explain differences in discrimination (ie, AUCs ranging from 0.73 to 0.80) in previous studies.8-10

Limitations

The study has several limitations. First, this was a single-institution study, and the availability of EHR data as well as practice patterns may vary at other institutions. However, we used data from multiple hospitals within a health system with different surgery and anesthesia groups and clinicians. Furthermore, the intraoperative data that we used are likely captured as part of routine monitoring of patients while in surgery. Second, our follow-up period was limited to the hospital setting, and there may have been limited documentation of other important clinical outcomes. We did not capture longitudinal outcomes, which may affect the ability to risk stratify for other important, longer-term outcomes. Third, we did not have reliable data on urine output, which could have led to incomplete identification of AKI.

Conclusions

The findings of this study suggest that EHR data can be used to accurately stratify patients at risk of perioperative AKI. However, the modest improvements in performance from adding intraoperative data should be weighed against clinical utility, and whether particular subgroups may benefit from the addition requires further research.

Article Information

Accepted for Publication: October 11, 2019.

Published: December 6, 2019. doi:10.1001/jamanetworkopen.2019.16921

Open Access: This is an open access article distributed under the terms of the CC-BY License. © 2019 Lei VJ et al. JAMA Network Open.

Corresponding Author: Victor J. Lei, PharmD, Department of Medical Ethics and Health Policy, University of Pennsylvania Perelman School of Medicine, 1127 Blockley Hall, Philadelphia, PA 19104 (vlei@pennmedicine.upenn.edu).

Author Contributions: Drs Lei and Navathe had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.

Concept and design: Lei, Luong, Neuman, Polsky, Holmes, Navathe.

Acquisition, analysis, or interpretation of data: All authors.

Drafting of the manuscript: Lei, Shan, Chen, Navathe.

Critical revision of the manuscript for important intellectual content: Lei, Luong, Neuman, Eneanya, Polsky, Volpp, Fleisher, Holmes, Navathe.

Statistical analysis: Lei, Shan, Chen, Neuman, Navathe.

Obtained funding: Polsky, Volpp, Navathe.

Administrative, technical, or material support: Lei, Luong, Holmes, Navathe.

Supervision: Luong, Neuman, Eneanya, Holmes, Navathe.

Conflict of Interest Disclosures: Dr Volpp reported receiving grants from Humana, Hawaii Medical Service Association, Discovery (South Africa), Merck, Weight Watchers, and CVS outside the submitted work; receiving consulting income from CVS and VALHealth; and being a principal in VALHealth, a behavioral economics consulting firm. Dr Holmes reported receiving funding from the Pennsylvania Department of Health, US Public Health Service, and the Cardiovascular Medicine Research and Education Foundation. Dr Navathe reported receiving grants from the Pennsylvania Department of Health, Hawaii Medical Service Association, Anthem Public Policy Institute, The Commonwealth Fund, Oscar Health, Cigna Corporation, Robert Wood Johnson Foundation, and Donaghue Foundation; personal fees and equity from Agathos Inc; personal fees from Navvis Healthcare, University Health System (Singapore), Elsevier Press, Navahealth, and Cleveland Clinic; personal fees for service as a commissioner from the Medicare Payment Advisory Commission; serving as a board member without compensation for Integrated Services Inc; and holding equity from Embedded Healthcare outside the submitted work.

Funding/Support: This project was funded, in part, under a grant with the Pennsylvania Department of Health (SAP 4100070). Dr Eneanya is supported by National Institutes of Health grant K23DK114526.

Role of the Funder/Sponsor: The Pennsylvania Department of Health and National Institutes of Health had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

References
1. Grams ME, Sang Y, Coresh J, et al. Acute kidney injury after major surgery: a retrospective analysis of Veterans Health Administration data. Am J Kidney Dis. 2016;67(6):872-880. doi:10.1053/j.ajkd.2015.07.022
2. Cho E, Kim SC, Kim MG, Jo SK, Cho WY, Kim HK. The incidence and risk factors of acute kidney injury after hepatobiliary surgery: a prospective observational study. BMC Nephrol. 2014;15:169. doi:10.1186/1471-2369-15-169
3. Saran R, Robinson B, Abbott KC, et al. US Renal Data System 2017 annual data report: epidemiology of kidney disease in the United States. Am J Kidney Dis. 2018;71(3)(suppl 1):A7. doi:10.1053/j.ajkd.2018.01.002
4. Biteker M, Dayan A, Tekkeşin AI, et al. Incidence, risk factors, and outcomes of perioperative acute kidney injury in noncardiac and nonvascular surgery. Am J Surg. 2014;207(1):53-59. doi:10.1016/j.amjsurg.2013.04.006
5. National Hospital Discharge Survey: 2010 table, Procedures by selected patient characteristics. https://www.cdc.gov/nchs/data/nhds/4procedures/2010pro4_numberprocedureage.pdf. Published 2010. Accessed October 1, 2018.
6. Hosmer DW, Lemeshow S. Applied Logistic Regression. 2nd ed. New York, NY: Wiley; 2000:160-164. doi:10.1002/0471722146
7. Kate RJ, Perez RM, Mazumdar D, Pasupathy KS, Nilakantan V. Prediction and detection models for acute kidney injury in hospitalized older adults. BMC Med Inform Decis Mak. 2016;16:39. doi:10.1186/s12911-016-0277-4
8. Huang C, Murugiah K, Mahajan S, et al. Enhancing the prediction of acute kidney injury risk after percutaneous coronary intervention using machine learning techniques: a retrospective cohort study. PLoS Med. 2018;15(11):e1002703. doi:10.1371/journal.pmed.1002703
9. Koyner JL, Carey KA, Edelson DP, Churpek MM. The development of a machine learning inpatient acute kidney injury prediction model. Crit Care Med. 2018;46(7):1070-1077. doi:10.1097/CCM.0000000000003123
10. Wu L, Hu Y, Liu X, et al. Feature ranking in predictive models for hospital-acquired acute kidney injury. Sci Rep. 2018;8(1):17298. doi:10.1038/s41598-018-35487-0
11. Kheterpal S, Tremper KK, Englesbe MJ, et al. Predictors of postoperative acute renal failure after noncardiac surgery in patients with previously normal renal function. Anesthesiology. 2007;107(6):892-902. doi:10.1097/01.anes.0000290588.29668.38
12. Kheterpal S, Tremper KK, Heung M, et al. Development and validation of an acute kidney injury risk index for patients undergoing general surgery: results from a national data set. Anesthesiology. 2009;110(3):505-515. doi:10.1097/ALN.0b013e3181979440
13. Lee CK, Hofer I, Gabel E, Baldi P, Cannesson M. Development and validation of a deep neural network model for prediction of postoperative in-hospital mortality. Anesthesiology. 2018;129(4):649-662. doi:10.1097/ALN.0000000000002186
14. Freundlich RE, Kheterpal S. Perioperative effectiveness research using large databases. Best Pract Res Clin Anaesthesiol. 2011;25(4):489-498. doi:10.1016/j.bpa.2011.08.008
15. Hastie T, Tibshirani R, Friedman JH. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York, NY: Springer; 2009. doi:10.1007/978-0-387-84858-7
16. Collins GS, Reitsma JB, Altman DG, Moons KGM. Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD): the TRIPOD statement. Ann Intern Med. 2015;162(1):55-63. doi:10.7326/M14-0697
17. American Medical Association. Current Procedural Terminology. 2018. https://www.ama-assn.org/practice-management/cpt-current-procedural-terminology. Accessed October 1, 2018.
18. Surgery Flags Software for ICD-9-CM. https://www.hcup-us.ahrq.gov/toolssoftware/surgflags/surgeryflags.jsp. Updated August 7, 2019. Accessed November 11, 2018.
19. Khwaja A. KDIGO clinical practice guidelines for acute kidney injury. Nephron Clin Pract. 2012;120(4):c179-c184. doi:10.1159/000339789
20. Md Ralib A, Pickering JW, Shaw GM, Endre ZH. The urine output definition of acute kidney injury is too liberal. Crit Care. 2013;17(3):R112. doi:10.1186/cc12784
21. Adhikari L, Ozrazgat-Baslanti T, Ruppert M, et al. Improved predictive models for acute kidney injury with IDEA: Intraoperative Data Embedded Analytics. PLoS One. 2019;14(4):e0214904. doi:10.1371/journal.pone.0214904
22. James MT, Pannu N, Hemmelgarn BR, et al. Derivation and external validation of prediction models for advanced chronic kidney disease following acute kidney injury. JAMA. 2017;318(18):1787-1797. doi:10.1001/jama.2017.16326
23. Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005;43(11):1130-1139. doi:10.1097/01.mlr.0000182534.19832.83
24. Keats AS. The ASA classification of physical status—a recapitulation. Anesthesiology. 1978;49(4):233-236. doi:10.1097/00000542-197810000-00001
25. Healthcare Cost and Utilization Project. Clinical classifications software for services and procedures. 2017. https://www.hcup-us.ahrq.gov/toolssoftware/ccs_svcsproc/ccssvcproc.jsp. Accessed October 1, 2018.
26. Jones MP. Indicator and stratification methods for missing explanatory variables in multiple linear regression. J Am Stat Assoc. 1996;91(433):222-230. doi:10.1080/01621459.1996.10476680
27. Meurer WJ, Tolles J. Logistic regression diagnostics: understanding how well a model predicts outcomes. JAMA. 2017;317(10):1068-1069. doi:10.1001/jama.2016.20441
28. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837-845. doi:10.2307/2531595
29. Halbesma N, Jansen DF, Heymans MW, Stolk RP, de Jong PE, Gansevoort RT; PREVEND Study Group. Development and validation of a general population renal risk score. Clin J Am Soc Nephrol. 2011;6(7):1731-1738. doi:10.2215/CJN.08590910
30. van der Laan MJ, Polley EC, Hubbard AE. Super learner. Stat Appl Genet Mol Biol. 2007;6:e25. doi:10.2202/1544-6115.1309
31. Bellomo R, Ronco C, Kellum JA, Mehta RL, Palevsky P; Acute Dialysis Quality Initiative workgroup. Acute renal failure—definition, outcome measures, animal models, fluid therapy and information technology needs: the Second International Consensus Conference of the Acute Dialysis Quality Initiative (ADQI) Group. Crit Care. 2004;8(4):R204-R212. doi:10.1186/cc2872
32. Mehta RL, Kellum JA, Shah SV, et al; Acute Kidney Injury Network. Acute Kidney Injury Network: report of an initiative to improve outcomes in acute kidney injury. Crit Care. 2007;11(2):R31. doi:10.1186/cc5713