Risk Stratification for Postoperative Acute Kidney Injury in Major Noncardiac Surgery Using Preoperative and Intraoperative Data

Key Points
Question: Is adding preoperative and intraoperative data associated with improved risk stratification of patients undergoing noncardiac surgery for postoperative acute kidney injury?
Findings: In this prognostic study of 42 615 patients who underwent noncardiac surgery, the addition of preoperative to prehospitalization data improved model performance (area under the curve increased from 0.71 to 0.80) as did adding preoperative plus intraoperative data (area under the curve further increased to 0.82).
Meaning: Although electronic health record data may be used to accurately stratify patients at risk of postoperative acute kidney injury, there appears to be only modest improvement in performance when adding intraoperative data to risk stratification models.


Secondary Outcomes
Secondary outcomes included inpatient dialysis, a post-surgical length of stay ≥ 7 days, and all-cause in-hospital death. Inpatient dialysis was identified if a procedure code in the list below was present during the index hospitalization. A post-surgical length of stay of ≥ 7 days was used as a marker of prolonged stay. In the absence of a well-defined threshold in the literature, we selected 7 days after surgery because it aligned with our postoperative AKI definition (which covered the period up to 7 days after surgery). All-cause in-hospital death was defined as patient death at any time between the end of surgery and discharge from the index hospitalization.
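
The three outcome flags described above can be sketched in code. This is a minimal illustration, not the study's implementation: the `Admission` record, its field names, and the dialysis code set passed in are hypothetical, and length of stay is assumed to be measured in fractional days from the end of surgery to discharge.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Tuple

PROLONGED_STAY_DAYS = 7  # threshold stated in the text


@dataclass
class Admission:
    """Hypothetical record for one index hospitalization."""
    surgery_end: datetime
    discharge: datetime
    death_time: Optional[datetime] = None  # None if the patient survived
    procedure_codes: Tuple[str, ...] = ()


def secondary_outcomes(adm, dialysis_codes):
    """Return the three secondary outcome flags for one admission."""
    # Assumption: post-surgical stay is counted from the end of surgery.
    los_days = (adm.discharge - adm.surgery_end).total_seconds() / 86400
    died_in_hospital = (
        adm.death_time is not None
        and adm.surgery_end <= adm.death_time <= adm.discharge
    )
    return {
        "inpatient_dialysis": any(c in dialysis_codes for c in adm.procedure_codes),
        "prolonged_stay": los_days >= PROLONGED_STAY_DAYS,
        "in_hospital_death": died_in_hospital,
    }
```

The dialysis code set is passed in as a parameter so that the actual procedure code list can be supplied without changing the logic.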

Sensitivity Analyses
We tested the sensitivity of our results to several data and modeling decisions.

1) Ensembling Models - Super Learner
In response to an editor's comment, we examined the performance difference between our models and an ensemble technique such as Super Learner,2 an approach that is becoming more common. The algorithms chosen for the ensemble were penalized logistic regression (glmnet), gradient boosting machine (gbm), XGBoost (xgboost), and random forest (randomForest).

2) Alternate Method Handling for Extreme and Artifact Values
In response to an editor's comment, we tested whether results were sensitive to treating outlier and extreme values as missing instead of using our main approach. Values below the 1st percentile and values above the 99th percentile were set to missing for this analysis. All other modeling and analysis steps were unchanged.
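
As an illustration of this rule, the sketch below sets values outside the 1st to 99th percentile range to missing for a single variable. It rests on assumptions not stated in the text: percentiles are computed by linear interpolation over the observed (non-missing) values of the same variable, missing values are represented as `None`, and the function names are hypothetical.

```python
import math


def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a pre-sorted list."""
    if not sorted_vals:
        raise ValueError("percentile of an empty list is undefined")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def extremes_to_missing(values, lower_q=1.0, upper_q=99.0):
    """Replace values below lower_q or above upper_q percentile with None."""
    observed = sorted(v for v in values if v is not None)
    if not observed:
        return list(values)
    lo_cut = percentile(observed, lower_q)
    hi_cut = percentile(observed, upper_q)
    # Already-missing values stay missing; in-range values pass through.
    return [
        v if v is not None and lo_cut <= v <= hi_cut else None
        for v in values
    ]
```

In a train/test setting, one would typically derive the cutoffs from the training data only and reuse them on the test data; whether the study did so is not stated here.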

3) Surgical Subgroup Analysis
In response to a reviewer comment, we examined whether model performance differed when using surgical specialty-specific models. Models were trained and tested within each respective subgroup. Results were compared with the main analysis.

4) Alternate Acute Kidney Injury Definitions
To address the multiple definitions of AKI used by professional societies, we also applied two alternate definitions: (1) the Risk, Injury, Failure, Loss, and End-stage kidney disease (RIFLE)3 classification of risk, developed by the Acute Dialysis Quality Initiative, defined as a 1.5-fold increase in serum creatinine (SCr) from baseline or a decrease in estimated glomerular filtration rate (eGFR) of 25%; and (2) Acute Kidney Injury Network (AKIN) stage 1, defined as an absolute increase in SCr of 0.3 mg/dl (26.4 μmol/l) or a 50% increase in SCr (i.e., 1.5-fold from baseline).4 eGFR was calculated using the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation.5 Models from the main analysis were reproduced with the two alternate definitions of AKI.
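
The two alternate definitions can be written as simple threshold checks. The sketch below is illustrative only and makes several assumptions: SCr is in mg/dl, the caller supplies the baseline and the postoperative peak SCr (and baseline and lowest eGFR) already restricted to whatever time window applies, thresholds are treated as inclusive, and the function names are hypothetical.

```python
def rifle_risk(baseline_scr, peak_scr, baseline_egfr, lowest_egfr):
    """RIFLE 'risk' class: SCr >= 1.5 x baseline OR eGFR decrease of >= 25%."""
    scr_criterion = peak_scr >= 1.5 * baseline_scr
    # A 25% decrease means the lowest eGFR is at most 75% of baseline.
    egfr_criterion = lowest_egfr <= 0.75 * baseline_egfr
    return scr_criterion or egfr_criterion


def akin_stage1(baseline_scr, peak_scr):
    """AKIN stage 1: absolute SCr rise >= 0.3 mg/dl OR SCr >= 1.5 x baseline."""
    absolute_criterion = (peak_scr - baseline_scr) >= 0.3
    relative_criterion = peak_scr >= 1.5 * baseline_scr
    return absolute_criterion or relative_criterion
```

For a patient with a baseline SCr of 1.0 mg/dl and a peak of 1.4 mg/dl, `akin_stage1` is met through the absolute criterion, while `rifle_risk` is met only if the eGFR criterion is also satisfied. This shows how the two definitions can label the same patient differently.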

5) High Risk Stratification Cutoff Analysis
Given the lack of an evidence-based definition of a high-risk probability threshold for AKI, the top 20% of predicted risk was arbitrarily selected as the cutoff, and we examined sensitivity to this choice by using the top 10% and top 30% instead.
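
A minimal sketch of this cutoff analysis follows. It assumes that "top 20%" refers to the patients with the highest predicted probabilities, and it reports sensitivity and positive predictive value purely as example metrics; the function names and the choice of metrics are illustrative and not taken from the study.

```python
def flag_top_fraction(risks, fraction):
    """Flag the given fraction of patients with the highest predicted risk."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n_flag = max(1, round(len(risks) * fraction))
    # Sort indices by descending risk; ties keep their original order.
    order = sorted(range(len(risks)), key=lambda i: risks[i], reverse=True)
    flagged = [False] * len(risks)
    for i in order[:n_flag]:
        flagged[i] = True
    return flagged


def sensitivity_and_ppv(flagged, outcomes):
    """Sensitivity and positive predictive value of the high-risk flag."""
    true_pos = sum(1 for f, y in zip(flagged, outcomes) if f and y)
    n_events = sum(1 for y in outcomes if y)
    n_flagged = sum(1 for f in flagged if f)
    sens = true_pos / n_events if n_events else float("nan")
    ppv = true_pos / n_flagged if n_flagged else float("nan")
    return sens, ppv
```

Calling `flag_top_fraction` with 0.10, 0.20, and 0.30 on the same predicted risks and comparing the resulting metrics mirrors the sensitivity analysis described above.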

eAppendix. Variables
A total of 339 baseline, preoperative, and intraoperative variables were constructed. Final models in the main and sensitivity analyses were derived using subsets of these variables. All categorical variables were one-hot encoded into unique binary variables.
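
One-hot encoding replaces a categorical variable with one binary indicator per category. The sketch below shows the idea on a list of dictionaries; the column name used in the usage note is hypothetical, and treating a missing category as all zeros is an assumption, not something stated in the text.

```python
def one_hot(records, column):
    """Replace `column` with one 0/1 indicator per observed category."""
    levels = sorted({r[column] for r in records if r[column] is not None})
    encoded = []
    for r in records:
        # Keep every other field unchanged.
        row = {k: v for k, v in r.items() if k != column}
        for level in levels:
            # Assumption: a missing category yields 0 in every indicator.
            row[f"{column}_{level}"] = 1 if r[column] == level else 0
        encoded.append(row)
    return encoded
```

For example, a hypothetical `anesthesia_type` column with values `general` and `regional` would become two binary columns, `anesthesia_type_general` and `anesthesia_type_regional`.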

eTable 1. Rates of Missing Data in Variables
The number of observations with missing data was calculated for each variable in the study sample (n = 42,615). The table below lists only the variables that contain missing data, with the number of observations missing for each.
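
A count of this kind can be produced with a short routine such as the one below. It is a sketch under assumptions: records are dictionaries, a value counts as missing when it is absent, `None`, or a floating-point NaN, and the function and variable names are hypothetical.

```python
import math


def is_missing(value):
    """Treat None and floating-point NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def missing_counts(records, variables):
    """Count missing observations per variable, keeping only variables
    that have at least one missing value."""
    counts = {}
    for var in variables:
        n_missing = sum(1 for r in records if is_missing(r.get(var)))
        if n_missing > 0:
            counts[var] = n_missing
    return counts
```

Dividing each count by the number of records gives the corresponding rate of missing data.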

eTable 10. Model Performance in Top 3 Surgical Subgroups in Test Dataset
In response to a reviewer comment, we examined specialty-specific models for the 3 highest-volume surgical specialties and found some variability in model performance, although smaller sample sizes appeared to be the most important reason for the variation.