Table 1. Patient Demographics and Outcomes Within a 30-Day Postoperative Period
Table 2. AUC of Models Predicting Mortality and Complications
Original Investigation
January 2015

Using Electronic Health Records for Surgical Quality Improvement in the Era of Big Data

Author Affiliations: Department of Surgery, University of California, San Diego
JAMA Surg. 2015;150(1):24-29. doi:10.1001/jamasurg.2014.947
Abstract

Importance  Risk adjustment is an important component of quality assessment in surgical health care. However, data collection places an additional burden on physicians. There is also concern that outcomes can be gamed depending on the information recorded for each patient.

Objective  To determine whether a set of machine-collected data elements could perform as well as a traditional full risk-adjustment model that also includes physician-assessed and physician-recorded data elements.

Design, Setting, and Participants  All general surgery patients from the National Surgical Quality Improvement Program database from January 1, 2005, to December 31, 2010, were included. Separate multivariate logistic regressions were performed using either all 66 preoperative risk variables or only 25 objective variables. The area under the receiver operating characteristic curve (AUC) of each regression using only objective preoperative risk variables was compared with that of the corresponding regression using all preoperative variables. Subset analyses were performed among patients who received certain operations.

Main Outcomes and Measures  Mortality or any surgical complication captured by the National Surgical Quality Improvement Program, both inpatient and within 30 days postoperatively.

Results  Data from a total of 745 053 patients were included. A total of 15.8% of patients had at least 1 complication, and the mortality rate was 2.8%. When examining inpatient mortality, the AUC was 0.9104 with all 66 variables vs 0.8918 with the 25 objective variables. The difference in AUC between models with all variables and models with only objective variables ranged from −0.0073 to 0.1944 for mortality and from 0.0198 to 0.0687 for complications. In models predicting mortality, the difference in AUC was less than 0.05 among all patients and subsets of patients with abdominal aortic aneurysm repair, pancreatic resection, colectomy, and appendectomy. In models predicting complications, the difference in AUC was less than 0.05 among all patients and subsets of patients with pancreatic resection, laparoscopic cholecystectomy, colectomy, and appendectomy.

Conclusions and Relevance  Rigorous risk-adjusted surgical quality assessment can be performed solely with objective variables. By leveraging data already routinely collected for patient care, this approach allows for wider adoption of quality assessment systems in health care. Identifying data elements that can be collected automatically can drive future improvements in surgical outcomes and quality analyses.

Introduction

Risk adjustment is an important component of outcomes and quality analysis in surgical health care. However, there are 2 major concerns regarding the collection of risk-adjustment data in practice today. First, data collection places an additional burden on physicians, who feel that they have become data-entry personnel and are spending more time in front of the computer than with the patient. This concern could be partially addressed by studies that aim to identify minimum data sets that require only a handful of variables. For example, previous work has determined that the ability of the American College of Surgeons National Surgical Quality Improvement Program (NSQIP) database to risk adjust may be adequate using as few as 4 variables.1-4 However, the more serious concern is that risk-adjusted quality benchmarking systems can be gamed because they include data elements that require subjective interpretation by hospital personnel, such as patient history of comorbidities. There may be subtle financial incentives to overcode these data elements to increase a hospital's risk scores and, thus, improve its risk-adjusted outcomes profile.

These concerns can be addressed if risk-adjustment models avoid subjective data elements, such as history of comorbidities, and rely on objective data, such as laboratory values or other machine-collected variables that do not require the subjective interpretation and input of hospital personnel. The adoption of electronic health records doubled from 2009 to 2011, partly as a result of funding provided by the Health Information Technology for Economic and Clinical Health Act of 2009.5 This has contributed to so-called big data, multiple massive data sets that include patient demographics, medical history, disease course with corresponding treatment, and transaction information, such as device usage and medication administration. However, much of these data are currently perceived as a by-product of health care delivery rather than as a central asset to improve its efficiency.5 Therefore, the aim of this study was to determine whether a set of machine-collected data elements could perform as well as a traditional, full risk-adjustment model that also includes physician-assessed and physician-recorded data elements. We tested this hypothesis with an analysis of the NSQIP database.

Methods

This research uses all available NSQIP data from January 1, 2005, to December 31, 2010. This nationally validated program measures more than 135 variables for each patient and follows each patient for 30 days postoperatively. The 2005 to 2006 database included information from 121 hospitals, while the 2010 data included information from 237 hospitals.6 This data set was chosen for its breadth of preoperative and postoperative variables collected for each patient. Owing to its retrospective nature, this study was exempt from institutional review board approval, and participants did not provide consent.

The primary analysis included all patients in the database who were categorized as having had an operation performed by a general surgeon or by surgeons in the following subspecialties: plastic surgery, cardiothoracic surgery, or vascular surgery. Subset analyses were performed in patients who underwent abdominal aortic aneurysm repair, aortic valve replacement, coronary artery bypass graft, esophagectomy, pancreatic resection, laparoscopic cholecystectomy, colectomy (total or partial, laparoscopic or open), appendectomy (open or laparoscopic), and ventral hernia repair. The first 5 procedures listed were selected because of the presence of Leapfrog quality standards.7 The last 4 procedures were selected because they are common general surgery procedures.

Patients were determined to have an adverse event after surgery if they experienced death or at least 1 of the following complications as captured by NSQIP before discharge: superficial surgical site infection; deep incisional surgical site infection; organ space surgical site infection; wound disruption; pneumonia; unplanned intubation; pulmonary embolism; ventilator for more than 48 hours; progressive renal insufficiency; acute renal failure; urinary tract infection; stroke/cerebrovascular accident with neurological deficit; coma of more than 24 hours; peripheral nerve injury; cardiac arrest requiring cardiopulmonary resuscitation; myocardial infarction; bleeding requiring transfusions; graft, prosthesis, or flap failure; deep vein thrombosis or thrombophlebitis; sepsis; or septic shock.

Multivariate logistic regression models were created to predict either mortality or any complication, in the inpatient setting or within 30 days of surgery. In one set of models, we included all preoperative risk variables captured by NSQIP, a total of 66 variables. In a second set of models, we included only preoperative variables that were either laboratory data or could be extracted automatically from a patient's medical record without additional data input, a total of 25 variables, which are listed in the Box. All continuous variables were kept as such except for age, which was grouped into 10-year categories.
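As a rough illustration of this model-building step, the sketch below uses Python and scikit-learn rather than the study's actual Stata workflow; the file path, outcome name, and abbreviated variable lists are assumptions for illustration only, not actual NSQIP Participant Use File field names.

```python
# Hypothetical sketch of the two model specifications described above.
import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.read_csv("nsqip_2005_2010.csv")    # placeholder path
df["age_cat"] = (df["age"] // 10) * 10     # age grouped into 10-year categories

objective_vars = ["age_cat", "bmi", "albumin", "creatinine", "sodium",
                  "wbc", "hematocrit", "platelets", "sex"]       # 25 in the study
all_vars = objective_vars + ["asa_class", "diabetes", "copd",
                             "functional_status"]                # 66 in the study

def fit_model(data, predictors, outcome):
    """Fit one logistic regression on a complete-case subset of the data."""
    data = data.dropna(subset=predictors + [outcome])  # missing values excluded
    X = pd.get_dummies(data[predictors], drop_first=True)
    y = data[outcome]
    return LogisticRegression(max_iter=1000).fit(X, y), X, y

full_model, X_full, y_full = fit_model(df, all_vars, "died_30d")
obj_model, X_obj, y_obj = fit_model(df, objective_vars, "died_30d")
```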

Box.

Preoperative National Surgical Quality Improvement Program Variables Classified According to Objectivity

  • Objective variables included in the analysis:

    • Age in 10-y categories

    • Body mass index

    • Pregnancy

    • Alkaline phosphatase

    • Blood urea nitrogen

    • Hematocrit

    • International normalized ratio of PT values

    • Platelet count

    • PT

    • PTT

    • Serum albumin

    • Serum creatinine

    • Serum sodium

    • SGOT

    • Systemic sepsis

    • Total bilirubin

    • White blood cell count

    • Principal anesthesia technique

    • Prior operation within 30 d

    • Race/ethnicity (white, black, Hispanic, Asian or Pacific Islander, American Indian, Alaska Native, or other)

    • Sex

    • Surgical specialty of surgeon performing procedure

    • Transfusion >4 U PRBCs in 72 h before surgery

  • Subjective variables not included in the analysis:

    • >10% Loss body weight in last 6 mo

    • Acute renal failure

    • Airway trauma

    • Alcohol, >2 drinks/d in 2 wk before admission

    • ASA classification

    • Ascites

    • Bleeding disorder

    • Chemotherapy for malignancy in ≤30 d prior to surgery

    • Coma >24 h

    • Congestive heart failure in 30 d prior to surgery

    • Current pneumonia

    • Current smoker within 1 y

    • Currently receiving dialysis (prior to operation)

    • Cerebrovascular accident/stroke with neurological deficit

    • Cerebrovascular accident/stroke with no neurological deficit

    • Diabetes mellitus with oral agents or insulin

    • Disseminated cancer

    • Do not resuscitate status

    • Dyspnea

    • Emergency case

    • Esophageal varices

    • Functional health status prior to current illness

    • Functional health status prior to surgery

    • Hemiplegia

    • History of angina in 1 mo prior to surgery

    • History of myocardial infarction 6 mo prior to surgery

    • History of revascularization/amputation for peripheral vascular disease

    • History of severe COPD

    • History of transient ischemic attacks

    • Hypertension requiring medication

    • Impaired sensorium

    • Mallampati scale

    • Paraplegia

    • Previous cardiac surgery

    • Previous percutaneous coronary intervention

    • Quadriplegia

    • Radiotherapy for malignancy in last 90 d

    • Rest pain/gangrene

    • Steroid use for chronic condition

    • Tumor involving CNS

    • Wound classification

    • Wound infection

Abbreviations: ASA, American Society of Anesthesiologists; CNS, central nervous system; COPD, chronic obstructive pulmonary disease; PRBCs, packed red blood cells; PT, prothrombin time; PTT, partial thromboplastin time; SGOT, serum glutamic oxaloacetic transaminase.

We compared the area under the receiver operating characteristic curve (AUC) of each regression using objective preoperative risk variables with that of its corresponding regression using all variables. The AUC is a discriminative measure of how well a model separates 2 groups (ie, patients with vs without adverse events). An AUC value of 0.5 indicates that the model separates the 2 groups no better than chance, whereas an AUC value of 1.0 indicates that the model completely separates the 2 groups. The AUC statistic is the proportion of randomly selected pairs, one patient with the adverse event and one without, in which the model assigns the higher predicted risk to the patient with the event. Thus, the AUC allows us to determine which model more accurately discriminates between the 2 groups of interest.8-11
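A minimal sketch of this comparison, continuing the hypothetical objects from the earlier model-fitting example, might look as follows; roc_auc_score implements exactly the pairwise interpretation described above.

```python
# Compare discrimination of the full model vs the objective-variables-only model.
from sklearn.metrics import roc_auc_score

auc_full = roc_auc_score(y_full, full_model.predict_proba(X_full)[:, 1])
auc_obj = roc_auc_score(y_obj, obj_model.predict_proba(X_obj)[:, 1])

# AUC = probability that a randomly chosen patient who had the event receives
# a higher predicted risk than a randomly chosen patient who did not.
print(f"All variables:       AUC = {auc_full:.4f}")
print(f"Objective variables: AUC = {auc_obj:.4f}")
print(f"Difference:          {auc_full - auc_obj:+.4f}")
```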

Statistical analysis was performed using Stata 64-bit special edition, version 11.2 (StataCorp).

Results

Data from a total of 745 053 patients in the NSQIP database from 2005 to 2010 were included (Table 1). A total of 15.8% of patients had at least 1 complication, and the mortality rate was 2.8%. Mean age was higher among patients with complications (61.5 years) and highest among those who died (71.1 years). Among the subset procedures, colectomy accounted for the highest percentage of patients (12.7%), while the fewest patients underwent aortic valve replacement (0.2%) and esophagectomy (0.3%). Overall complication rates were highest for esophagectomy (49.5%) and lowest for laparoscopic cholecystectomy (5.2%) and appendectomy (6.6%). Mortality rates were highest for colectomy (4.4%) and aortic valve replacement (4.3%).

The AUC was slightly higher in models that included all variables compared with models with only objective variables. Among all patients, the AUC was 0.9005 with all variables vs 0.8774 with objective variables for 30-day mortality and 0.9078 with all variables vs 0.8881 with objective variables for inpatient mortality (Table 2). When examining complications among all patients, the AUC was 0.7401 with all variables vs 0.7137 with objective variables for 30-day complications and 0.7859 with all variables vs 0.7609 with objective variables for inpatient complications (Table 2).

The difference in AUC ranged from −0.0073 to 0.1944 for mortality and from 0.0198 to 0.0687 for complications. In models predicting mortality, the difference in AUC was less than 0.05 among all patients and subsets of patients with abdominal aortic aneurysm repair, pancreatic resection, colectomy, and appendectomy. The difference in AUC was greatest among patients with esophagectomy (0.1937 to 0.1944) and smallest among patients with appendectomy (−0.0073 to 0.0171). In models predicting complications, the difference in AUC was less than 0.05 among all patients and subsets of patients with pancreatic resection, laparoscopic cholecystectomy, colectomy, and appendectomy. The difference in AUC was greatest among patients with aortic valve replacement (0.0552 to 0.0687) and smallest among patients with pancreatic resection (0.0198 to 0.0204).

Discussion

These data suggest that it is possible to create a risk-adjustment system with high discriminatory value based only on objective variables. In this study, there was generally minimal difference in AUC between models using all preoperative risk variables and models using only objective variables. The difference in AUC was lower when examining complications than mortality, although the range of differences varied by procedure. Thus, this method may be best suited to risk-adjusted analyses of complications, although the implications of this research are broader.

By restricting data collection to objective data, we can reduce concerns about reliability and validity as well as the threat of gaming the system by inflating patient risk scores through subjective variables. Restricting risk-adjusted analyses to objective variables also improves efficiency and reduces the cost of data collection. Clinicians would no longer have to deliberate over a patient's functional status or surgical wound classification and then take the time to enter this information into the medical record. Data collectors would also not have to scour the medical record to identify various comorbidities. Instead, basic laboratory values already ordered for direct patient care could be extracted from electronic medical records. For example, instead of requiring a separate code for a diagnosis of diabetes, a hemoglobin A1c value combined with prescriptions for insulin or oral hypoglycemic agents would provide enough information on this comorbidity.
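A purely illustrative sketch of that idea follows; the helper function and medication list are assumptions, not NSQIP or electronic health record fields, and 6.5% is the standard diagnostic HbA1c cutoff.

```python
# Illustrative only: inferring a diabetes indicator from objective data
# (a laboratory value plus active medication orders) rather than from a
# manually coded comorbidity field.
def likely_diabetic(hba1c_pct, active_medications):
    hypoglycemics = {"insulin", "metformin", "glipizide", "glyburide"}
    on_agent = any(m.lower() in hypoglycemics for m in active_medications)
    return (hba1c_pct is not None and hba1c_pct >= 6.5) or on_agent

print(likely_diabetic(7.2, ["lisinopril"]))            # True: elevated HbA1c
print(likely_diabetic(5.4, ["metformin", "aspirin"]))  # True: on a hypoglycemic agent
```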

This research also generates opportunities for wider participation in surgical quality assessment and improvement programs. Currently, 461 US hospitals and 36 hospitals in Australia, Austria, Canada, Lebanon, the United Kingdom, and Saudi Arabia participate in the American College of Surgeons NSQIP.12 Yet, these are mostly high-volume hospitals, with very limited participation by smaller community hospitals. With the advent of electronic health records in all hospitals, it is not unrealistic to propose complete participation of all US hospitals in a national risk-adjusted program managed through an automated system.

This concept has even wider implications. As the amount of data and number of databases continue to expand both within health care and the private sector, new opportunities for sophisticated and novel uses of information abound. In addition to the growing amount of data collected by electronic health records, the private sector has a plethora of databases that could be linked to patient data in innovative ways, from tracking purchasing patterns of healthy food or over-the-counter medicines to using global positioning system technology (now embedded in many personal handheld devices) to record patient exercise regimens or geographical disease hot spots. In our data-driven environment in which technology continues to become an integral part of our daily lives, the possibilities are endless. In the future, it is not inconceivable that we will be able to perform retrospective analyses of complex databases with the same rigor as prospective trials. Hiring additional personnel for the sole purpose of data collection may become outdated in the era of big data.

This research was not without limitations. We narrowed all NSQIP preoperative variables down to a subset of objective variables, but this list was not without flaws. Despite our attempts to be as strict as possible, we recognize that this list is still open to interpretation. For example, we classified more than 10% loss of body weight in the last 6 months as subjective because this may not always be accurately measured and recorded. Other NSQIP variables record clearly defined medical history or recent treatment, such as chemotherapy for malignancy in the 30 days prior to the operation, but these variables may not always be easy to obtain or completely accurate because they may depend on outside records or the patient's own history. Furthermore, many patients do not receive all preoperative laboratory tests, and thus records with missing data were excluded from these analyses. We mitigated this by using a large sample size and by performing subset analyses of several different operations. Our findings are also consistent with the literature. For example, the predictive value of albumin in surgical outcomes has been well studied.13 Many of these variables have also been found to be significant in other studies examining particular operations and correspond highly with our own previous research.1,2

Conclusions

Rigorous risk-adjusted surgical quality assessment can be performed relying solely on objective, automated variables. By leveraging data that are already regularly collected for patient care, this approach makes more efficient use of the massive amounts of information available in the age of big data. It also addresses common concerns about the burden of data collection and the validity and reliability of data elements and can lead to wider adoption of quality assessment systems in health care. Identifying new ways to automate data collection using existing technology can drive further improvements in surgical outcomes research and in quality assessment and improvement.

Back to top
Article Information

Corresponding Author: Jamie E. Anderson, MD, MPH, Center for Surgical Systems and Public Health, Department of Surgery, University of California, San Diego, 200 W Arbor Dr, San Diego, CA 92103 (jaa002@ucsd.edu).

Accepted for Publication: April 3, 2014.

Published Online: November 5, 2014. doi:10.1001/jamasurg.2014.947.

Author Contributions: Drs Chang and Anderson had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.

Study concept and design: Both authors.

Acquisition, analysis, or interpretation of data: Chang.

Drafting of the manuscript: Anderson.

Critical revision of the manuscript for important intellectual content: Both authors.

Statistical analysis: Both authors.

Administrative, technical, or material support: Chang.

Study supervision: Chang.

Conflict of Interest Disclosures: None reported.

Disclaimer: The American College of Surgeons National Surgical Quality Improvement Program and its participating hospitals are the source of data used in this research; they have not verified and are not responsible for the statistical validity of the data analysis or conclusions of the authors.

References
1. Anderson JE, Lassiter R, Bickler SW, Talamini MA, Chang DC. Brief tool to measure risk-adjusted surgical outcomes in resource-limited hospitals. Arch Surg. 2012;147(9):798-803.
2. Dimick JB, Osborne NH, Hall BL, Ko CY, Birkmeyer JD. Risk adjustment for comparing hospital quality with surgery: how many variables are needed? J Am Coll Surg. 2010;210(4):503-508.
3. Rubinfeld I, Farooq M, Velanovich V, et al. Predicting surgical risk: how much data is enough? AMIA Annu Symp Proc. 2010:777-781.
4. Birkmeyer JD, Shahian DM, Dimick JB, et al. Blueprint for a new American College of Surgeons: National Surgical Quality Improvement Program. J Am Coll Surg. 2008;207(5):777-782.
5. Murdoch TB, Detsky AS. The inevitable application of big data to health care. JAMA. 2013;309(13):1351-1352.
6. American College of Surgeons National Surgical Quality Improvement Program. User Guide for the 2010 Participant Use Data File. Chicago, IL: American College of Surgeons; 2012.
7. The Leapfrog Hospital Survey Reference Book: Supporting Documentation for the 2013 Leapfrog Hospital Survey. https://leapfroghospitalsurvey.org/web/wp-content/uploads/reference.pdf. Accessed May 14, 2013.
8. Healey C, Osler TM, Rogers FB, et al. Improving the Glasgow Coma Scale score: motor score alone is a better predictor. J Trauma. 2003;54(4):671-678, discussion 678-680.
9. Bewick V, Cheek L, Ball J. Statistics review 13: receiver operating characteristic curves. Crit Care. 2004;8(6):508-512.
10. Meredith JW, Evans G, Kilgo PD, et al. A comparison of the abilities of nine scoring algorithms in predicting mortality. J Trauma. 2002;53(4):621-628, discussion 628-629.
11. Connell FA, Koepsell TD. Measures of gain in certainty from a diagnostic test. Am J Epidemiol. 1985;121(5):744-753.
12. American College of Surgeons National Surgical Quality Improvement Program. http://site.acsnsqip.org/participants/. Accessed September 23, 2013.
13. Gibbs J, Cull W, Henderson W, Daley J, Hur K, Khuri SF. Preoperative serum albumin level as a predictor of operative mortality and morbidity: results from the National VA Surgical Risk Study. Arch Surg. 1999;134(1):36-42.