Comparison of adjusted rates of adverse outcomes in Keystone hospitals before and after implementation of the Keystone Surgery Program.
Reames BN, Krell RW, Campbell DA, Dimick JB. A Checklist-Based Intervention to Improve Surgical Outcomes in Michigan: Evaluation of the Keystone Surgery Program. JAMA Surg. 2015;150(3):208–215. doi:10.1001/jamasurg.2014.2873
Importance
Previous studies of checklist-based quality improvement interventions have reported mixed results.
Objective
To evaluate whether implementation of a checklist-based quality improvement intervention—Keystone Surgery—was associated with improved outcomes in patients in a large statewide population undergoing general surgery.
Design, Setting, and Exposures
A retrospective longitudinal study examined surgical outcomes in 64 891 Michigan patients in 29 hospitals using Michigan Surgical Quality Collaborative clinical registry data from 2006 through 2010. Multivariable logistic regression and difference-in-differences analytic approaches were used to evaluate whether Keystone Surgery program implementation was associated with improved surgical outcomes following general surgery procedures, apart from existing temporal trends toward improved outcomes during the study period.
Main Outcomes and Measures
Risk-adjusted rates of superficial surgical site infection, wound complication, any complication, and 30-day mortality.
Results
Implementation of Keystone Surgery in 14 participating centers was not associated with improvements in surgical outcomes during the study period. Adjusted rates of superficial surgical site infection (3.2% vs 3.2%, P = .91), wound complication (5.9% vs 6.5%, P = .30), any complication (12.4% vs 13.2%, P = .26), and 30-day mortality (2.1% vs 1.9%, P = .32) at participating hospitals were similar before and after implementation. Difference-in-differences analysis accounting for trends in 15 nonparticipating centers and sensitivity analysis excluding patients receiving surgery in the first 6 or 12 months after program implementation yielded similar results.
Conclusions and Relevance
Implementation of a checklist-based quality improvement intervention did not affect rates of adverse surgical outcomes among patients undergoing general surgery in participating Michigan hospitals. Additional research is needed to understand why this program was not successful prior to further dissemination and implementation of this model to other populations.
There is widespread enthusiasm for the use of checklists to improve hospital outcomes.1-4 Perhaps one of the most widely known and successful examples is the Keystone ICU (Intensive Care Unit) Patient Safety Program. This intervention used a checklist emphasizing evidence-based processes of care and a program to improve safety culture5 to dramatically decrease rates of catheter-related bloodstream infection6 and ventilator-associated pneumonia7 in Michigan. The program has since been implemented nationally,8 and similar programs have expanded to other patient populations. One such expansion was Keystone Surgery, a Michigan program designed to reduce rates of surgical site infection and other adverse surgical outcomes.9
The effectiveness of checklist-based interventions to improve surgical outcomes is still unclear, however. Recent work by Urbach and colleagues10 did not find an association between implementation of surgical safety checklists and improved outcomes in a large population. Previous evaluations of programs directed toward surgical site infection specifically have been limited to small cohorts and single institutions,11-15 and no studies have used a concurrent control group to assess effectiveness.16 Although previous studies have demonstrated that the Surgical Care Improvement Project (SCIP) process measures used in Keystone Surgery are not associated with improved outcomes,17,18 none have evaluated these processes when coupled with a program to improve safety culture. Given the substantial resources necessary to implement interventions like Keystone Surgery, evidence evaluating effectiveness is essential prior to broader dissemination.
In this study, we capitalize on a unique natural experiment to evaluate the effect of Keystone Surgery on general surgery outcomes in a large statewide population. We used 5 years of clinical registry data to examine outcomes before and after implementation of the Keystone Surgery program. To account for secular trends in the state, we compared this cohort with a control group of patients undergoing surgery during the same period at Michigan hospitals that did not implement the program.
This study was completed using clinical registry data from the Michigan Surgical Quality Collaborative (MSQC), a regional consortium of 52 hospitals funded by Blue Cross and Blue Shield of Michigan and The Blue Care Network. Details of data collection have been previously published.19,20 Clinical nurse reviewers collect data on patient characteristics, intraoperative processes, and 30-day outcomes for patients undergoing general and specialty surgery throughout the state, using a standard 8-day case sampling strategy.21 Annual nurse reviewer and data audits ensure data accuracy. For this study, we identified all patients undergoing general surgery procedures at MSQC hospitals from 2006 through 2010 using relevant Current Procedural Terminology codes. We chose inpatient procedures that account for the vast majority of postoperative infections, including abdominal exploration and lysis of adhesions, cholecystectomy, appendectomy, colorectal resection, ventral hernia repair, bariatric surgery, pancreatic resection, esophagectomy, gastrectomy, fundoplication, peptic ulcer surgery, liver resection, biliary reconstruction, pelvic exenteration, small-bowel operations, and splenectomy.
The Keystone Surgery program was a prospective cohort intervention implemented within specialty-specific surgical teams at participating Michigan Health & Hospital Association hospitals with a goal of improving surgical care throughout the state. Hospitals volunteered to participate and did not receive financial support. Implementation occurred during a 2-year period using a stepped-wedge design.22 Most Michigan Health & Hospital Association hospitals (n = 76) enrolled during April 2008, while a second group (n = 25) enrolled in April 2009. Within each hospital, a surgeon, anesthesiologist, and operating room nurse were designated as operative team leaders. Throughout the program, monthly coaching calls and semiannual collaborative meetings were used to support the implementation process.
Similar to the Keystone ICU program, the Keystone Surgery program used 2 principal components (Table 1): a novel model to translate evidence into practice23 and the Comprehensive Unit-based Safety Program to improve safety culture.24 The evidence-based practice component used a checklist tool that focused on compliance with 6 Centers for Medicare & Medicaid Services SCIP processes: appropriate prophylactic antibiotic use (selection, timing, and discontinuation), appropriate hair removal, maintenance of perioperative normothermia, and glucose control.25-28 At the start of implementation, operative teams were provided with supporting materials and references to educate staff. Throughout the program, teams were encouraged to implement the tool during briefings and debriefings surrounding every procedure and to monitor compliance, adapt the tool based on local needs, and work together to resolve issues that surfaced during the process.29,30
The Comprehensive Unit-based Safety Program is an iterative 5-step process previously validated to improve teamwork and safety culture (Table 1).5,24 Through these steps, the program attempts to educate participants on the principles of safety science, identify defects, increase communication between frontline health care professionals and senior leadership, encourage learning from identified defects, and implement tools to assist the quality improvement process. At initiation and annually thereafter, a validated assessment of culture was performed to support and guide the program.31
Within the 34 MSQC study hospitals, 10 hospitals implemented the program before May 1, 2008, 2 hospitals implemented the program on June 1, 2009, 1 on December 1, 2009, and 1 on January 1, 2010. Fifteen hospitals did not implement the program, and 5 hospitals that implemented Keystone Surgery before joining MSQC were excluded. For this analysis, hospitals were divided into 2 groups: hospitals that implemented the Keystone Surgery program (Keystone hospitals) and those that did not (non-Keystone hospitals). Patients undergoing a procedure at Keystone hospitals before the specified date of implementation were considered pre-implementation, while patients undergoing a procedure after that date were considered post-implementation. Because most Keystone hospitals implemented the program on or before May 1, 2008, patients undergoing procedures at non-Keystone hospitals were considered post-implementation if they underwent surgery after May 1, 2008.
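The period assignment described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function name `classify_period` and the constant `CONTROL_REFERENCE_DATE` are hypothetical, and the handling of a procedure performed on the cutoff date itself is an assumption, since the text specifies only "before" and "after".

```python
from datetime import date
from typing import Optional

# Reference date applied to non-Keystone hospitals, as described in the text.
CONTROL_REFERENCE_DATE = date(2008, 5, 1)


def classify_period(surgery_date: date,
                    implementation_date: Optional[date]) -> str:
    """Label a procedure as pre- or post-implementation.

    implementation_date is None for non-Keystone hospitals, which fall
    back to the May 1, 2008 reference date.
    Assumption: a procedure on the cutoff date itself counts as "post".
    """
    cutoff = implementation_date or CONTROL_REFERENCE_DATE
    return "pre" if surgery_date < cutoff else "post"


# Same surgery date, two hospitals: one that implemented on June 1, 2009,
# and one non-Keystone hospital using the reference date.
print(classify_period(date(2009, 3, 15), date(2009, 6, 1)))
print(classify_period(date(2009, 3, 15), None))
```

The same surgery date can therefore fall in different periods depending on the hospital, which is why the cutoff is resolved per hospital rather than once for the whole cohort.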
The primary outcomes of this analysis included superficial surgical site infection, any wound complication (superficial, deep, or organ-space surgical site infection or wound disruption), any complication, and death within 30 days of operation. Additional complications recorded in the MSQC registry include acute kidney injury, intraoperative or postoperative transfusion, cardiac arrest requiring resuscitation, coma lasting more than 24 hours, superficial or deep venous thromboembolism, myocardial infarction, prolonged ventilation lasting more than 48 hours, peripheral nerve injury, pneumonia, pulmonary embolism, renal insufficiency, stroke, sepsis, septic shock, unplanned intubation, and urinary tract infection.
We performed 2 distinct analyses to evaluate the effect of Keystone Surgery on surgical outcomes: a pre-post analysis and a difference-in-differences analysis. For the pre-post analysis, we assessed patients undergoing surgery at hospitals that implemented Keystone Surgery. We used multivariable logistic regression to evaluate the relationship between our primary outcomes and program implementation. Each model included a variable indicating whether patients at Keystone hospitals had surgery before program implementation (pre-implementation) or after (post-implementation). To further evaluate the effect of the program on outcomes following specific procedures, we stratified analyses by the 4 most common operations: cholecystectomy, colorectal resection, appendectomy, and ventral hernia repair.
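One way to write the pre-post model is sketched below. The notation is assumed here for illustration and is not taken from the study's own specification: $Y_{ih}$ is the outcome for patient $i$ in hospital $h$, $\text{Post}_{ih}$ equals 1 if surgery occurred after implementation and 0 otherwise, and $X_{ih}$ collects the adjustment covariates.

```latex
\operatorname{logit}\,\Pr(Y_{ih} = 1)
  = \beta_0 + \beta_1\,\text{Post}_{ih} + \gamma^{\top} X_{ih}
```

Under this sketch, $\exp(\beta_1)$ is the adjusted odds ratio for an adverse outcome after vs before implementation, and a value near 1 corresponds to no detectable change.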
We then used a difference-in-differences analysis to account for coincident temporal trends toward improved outcomes among all study hospitals. This econometric technique, frequently used to evaluate the effect of policy changes,32-34 uses a control group to isolate changes in outcomes associated with an intervention apart from changes observed in the control group. Our control group included MSQC hospitals that did not participate in Keystone Surgery, as they were exposed to all factors driving improved outcomes during the period except the intervention. In addition to the post-implementation variable, this model included a dichotomous variable indicating whether the hospital implemented Keystone Surgery, as well as the interaction of this variable and the post-implementation variable. The odds ratio (OR) from this interaction term (ie, the difference-in-differences estimator) can be interpreted as the independent effect of Keystone Surgery implementation on surgical outcomes.35,36
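The logic of the interaction term can be illustrated without covariates. In a logistic model containing only the post-implementation indicator, the Keystone indicator, and their interaction, the exponentiated interaction coefficient equals the ratio of the two pre-to-post odds ratios. The sketch below computes that ratio directly; the function names and the complication rates are hypothetical and are not taken from the study, and the adjusted models described above would additionally include patient and procedure covariates.

```python
def odds(p: float) -> float:
    """Convert a probability to odds."""
    return p / (1.0 - p)


def did_odds_ratio(treat_pre: float, treat_post: float,
                   ctrl_pre: float, ctrl_post: float) -> float:
    """Ratio of the pre-to-post odds ratios in treated vs control hospitals.

    In a logistic model with only post, treated, and post x treated terms,
    this equals exp(coefficient on the interaction term).
    """
    or_treated = odds(treat_post) / odds(treat_pre)
    or_control = odds(ctrl_post) / odds(ctrl_pre)
    return or_treated / or_control


# Hypothetical complication rates, NOT taken from the study: both groups
# improve by the same amount, so the intervention adds nothing.
estimate = did_odds_ratio(treat_pre=0.13, treat_post=0.12,
                          ctrl_pre=0.13, ctrl_post=0.12)
print(f"difference-in-differences OR: {estimate:.2f}")
```

When both groups follow the same trend, the ratio is 1 even though outcomes improved in the treated hospitals, which is the situation a simple pre-post comparison cannot distinguish from a true program effect.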
In all models, we adjusted for patient characteristics, comorbidities, and details of the procedure. Patient characteristics included age, sex, race/ethnicity, and their interactions. Comorbidities included American Society of Anesthesiologists class, diabetes mellitus, smoking status, dyspnea, do-not-resuscitate status, functional status, chronic obstructive pulmonary disease, pneumonia, congestive heart failure, hemodialysis, hemiplegia, transient ischemic attack, disseminated cancer, prior myocardial infarction, angina, hypertension requiring medication, peripheral vascular disease, prior operations, and impaired sensorium. Procedure details included emergency status, operative approach, and procedure type. To account for within-hospital outcome correlation (clustering), we generated robust standard errors.
We performed sensitivity analyses to account for variability in program compliance during the initial phase of implementation. For the Keystone ICU Patient Safety program, investigators estimated that implementation would take less than 6 months.6 Therefore, we performed 2 pre-post analyses after excluding patients who underwent surgery during the first 6 or 12 months following program implementation.
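The exclusion step can be sketched as a date filter. The helper names (`add_months`, `in_washout`, `exclude_washout`) and the record layout are hypothetical, and treating the window as starting on the implementation date and ending just before the 6- or 12-month anniversary is an assumption about boundary handling.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Return d shifted forward by a number of calendar months."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    # Clamp the day so that, e.g., Jan 31 + 1 month stays a valid date.
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def in_washout(surgery_date: date, implementation_date: date,
               months: int) -> bool:
    """True if surgery falls in the first `months` months after implementation."""
    window_end = add_months(implementation_date, months)
    return implementation_date <= surgery_date < window_end


def exclude_washout(records, months):
    """Drop (surgery_date, implementation_date) records inside the window.

    Pre-implementation records are always kept.
    """
    return [r for r in records if not in_washout(r[0], r[1], months)]


# Hypothetical records for a hospital that implemented on June 1, 2009.
start = date(2009, 6, 1)
records = [(date(2009, 5, 1), start),   # before implementation: kept
           (date(2009, 8, 1), start),   # within 6 months: dropped
           (date(2010, 2, 1), start)]   # within 12 months only
print(len(exclude_washout(records, 6)), len(exclude_washout(records, 12)))
```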
Risk-adjusted outcome rates were determined by calculating marginal effects for each model. C statistics for the models ranged from 0.74 (surgical site infection) to 0.95 (30-day mortality). For all statistical tests, P values are 2-tailed, with α = .05. All analyses were performed using STATA, version 12.1 (StataCorp LP). This study was not regulated by the University of Michigan Institutional Review Board.
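One common way to turn a fitted logistic model into risk-adjusted rates is to average predicted probabilities over the whole cohort with the period indicator fixed first at 0 and then at 1. Whether this matches the exact margins calculation used in the study is an assumption. The coefficients, covariates, and function names below are hypothetical and serve only to show the mechanics.

```python
import math


def predicted_risk(intercept, post_coef, covariate_coefs, covariates, post):
    """Predicted probability from a logistic model for one patient."""
    linear = intercept + post_coef * post
    linear += sum(b * x for b, x in zip(covariate_coefs, covariates))
    return 1.0 / (1.0 + math.exp(-linear))


def adjusted_rate(intercept, post_coef, covariate_coefs, cohort, post):
    """Average predicted risk with every patient assigned the same period."""
    risks = [predicted_risk(intercept, post_coef, covariate_coefs, x, post)
             for x in cohort]
    return sum(risks) / len(risks)


# Hypothetical coefficients and covariates (age in decades, emergency flag);
# none of these values come from the study.
INTERCEPT, POST_COEF, COVARIATE_COEFS = -4.0, 0.1, [0.3, 0.8]
cohort = [[4.5, 0], [6.0, 1], [7.2, 0], [5.1, 1]]

pre_rate = adjusted_rate(INTERCEPT, POST_COEF, COVARIATE_COEFS, cohort, post=0)
post_rate = adjusted_rate(INTERCEPT, POST_COEF, COVARIATE_COEFS, cohort, post=1)
print(f"adjusted rate pre: {pre_rate:.3f}, post: {post_rate:.3f}")
```

Because both rates are averaged over the same patients, any difference between them reflects the period coefficient rather than a change in case mix.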
Our study cohort consisted of 64 891 patients in 29 hospitals. Fourteen hospitals implemented Keystone Surgery during the study period. In these hospitals, 14 005 patients underwent surgery before program implementation and 14 801 patients underwent surgery after program implementation. A total of 36 085 patients underwent surgery at non-Keystone hospitals. Patient and operative characteristics are shown in Table 2. In Keystone hospitals, patients undergoing surgery before and after implementation were generally similar in all characteristics and comorbidities. Small differences were present in the proportion of female patients, African American patients, and emergency procedures. Patients undergoing surgery at non-Keystone hospitals were also generally similar to those undergoing surgery at Keystone hospitals across categories (Table 2).
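As a quick consistency check, the cohort counts reported above can be summed directly. The variable names are illustrative; the numbers are those given in the text.

```python
# Patient counts reported in the text.
keystone_pre = 14_005
keystone_post = 14_801
non_keystone = 36_085

total_patients = keystone_pre + keystone_post + non_keystone
total_hospitals = 14 + 15  # Keystone + non-Keystone hospitals

print(total_patients, total_hospitals)
```

The totals agree with the 64 891 patients and 29 hospitals stated for the study cohort.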
Risk-adjusted outcomes in the 14 Keystone hospitals before and after program implementation are shown in the Figure. No significant differences were seen in adjusted rates of any adverse outcome before vs after implementation of Keystone Surgery: 30-day mortality (2.1% vs 1.9%, P = .32), superficial surgical site infection (3.2% vs 3.2%, P = .91), wound complication (5.9% vs 6.5%, P = .30), and any complication (12.4% vs 13.2%, P = .26). Similarly, most adjusted adverse outcome rates did not differ significantly before vs after Keystone Surgery implementation when stratified by procedure type (Table 3).
Table 4 shows the odds of adverse outcomes before and after implementation of the Keystone Surgery program in participating hospitals. No association was present between adjusted odds of adverse outcomes and Keystone Surgery implementation: 30-day mortality (OR, 0.88; 95% CI, 0.68-1.14), superficial surgical site infection (OR, 1.02; 95% CI, 0.76-1.36), wound complication (OR, 1.12; 95% CI, 0.90-1.40), or any complication (OR, 1.09; 95% CI, 0.94-1.27). Difference-in-differences models accounting for competing time trends during the study period did not change these results (Table 4). Sensitivity analysis excluding patients undergoing surgery within 6 or 12 months of the start of Keystone Surgery implementation also yielded similar results.
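The reported odds ratios and confidence intervals can be cross-checked against the reported P values. The sketch below assumes the intervals are Wald intervals that are symmetric on the log-odds scale, which the text does not state, and recovers an approximate two-sided P value from each interval; the function name `p_from_or_ci` is hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile behind a 95% confidence interval


def p_from_or_ci(odds_ratio: float, lower: float, upper: float) -> float:
    """Approximate two-sided P value from an OR and its 95% CI.

    Assumes the interval is symmetric on the log scale (Wald interval).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    z = math.log(odds_ratio) / se
    # erfc(|z| / sqrt(2)) equals the two-sided normal tail probability.
    return math.erfc(abs(z) / math.sqrt(2))


# Odds ratios and 95% CIs as reported in the text above.
reported = {
    "30-day mortality": (0.88, 0.68, 1.14),
    "superficial SSI": (1.02, 0.76, 1.36),
    "wound complication": (1.12, 0.90, 1.40),
    "any complication": (1.09, 0.94, 1.27),
}

for outcome, (or_, lo, hi) in reported.items():
    includes_null = lo <= 1.0 <= hi
    print(f"{outcome}: CI includes 1 = {includes_null}, "
          f"approx P = {p_from_or_ci(or_, lo, hi):.2f}")
```

Every interval contains 1, and under the stated assumption the recovered P values should land close to those reported for the adjusted rates, with small discrepancies expected from rounding of the published estimates.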
In this study, we performed a controlled evaluation of a checklist-based quality improvement intervention—Keystone Surgery—that focused on reducing surgical site infections. We were unable to find a significant association between program implementation and adjusted rates of superficial surgical site infection, wound complication, any complication, and 30-day mortality in patients undergoing general surgery in participating hospitals. This finding was robust across multiple analyses, including a difference-in-differences analysis, stratified analyses of the most common operations, and sensitivity analyses excluding patients undergoing surgery in the first 6 or 12 months after program implementation.
Previous studies evaluating checklist-based interventions have reported mixed results.11-13,15 For example, Hedrick et al11 reported a decrease in surgical site infection rates at a single institution from 25.6% to 15.9% over 2 years following implementation of a checklist-based intervention in patients undergoing colorectal surgery, while Forbes and colleagues12 reported a nonsignificant decrease in rates following a similar intervention and Anthony et al13 reported an increase in rates following a single-center randomized trial. While a recent meta-analysis examining the effects of the World Health Organization surgical safety checklist reported a significant association with improved outcomes,16 the individual studies reviewed were heterogeneous and reported widely mixed results. Furthermore, no studies examined in the meta-analysis used a control group to isolate the effect of the checklist from coincident secular trends toward improved outcomes.
Our study goes beyond this current literature in several important ways. First, we evaluated a diverse statewide population of patients undergoing surgery in many hospitals representing diverse sizes, teaching statuses, and affiliations. Second, our analysis included a control group of hospitals not participating in the program to isolate the effect of the intervention from secular trends toward improved outcomes during the study period. When compared with previous work, these findings highlight the importance of accurate risk adjustment and control cohorts when evaluating effectiveness and are corroborated by similar evaluations of other programs. Benning and colleagues,37 for example, found that, although care improved in UK hospitals during the Health Foundation’s Safer Patients Initiative, there was no additional effect beyond that seen in control hospitals.
This study has multiple limitations. First, use of data from the MSQC limits the study cohort to a subset of Michigan hospitals participating in a statewide organization collaborating for quality improvement. Although use of this clinical registry allowed rigorous adjustment of patient characteristics, it precluded evaluation of the Keystone Surgery program in other Michigan Health & Hospital Association hospitals not participating in the collaborative. Nevertheless, MSQC hospitals provide the vast majority of surgical care delivered in Michigan.38 Furthermore, no systematic prospective quality improvement initiatives targeting superficial surgical site infections beyond outcomes measurement and feedback were implemented during the study period. Second, because we lack detail regarding program compliance at individual hospitals, we cannot explain why the program did not improve outcomes in these hospitals. Finally, hospitals participating in Keystone Surgery volunteered to participate, which may have introduced selection bias. However, these latter limitations do not reduce the internal validity of this study, as this analysis was not designed to evaluate the details of implementation but instead to examine program effectiveness as it was implemented.
Ultimately, a more comprehensive evaluation will be necessary to understand why Keystone Surgery failed to affect surgical outcomes. There are 2 possible reasons the program did not have its intended effect. First, Keystone Surgery may have failed because it encouraged adherence to processes that are not strongly associated with outcomes.17,18,39 However, this study adds to the current literature on SCIP measures by showing that the addition of a previously validated process to improve teamwork and safety culture (Comprehensive Unit-based Safety Program) to SCIP measure processes did not enhance their effectiveness.
A second possible explanation is a failure of the implementation process. Successful implementation of clinical interventions depends not only on high-quality evidence but also a receptive environment and facilitation.40 The Keystone ICU program, for example, was implemented in ICUs, used small teams of nurses and advanced health care professionals, and focused on a single procedure. In contrast, the Keystone Surgery program was implemented in the operating room on a heterogeneous group of complex procedures and engaged diverse teams that underwent frequent personnel changes. It would not be surprising if this increased complexity created an environment less conducive to successful implementation. Moreover, surgical site infections are diverse and complex complications, and it is less likely that a single bundle of interventions can be successfully applied across organizations. Another notable difference between the Keystone Surgery and Keystone ICU programs was that, for Keystone Surgery, many participating sites lacked infrastructure for data collection and outcome feedback to frontline teams—a key attribute of successful improvement efforts.41
Regardless of the underlying cause, the lack of effectiveness observed following Keystone Surgery implementation requires health care professionals to reevaluate how such interventions are designed. Lessons learned from use of the Comprehensive Unit-based Safety Program in Keystone Surgery were subsequently incorporated into the development of a similar program—Surgical Comprehensive Unit-based Safety Program—that was successfully implemented at a single institution in July 2011.42 First, instead of using SCIP process measures, researchers used input from health care professionals to identify local defects with the greatest potential to prevent surgical site infections. Second, during the program, process measures were objectively audited (eg, postoperative temperature was measured to ensure patients were normothermic) rather than given credit for process compliance per se (eg, the current SCIP temperature control measure gives credit for use of a warming blanket regardless of a patient’s temperature). Focused efforts to address and mitigate local defects were associated with reductions in surgical site infection rates following colorectal surgery from 27.3% to 18.2% during a 2-year period.
In this study, we evaluated a checklist-based quality improvement intervention focused on a statewide population of surgical patients. We found that the Keystone Surgery program was not associated with improvements in adjusted rates of adverse outcomes regardless of the cohort evaluated or the methods used. It is unclear whether this outcome was due to a failure of evidence or implementation. Although reasons are likely multifactorial, the experience gained through completion of Keystone Surgery resulted in valuable lessons for implementation of future programs. Going forward, evaluations of similar programs should incorporate both quantitative and qualitative methods to better understand how implementation influences outcomes.43 Given the resources necessary to widely implement programs like Keystone Surgery, it is essential that researchers assess clinical effectiveness before broad dissemination. This study illustrates that success of a program in one clinical context may not translate to others. Instead, each program must be evaluated individually to determine its true clinical effectiveness.
Accepted for Publication: May 27, 2014.
Corresponding Author: Bradley N. Reames, MD, MS, Center for Healthcare Outcomes and Policy, University of Michigan, 2800 Plymouth Rd, Bldg 16, Room 100N-08, Ann Arbor, MI 48109 (firstname.lastname@example.org).
Published Online: January 14, 2015. doi:10.1001/jamasurg.2014.2873.
Author Contributions: Dr Reames had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Reames, Dimick.
Acquisition, analysis, or interpretation of data: Reames, Krell, Campbell.
Drafting of the manuscript: Reames, Krell, Dimick.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Reames, Krell.
Administrative, technical, or material support: Campbell.
Study supervision: Dimick.
Conflict of Interest Disclosures: Dr Dimick reports serving as consultant and having an equity interest in ArborMetrix Inc, which provided software and analytics for measuring hospital quality and efficiency. The company had no role in the study. Dr Campbell is program director for the Michigan Surgical Quality Collaborative. No other disclosures were reported.
Funding/Support: Dr Reames is supported by grant 5T32CA009672-23 from the National Cancer Institute. Dr Dimick is supported by a grant from Blue Cross/Blue Shield of Michigan Foundation.
Role of the Funder/Sponsor: The funding sources had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Disclaimer: Dr Dimick is the Surgical Innovation Editor for JAMA Surgery but was not involved in the editorial review or decision process.