Figure. Hospital 2017 CPOE Performance Score and Improvement in 2018. Dots indicate mean scores; whiskers, 95% CIs; vertical line, cutoff point of 50% between hospitals that received the Full Demonstration and Substantial Demonstration feedback in 2017.

Table 1. Sample Characteristics
Table 2. Impact of Feedback on Hospital Performance Improvement
Table 3. Improvement Differences Across “Basic” and “Advanced” Decision Support Categories
Original Investigation
Health Informatics
September 21, 2021

Association of Hospital Public Quality Reporting With Electronic Health Record Medication Safety Performance

Author Affiliations
  • 1. University of California, San Francisco, San Francisco
  • 2. Brigham and Women’s Hospital, Harvard Medical School, and Harvard T.H. Chan School of Public Health, Boston, Massachusetts
JAMA Netw Open. 2021;4(9):e2125173. doi:10.1001/jamanetworkopen.2021.25173
Key Points

Question  Is receiving publicly reported negative feedback regarding EHR medication safety associated with hospitals improving their performance in the next year?

Findings  In this nonrandomized controlled trial using national data from 1183 hospitals, hospitals that received negative publicly reported feedback improved a statistically significant 8.44 percentage points more on their EHR medication safety performance in the subsequent year compared with hospitals that received positive feedback.

Meaning  These findings suggest that publicly reported quality feedback may be an effective tool to encourage hospital EHR medication safety alerts consistent with current standards.

Abstract

Importance  Despite billions of dollars in public investment, electronic health records (EHRs) have not delivered on the promise of large quality and safety improvement. Simultaneously, there is debate on whether public quality reporting is a useful tool to incentivize quality improvement.

Objective  To evaluate whether publicly reported feedback was associated with hospital improvement in an evaluation of medication-related clinical decision support (CDS) safety performance.

Design, Settings, and Participants  This nonrandomized controlled trial included US hospitals that participated in the Computerized Provider Order Entry (CPOE) Evaluation Tool in the Leapfrog Hospital Survey, a national quality reporting program that evaluates safety performance of hospital CDS using simulated orders and patients, in 2017 to 2018. A sharp regression discontinuity design was used to identify the association of receiving negative feedback with hospital performance improvement in the subsequent year. Data were analyzed from January through September 2020.

Exposures  Publicly reported quality feedback.

Main Outcomes and Measures  The main outcome was improvement from 2017 to 2018 on the Leapfrog CPOE Evaluation Tool, using regression discontinuity model estimates of the association of receiving negative publicly reported feedback with quality improvement.

Results  A total of 1183 hospitals were included, with a mean (SD) CPOE score of 59.3% (16.3%) at baseline. Hospitals receiving negative feedback improved 8.44 (95% CI, 0.09 to 16.80) percentage points more in the subsequent year compared with hospitals that received positive feedback on the same evaluation. This change was driven by differences in improvement in basic CDS capabilities (β = 8.71 [95% CI, 1.67 to 18.73]) rather than advanced CDS (β = 6.15 [95% CI, −9.11 to 26.83]).

Conclusions and Relevance  In this nonrandomized controlled trial, publicly reported feedback was associated with quality improvement, suggesting targeted measurement and reporting of process quality may be an effective policy lever to encourage improvement in specific areas. Clinical decision support represents an important tool in ensuring patient safety and decreasing adverse drug events, especially for complex patients and those with multiple chronic conditions who often receive several different drugs during an episode of care.

Introduction

The US federal government has spent more than $30 billion to digitize the health care system through the adoption of electronic health records (EHRs).1,2 Despite this investment, the promise of EHRs to dramatically improve quality has remained elusive.3 One mechanism by which EHRs were expected to improve quality was the implementation of computerized provider order entry (CPOE) paired with clinical decision support (CDS) tools. Computerization of drug ordering in particular is associated with reduced rates of adverse drug events, which remain a significant source of patient harm.4-6

CPOE allows physicians and other clinicians to write orders for patients electronically, rather than through verbal or written communication. CDS linked with CPOE then uses EHR data about the patient, as well as medication reference databases, to supplement clinician decision-making and prevent potential adverse drug events, such as ordering a drug to which the patient has a documented allergy or that is likely to interact negatively with another drug the patient is taking.7 CDS tools intervene at the point of care and alert clinicians to potential adverse drug events before they happen. However, performance has been mixed, and a significant amount of customization happens at the organization level, resulting in heterogeneity even among hospitals using the same technology.6-9 As a result, while medication-related safety performance has improved, major progress remains to be made, with hospitals correctly alerting clinicians to fewer than two-thirds of potential adverse drug events.8

One potential policy mechanism to incentivize quality improvement is public reporting of performance. Public quality reporting has been theorized to reduce information asymmetries and increase patient welfare while giving low-performing hospitals an incentive to improve.10-13 However, recent evidence has cast doubt on whether national programs are useful tools for identifying high quality,14 and while some programs, such as the Joint Commission, do provide a quality floor that nearly all hospitals must meet, they may not help distinguish higher from lower quality among accredited hospitals, and it is unclear whether public reporting is an effective way to encourage improvement. Public quality reporting may also unfairly penalize hospitals that serve low-income and disadvantaged populations15,16 and may encourage selection against patients who are more seriously ill.17 While hospitals may improve when they participate in these programs,18,19 pay-for-performance initiatives intended to reward high quality through financial incentives have shown mixed results,20 with the potential for hospitals to game these systems and harm patients.21

Little evidence exists regarding the impact of public EHR quality reporting. The largest national evaluation of EHR safety is the CPOE Evaluation Tool of the Leapfrog Hospital Survey, an assessment using simulated patients and orders to evaluate whether the hospital EHR correctly generates CDS alerts for potential adverse drug events.22 The evaluation is derived from historical patients and orders that caused patient harm, and the results of the CPOE evaluation are included as one of many quality measures publicly reported on Leapfrog’s website. Empirical evidence has found that Leapfrog ratings are associated with outcome quality,23 and that performance on the CPOE Evaluation Tool was correlated with lower rates of adverse drug events,24 with some evidence suggesting that hospitals that use the CPOE Evaluation Tool multiple times improve with experience.6 However, to our knowledge, there has been no examination of whether the CPOE Evaluation Tool encourages quality improvement.

We used national data from the Leapfrog CPOE Evaluation Tool to identify the association of publicly reported feedback with EHR medication safety performance. Leveraging a change in the scoring of the CPOE Evaluation Tool in 2017, we used a regression discontinuity design to answer 2 research questions: first, do hospitals that receive negative feedback regarding their safety performance improve more in the subsequent year compared with hospitals that receive positive feedback, and second, do hospitals make those improvements in basic or advanced CDS capabilities?

Methods

This nonrandomized controlled trial was deemed exempt from ethics board review by the University of Utah institutional review board, the institution that facilitated the data collection. The need for informed consent was waived per institutional policy because this study did not involve any real patients or real patient data. This study is reported following the Transparent Reporting of Evaluations with Nonrandomized Designs (TREND) reporting guideline.

Design and Administration of the CPOE Evaluation Tool

The CPOE Evaluation Tool is a test designed by researchers at the Brigham and Women’s Hospital and University of Utah and administered by the Leapfrog Group.22 The CPOE Evaluation Tool is included as part of the Leapfrog Hospital Survey and is one of several process quality measures used by the Leapfrog Group in their evaluation and rating of US hospitals.

The CPOE Evaluation Tool uses simulated patients and medication orders, input into the hospital’s EHR system, that mimic the experience of a clinician writing orders for actual patients to evaluate safety performance. The simulated patients and orders were designed to test the performance of CPOE paired with CDS in preventing the potential adverse drug events most likely to cause serious harm to patients, and the orders were based on real-world incidents of preventable adverse drug events from patients who experienced serious harm or death.9,25 Simulated orders were divided into subcategories based on the type of adverse drug event they represented and grouped into 2 categories: orders with potential adverse events prevented by basic CDS (ie, drug-allergy, drug-route, drug-drug, drug-dose for single doses, and therapeutic duplication) and those that require advanced CDS (ie, drug-laboratory, drug-dose for daily doses, drug-age, drug-diagnosis, and corollary orders).25 The number of orders per hospital varies, and if a medication used in an order is not on the hospital’s formulary or otherwise not prescribable, it is excluded. The primary outcome measure was whether the hospital EHR system correctly generated an alert (either a CDS pop-up alert or a hard stop that prevented the clinician from submitting the order) after the clinician entered an order that could result in an adverse drug event. The overall score, expressed as a percentage, is the number of orders correctly alerted on divided by the total number of orders that should generate alerts.
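
Restating that scoring rule as a formula (our own notation, not Leapfrog’s published definition):

$$\text{Overall score (\%)} \;=\; 100 \times \frac{\text{number of test orders on which the EHR correctly alerted}}{\text{number of test orders that should have generated an alert}}$$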

Administration of the test is performed by a team of hospital representatives at each participating hospital, with a clinician entering test medication orders into the EHR and recording in detail how the EHR responds. All simulated patients and orders in the evaluation are input during a single session. A hospital representative enters those responses into the evaluation tool and is presented with a qualitative feedback score, as well as percentage scores across the subcategories, but not the details of the individual orders the hospital EHR did not correctly alert on, except in the case of orders in which the error would likely result in a fatality. The qualitative feedback scores for the 2017 test were “Full Demonstration of Safety Standards” for hospitals whose overall score was 50% or greater, “Substantial Demonstration of Safety Standards” for hospitals scoring between 30% and 49.99%, and “Some Demonstration of Safety Standards” for hospitals scoring less than 30%. Prior to 2017, the overall score and qualitative feedback were generated using a different formula without sharp cutoff points. To determine whether hospitals are overalerting on safe orders that should not generate alerts, the test also includes several nuisance orders that would not cause any patient harm. If hospitals alert on these orders, the alerts appear in the hospital’s nuisance score but do not impact the overall score.26 To prevent gaming of the test, several control orders that should not generate CDS alerts are included, and the test process is timed so that hospitals cannot take longer than 6 hours to complete the full test and input all of the simulated orders, although most hospitals complete the test in 2 to 3 hours. Hospitals that exceed the time threshold or report alerts on too many control orders are disqualified, although this amounts to less than 1% of hospitals each year. The Leapfrog Group audits hospitals to ensure accuracy.
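
For concreteness, the 2017 cutoff rules map onto a simple conditional. The sketch below is illustrative only, not Leapfrog’s scoring code, and score17 is a hypothetical variable holding the overall score (0-100):

    * A minimal sketch of the 2017 qualitative feedback cutoffs; score17 is a hypothetical variable.
    generate str50 feedback = ""
    replace feedback = "Some Demonstration of Safety Standards"        if score17 < 30
    replace feedback = "Substantial Demonstration of Safety Standards" if score17 >= 30 & score17 < 50
    replace feedback = "Full Demonstration of Safety Standards"        if score17 >= 50 & !missing(score17)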

Data Collection and Sample

The sample included all hospitals that took the Leapfrog CPOE Evaluation Tool in both calendar year 2017 and 2018. These data were merged with American Hospital Association Annual Survey data from 2017 and 2018 to capture hospital demographic information. The final analytic sample included 1183 hospitals in a balanced panel from January 1, 2017 to December 31, 2018.

Statistical Analysis

We calculated descriptive statistics for hospitals in the sample taking the Leapfrog CPOE Evaluation Tool in calendar years 2017 and 2018, including mean overall score and SD in both years as well as mean change in score between years. We also described the sample by hospital characteristics, including size, teaching status, health system membership, rural vs urban location, and region in the United States.

To estimate the association of qualitative feedback on safety performance with subsequent hospital improvement, we used a sharp regression discontinuity design at the 50% cutoff for Full Demonstration feedback. This design compares hospitals that narrowly received the negative Substantial Demonstration feedback with hospitals that narrowly received the Full Demonstration feedback. The identifying assumption was that hospitals on either side of the cutoff were similar across other measures that might impact performance.27-29
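
Under that assumption, the estimand is the jump in expected improvement at the cutoff, written here in our own notation (not taken from the article):

$$\tau \;=\; \lim_{s \uparrow 50} \mathrm{E}\big[\Delta Y \mid S_{2017} = s\big] \;-\; \lim_{s \downarrow 50} \mathrm{E}\big[\Delta Y \mid S_{2017} = s\big],$$

where $\Delta Y$ is the change in overall CPOE Evaluation Tool score from 2017 to 2018 and $S_{2017}$ is the 2017 overall score (the running variable). Hospitals with $S_{2017}$ just below 50% received the Substantial Demonstration feedback, so a positive $\tau$ indicates greater improvement among hospitals receiving negative feedback.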

We estimated the standard ordinary least squares regression discontinuity model,30 in which the dependent variable was overall CPOE Evaluation Tool score change from 2017 to 2018, expressed as percentage points. Our independent variable of interest was a binary indicator for whether the hospital received the Substantial Demonstration feedback, and the running variable was the overall CPOE Evaluation score in 2017. We estimated this model with and without hospital demographic characteristics as controls. We then estimated the local average treatment effect nonparametrically using the local polynomial inference developed by Calonico et al.31-35 This model fits local linear regressions on either side of the cutoff, using a data-driven bandwidth selection procedure to optimize the bias-variance tradeoff. We estimated this model with and without hospital covariates (including hospital size, teaching status, health system membership, rurality, and census region) and with bias-adjusted robust SEs clustered at the hospital level.
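
As a rough illustration of this setup, the sketch below shows how the two models might be specified in Stata with the rdrobust package. The variable names (score17, score_change, hospid, and the precoded covariates beds_cat, teaching, system, rural, and region) are assumptions for illustration, not the authors’ code, and the running variable is recoded so that hospitals receiving the negative feedback (2017 score below 50%) lie above the cutoff.

    * A minimal sketch; variable names are hypothetical.
    * Recode the running variable so hospitals below the 50% cutoff (negative feedback) are the
    * "treated" group and sit to the right of zero.
    gen running = 50 - score17
    gen byte substantial = (score17 < 50)

    * Parametric sharp RD: OLS with separate linear slopes on each side of the cutoff.
    regress score_change i.substantial##c.running, vce(robust)
    regress score_change i.substantial##c.running i.beds_cat i.teaching i.system i.rural i.region, vce(robust)

    * Nonparametric local-linear RD with robust bias-corrected inference, MSE-optimal bandwidth,
    * triangular kernel, and SEs clustered at the hospital level.
    rdrobust score_change running, c(0) p(1) kernel(triangular) bwselect(mserd) vce(cluster hospid)
    rdrobust score_change running, c(0) p(1) kernel(triangular) bwselect(mserd) ///
        covs(beds_cat teaching system rural region) vce(cluster hospid)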

We then examined the mechanism by which improvement occurred by estimating this model separately on basic and advanced CDS. We calculated a score for each component, ranging from 0% to 100% and computed in the same way as the overall score: one for the orders in the basic CDS category and one for the orders in the advanced CDS category. All analyses were conducted in Stata statistical software version 16.1 (StataCorp) with the rdrobust package, with 2-sided α = .05. Data were analyzed from January through September 2020.
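
Continuing the sketch above, the same estimator could be applied to each component-score change; basic_change and advanced_change are hypothetical variable names for the 2017-to-2018 change in the basic and advanced CDS scores.

    * A minimal sketch: the same local-linear RD applied to each assumed component-score change.
    rdrobust basic_change running, c(0) p(1) kernel(triangular) bwselect(mserd)
    rdrobust advanced_change running, c(0) p(1) kernel(triangular) bwselect(mserd)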

To ensure these results were robust to a wide array of possible specifications, we estimated the model 60 different times, varying aspects of the specification including bandwidth selection (using several manual bandwidths as well as different data-driven algorithms), local polynomial order (local linear compared with quadratic),36 inclusion of covariates, kernel choice for weighting observations near the cutoff point (triangular, Epanechnikov, and uniform), and SE calculations. We then plotted the point estimates and 95% CIs for the treatment effect in a specification curve (eFigure 1 in the Supplement).
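
A minimal sketch of how such a specification grid might be generated with rdrobust, looping over kernel, polynomial order, and covariate inclusion and storing each estimate with its robust bias-corrected 95% CI for plotting. It reuses the hypothetical variables above and covers a smaller grid than the authors’ 60 specifications; it illustrates the idea rather than reproducing their exact analysis.

    * A minimal sketch of a specification curve (hypothetical variable names).
    tempname grid
    postfile `grid' str12 kernel p covs est lo hi using spec_curve, replace
    foreach k in triangular epanechnikov uniform {
        foreach p of numlist 1 2 {
            foreach c of numlist 0 1 {
                local cv = cond(`c' == 1, "covs(beds_cat teaching system rural region)", "")
                rdrobust score_change running, c(0) p(`p') kernel(`k') bwselect(mserd) `cv'
                * Store the conventional point estimate and the robust bias-corrected 95% CI.
                post `grid' ("`k'") (`p') (`c') (e(tau_cl)) ///
                    (e(tau_bc) - invnormal(0.975)*e(se_tau_rb)) ///
                    (e(tau_bc) + invnormal(0.975)*e(se_tau_rb))
            }
        }
    }
    postclose `grid'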

We conducted a series of tests to ensure that the assumptions necessary to identify the association were likely to be met. First, to ensure that no manipulation of the running variable was present, we plotted the density of the overall score in 2017 and evaluated whether empirical evidence of manipulation was present using the procedure outlined by Cattaneo et al,37 building on the McCrary test,38 and found no evidence of manipulation (eFigure 2 in the Supplement). We plotted the distribution of the hospital demographics across the running variable (eFigure 3 in the Supplement). We conducted a series of placebo tests estimating the model at alternative cutoff points and found no statistically significant results at any of the placebo cutoffs (eTable 1 in the Supplement). We ran several versions of the model testing EHR vendor effects, including subsamples using only the 3 most commonly used vendors (eTable 2 in the Supplement) as well as including dummy variables to control for vendor (eTable 3 in the Supplement). We also ran the model including previous experience with the Leapfrog CPOE Evaluation Tool (eTable 4 in the Supplement). Finally, to test whether hospital improvement was associated with increased unnecessary nuisance alerts, we used the same sharp regression discontinuity model at the 50% treatment assignment cutoff with nuisance alert score change as the dependent variable and found no evidence that this occurred (eFigure 4 in the Supplement).
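
As a sketch of two of these checks, again with the hypothetical variables above, the density (manipulation) test and the placebo-cutoff exercise might look like the following; rddensity is the Cattaneo, Jansson, and Ma density-test package.

    * A minimal sketch of two robustness checks (hypothetical variable names).
    * 1) Manipulation test for bunching of 2017 scores at the cutoff (running = 50 - score17, cutoff 0).
    rddensity running, c(0)

    * 2) Placebo tests: re-estimate the RD at cutoffs where no feedback threshold exists
    *    (in recoded running-variable units, where 0 corresponds to a 2017 score of 50%).
    foreach pc of numlist -20 -10 10 20 {
        rdrobust score_change running, c(`pc') p(1) kernel(triangular) bwselect(mserd)
    }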

Results
Sample Descriptive Statistics

A total of 1183 hospitals were included, with a mean (SD) 2017 CPOE score of 59.3% (16.3%). Hospital overall scores on the CPOE Evaluation Tool improved to a mean (SD) of 66.5% (14.9%) in 2018. Most hospitals in the sample were medium-sized (ie, 100-399 beds; 721 hospitals [60.9%]), followed by large hospitals with 400 or more beds (247 hospitals [20.9%]) and then small hospitals with fewer than 100 beds (215 hospitals [18.2%]). Most hospitals in the sample were teaching hospitals (674 hospitals [57.0%]), members of a health system (996 hospitals [84.2%]), and located in urban areas (1047 hospitals [88.5%]) (Table 1).

Hospital Improvement in Response to Feedback

We identified a clear discontinuity in hospital improvement in the subsequent year at the cutoff point between Full Demonstration and Substantial Demonstration feedback in 2017 (Figure).39 The ordinary least squares model found a significant association of providing hospitals with the negative Substantial Demonstration feedback with improvement in the subsequent year without covariates (β = 4.71 [95% CI, 1.53 to 7.89]) and with covariates (β = 4.82 [95% CI, 1.65 to 8.01]). The Calonico et al31-35 model also found that hospitals that received the Substantial Demonstration feedback improved 8.44 (95% CI, 0.09-16.80) percentage points more compared with hospitals on the other side of the discontinuity (9.19 [95% CI, 0.36-18.02] percentage points with covariates) (Table 2).

These results were robust to a wide array of modeling choices, shown in the specification curve in eFigure 1 in the Supplement. Of 60 different specifications tested, all but 3 produced a statistically significant result, and the 3 tests that were not significant were directionally consistent with other specifications. The results were also consistent when evaluating subsamples of the 3 largest EHR vendors (eTable 2 in the Supplement), as well as including controls for EHR vendor (eTable 3 in the Supplement) and previous Leapfrog CPOE Evaluation experience (eTable 4 in the Supplement).

Improvement Mechanism

The primary mechanism for the observed change was improvement in the safety performance of alerting on potential adverse drug events classified as basic CDS. Using the same robust, bias-corrected estimator with data-driven bandwidth selection, we observed a statistically significant association of receiving the negative Substantial Demonstration feedback with hospital improvement in basic CDS performance, both without covariates (β = 8.71 [95% CI, 1.67 to 18.73]) and with covariates (β = 8.80 [95% CI, 2.14 to 15.44]). There was no association with improvement in advanced CDS (without covariates: β = 6.15 [95% CI, −9.11 to 26.83]; with covariates: β = 8.73 [95% CI, −8.23 to 31.31]) (Table 3).

Discussion

The US health care system has made an enormous investment in EHRs, but it is not clear this has resulted in significant safety improvements. One policy lever to encourage quality improvement is public reporting, yet few studies have empirically examined the effectiveness of EHR-focused quality reporting, and most existing studies are descriptive reports of the state of CPOE quality. Using a nonrandomized controlled regression discontinuity design, we evaluated whether receiving negative publicly reported feedback in a voluntary EHR safety performance assessment was associated with quality improvement. We found that hospitals that received the negative feedback of Substantial Demonstration of Safety Standards, rather than the positive Full Demonstration of Safety Standards, improved significantly more in the subsequent year. That improvement was driven by safety gains from basic CDS, such as drug-drug or drug-allergy contraindications, rather than advanced CDS, such as corollary orders or daily drug dosing contraindications.

These results contribute to our understanding of hospital quality measurement and publicly reported performance feedback, for which the literature on improvement is mixed. Many studies of hospital response to quality measurement have focused on pay-for-performance initiatives, such as the Hospital Readmission Reduction Program, that target outcome quality. Despite efforts to risk-adjust these programs, outcome quality is only partially within the control of hospitals, and measurement may be noisy.40 The result has been an ongoing debate over whether improvement in these programs could be a result of hospitals selecting against patients who are more seriously ill or otherwise gaming the measures rather than improving quality.21,41,42 In contrast, the Leapfrog CPOE Evaluation Tool is a measure of process quality that evaluates whether an alert is correctly triggered when best practices suggest an order may cause an adverse drug event. While process quality evaluations do not directly measure patient harm, they do accurately measure an aspect of care almost entirely within the control of the hospital.43 Therefore, hospitals are more likely to be able to respond to feedback and improve, rather than having an incentive to simply select patients less likely to negatively impact their scores.

The Leapfrog CPOE Evaluation Tool differs from other quality evaluations in its scope, focusing narrowly on evaluating EHR performance in alerting clinicians to potential adverse drug events. This is in contrast to broad quality programs, like CMS Hospital Compare or the Joint Commission, which are composite measures of many aspects of quality. By providing feedback on a more targeted dimension of quality, hospitals may be better able to quickly identify opportunities for improvement and then act on them to improve their score in the next year. Hospitals that received negative feedback may have allocated more organizational resources to improving EHR medication safety or enabled stricter CDS alerting, thereby realizing nearly immediate quality gains. Our results showing that the mechanism for improved performance was basic decision support, which may be easier to build or enable within a single year, may support this hypothesis. Therefore, policy makers designing quality incentives may wish to consider targeted measures of process quality, like the Leapfrog CPOE Evaluation Tool. Additionally, future research should examine whether improvements in process quality measures, such as the Leapfrog CPOE Evaluation, translate into improvements in outcome quality measures, such as adverse drug event rates, mortality, and patient experience.

Limitations

This study has several limitations. First, regression discontinuity models estimate a local average treatment effect, and hospitals in different parts of the quality distribution may have different responses to feedback. Because so few hospitals scored less than 35% in 2017, we were unable to evaluate the association of receiving lower scores with subsequent quality improvement. Second, while the Leapfrog CPOE Evaluation Tool has been associated with actual reductions in preventable adverse drug events,24 it is a measure of process quality rather than outcome quality, and higher scores on the evaluation may not necessarily translate into better outcomes; moreover, the most recent study to assess the association between Leapfrog CPOE Evaluation score and rates of adverse drug events in hospitals was published in 2013.24 Additionally, the Leapfrog CPOE Evaluation Tool does not capture forms of CDS that happen upstream of the order, such as the use of condition-specific order sets with appropriate defaults. Third, the Leapfrog CPOE Evaluation Tool is voluntary, and there is likely selection into the sample, in that hospitals that choose to participate in quality measurement are more motivated and more likely to respond to negative feedback and improve. Fourth, while our analysis of nuisance alert scoring suggests that hospitals are not improving by burdening clinicians with low-value alerts, it is important to balance safety features against their potential impact on clinician well-being through alert fatigue and burnout.44 Fifth, because our identification takes advantage of a sharp cutoff in scoring that was new in 2017, we were only able to use 2 years of data rather than the full history of Leapfrog CPOE evaluations.45,46

Conclusions

This nonrandomized controlled trial using data from a national evaluation of EHR medication safety found that hospitals that received publicly reported negative feedback improved quality in the subsequent year by 8.4 percentage points more than those that received positive feedback. This improvement was driven by gains in basic CDS rather than more advanced CDS capabilities. Despite this progress, there is still considerable room for improvement, with few hospitals receiving a perfect score, suggesting that all hospitals may benefit from continued assessment of this type. These results suggest that publicly reported feedback on specific dimensions of quality may lead to improvement.

Back to top
Article Information

Accepted for Publication: July 13, 2021.

Published: September 21, 2021. doi:10.1001/jamanetworkopen.2021.25173

Open Access: This is an open access article distributed under the terms of the CC-BY License. © 2021 Holmgren AJ et al. JAMA Network Open.

Corresponding Author: A. Jay Holmgren, PhD, MHI, University of California, San Francisco, 10 Koret Way, Office 327A, San Francisco, CA 94131 (a.holmgren@ucsf.edu).

Author Contributions: Dr Holmgren had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Concept and design: All authors.

Acquisition, analysis, or interpretation of data: All authors.

Drafting of the manuscript: Holmgren.

Critical revision of the manuscript for important intellectual content: All authors.

Statistical analysis: Holmgren.

Conflict of Interest Disclosures: Dr Bates reported receiving grants from EarlySense and IBM Watson; personal fees from CDI Negev, ValeraHealth, and FeelBetter; owning equity in CLEW, MDClone, and serving as a consultant for Leapfrog outside the submitted work. No other disclosures were reported.

References
1. Blumenthal D. Launching HITECH. N Engl J Med. 2010;362(5):382-385. doi:10.1056/NEJMp0912825
2. Corrigan JM. Crossing the quality chasm. In: Reid PP, Compton WD, Grossman JH, et al, eds. Building a Better Delivery System: A New Engineering/Health Care Partnership. National Academies Press; 2005.
3. Halamka JD, Tripathi M. The HITECH era in retrospect. N Engl J Med. 2017;377(10):907-909. doi:10.1056/NEJMp1709851
4. Bates DW. Preventing medication errors: a summary. Am J Health Syst Pharm. 2007;64(14)(suppl 9):S3-S9. doi:10.2146/ajhp070190
5. Bates DW, Leape LL, Cullen DJ, et al. Effect of computerized physician order entry and a team intervention on prevention of serious medication errors. JAMA. 1998;280(15):1311-1316. doi:10.1001/jama.280.15.1311
6. Holmgren AJ, Co Z, Newmark L, Danforth M, Classen D, Bates D. Assessing the safety of electronic health records: a national longitudinal study of medication-related decision support. BMJ Qual Saf. 2020;29(1):52-59. doi:10.1136/bmjqs-2019-009609
7. Kuperman GJ, Bobb A, Payne TH, et al. Medication-related clinical decision support in computerized provider order entry systems: a review. J Am Med Inform Assoc. 2007;14(1):29-40. doi:10.1197/jamia.M2170
8. Classen DC, Holmgren AJ, Co Z, et al. National trends in the safety performance of electronic health record systems from 2009 to 2018. JAMA Netw Open. 2020;3(5):e205547. doi:10.1001/jamanetworkopen.2020.5547
9. Chaparro JD, Classen DC, Danforth M, Stockwell DC, Longhurst CA. National trends in safety performance of electronic health record systems in children’s hospitals. J Am Med Inform Assoc. 2017;24(2):268-274. doi:10.1093/jamia/ocw134
10. Dranove D, Satterthwaite MA. Monopolistic competition when price and quality are imperfectly observable. RAND J Econ. 1992;23(4):518-534. doi:10.2307/2555903
11. Werner RM, Bradlow ET. Relationship between Medicare’s hospital compare performance measures and mortality rates. JAMA. 2006;296(22):2694-2702. doi:10.1001/jama.296.22.2694
12. Clarke CA, Asch SM, Baker L, et al. Public reporting of hospital-level cancer surgical volumes in California: an opportunity to inform decision making and improve quality. J Oncol Pract. 2016;12(10):e944-e948. doi:10.1200/JOP.2016.010819
13. Bardach NS, Hibbard JH, Greaves F, Dudley RA. Sources of traffic and visitors’ preferences regarding online public reports of quality: web analytics and online survey results. J Med Internet Res. 2015;17(5):e102. doi:10.2196/jmir.3637
14. Lam MB, Figueroa JF, Feyman Y, Reimold KE, Orav EJ, Jha AK. Association between patient outcomes and accreditation in US hospitals: observational study. BMJ. 2018;363:k4011. doi:10.1136/bmj.k4011
15. Wan W, Liang CJ, Duszak R Jr, Lee CI. Impact of teaching intensity and sociodemographic characteristics on CMS Hospital Compare quality ratings. J Gen Intern Med. 2018;33(8):1221-1223. doi:10.1007/s11606-018-4442-6
16. Fahrenbach J, Chin MH, Huang ES, Springman MK, Weber SG, Tung EL. Neighborhood disadvantage and hospital quality ratings in the Medicare Hospital Compare program. Med Care. 2020;58(4):376-383. doi:10.1097/MLR.0000000000001283
17. Dranove D, Kessler D, McClellan M, Satterthwaite M. Is more information better: the effects of “report cards” on health care providers. J Polit Econ. 2003;111(3):555-588. doi:10.1086/374180
18. Lindenauer PK, Remus D, Roman S, et al. Public reporting and pay for performance in hospital quality improvement. N Engl J Med. 2007;356(5):486-496. doi:10.1056/NEJMsa064964
19. Bogh SB, Falstie-Jensen AM, Hollnagel E, Holst R, Braithwaite J, Johnsen SP. Improvement in quality of hospital care during accreditation: a nationwide stepped-wedge study. Int J Qual Health Care. 2016;28(6):715-720. doi:10.1093/intqhc/mzw099
20. Werner RM, Kolstad JT, Stuart EA, Polsky D. The effect of pay-for-performance in hospitals: lessons for quality improvement. Health Aff (Millwood). 2011;30(4):690-698. doi:10.1377/hlthaff.2010.1277
21. Gupta A. Impacts of performance pay for hospitals: the Readmissions Reduction Program. SSRN. Published online 2017. doi:10.2139/ssrn.3054172
22. The Leapfrog Group. Survey overview. Accessed February 24, 2019. https://www.leapfroggroup.org/survey-materials/survey-overview
23. Jha AK, Orav EJ, Ridgway AB, Zheng J, Epstein AM. Does the Leapfrog program help identify high-quality hospitals? Jt Comm J Qual Patient Saf. 2008;34(6):318-325. doi:10.1016/S1553-7250(08)34040-9
24. Leung AA, Keohane C, Lipsitz S, et al. Relationship between medication event rates and the Leapfrog computerized physician order entry evaluation tool. J Am Med Inform Assoc. 2013;20(e1):e85-e90. doi:10.1136/amiajnl-2012-001549
25. Metzger J, Welebob E, Bates DW, Lipsitz S, Classen DC. Mixed results in the safety performance of computerized physician order entry. Health Aff (Millwood). 2010;29(4):655-663. doi:10.1377/hlthaff.2010.0160
26. Co Z, Holmgren AJ, Classen DC, et al. The tradeoffs between safety and alert fatigue: data from a national evaluation of hospital medication-related clinical decision support. J Am Med Inform Assoc. 2020;27(8):1252-1258. doi:10.1093/jamia/ocaa098
27. Maciejewski ML, Basu A. Regression discontinuity design. JAMA. 2020;324(4):381-382. doi:10.1001/jama.2020.3822
28. Guduguntla V, McWilliams JM. Exploiting clinical decision-making thresholds to recover causal effects from observational data: randomization without trials. JAMA Intern Med. 2021;181(6):774-775. doi:10.1001/jamainternmed.2021.0923
29. Moscoe E, Bor J, Bärnighausen T. Regression discontinuity designs are underutilized in medicine, epidemiology, and public health: a review of current and best practice. J Clin Epidemiol. 2015;68(2):122-133. doi:10.1016/j.jclinepi.2014.06.021
30. Gelman A, Hill J. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press; 2007.
31. Calonico S, Cattaneo MD, Farrell MH, Titiunik R. rdrobust: software for regression-discontinuity designs. Stata J. 2017;17(2):372-404. doi:10.1177/1536867X1701700208
32. Calonico S, Cattaneo MD, Titiunik R. Robust nonparametric confidence intervals for regression-discontinuity designs. Econometrica. 2014;82(6):2295-2326. doi:10.3982/ECTA11757
33. Calonico S, Cattaneo MD, Farrell MH. Optimal bandwidth choice for robust bias-corrected inference in regression discontinuity designs. Econom J. 2020;23(2):192-210. doi:10.1093/ectj/utz022
34. Calonico S, Cattaneo MD, Farrell MH, Titiunik R. Regression discontinuity designs using covariates. Rev Econ Stat. 2019;101(3):442-451. doi:10.1162/rest_a_00760
35. Imbens G, Kalyanaraman K. Optimal bandwidth choice for the regression discontinuity estimator. Rev Econ Stud. 2012;79(3):933-959. doi:10.1093/restud/rdr043
36. Gelman A, Imbens G. Why high-order polynomials should not be used in regression discontinuity designs. J Bus Econ Stat. 2019;37(3):447-456. doi:10.1080/07350015.2017.1366909
37. Cattaneo MD, Jansson M, Ma X. Manipulation testing based on density discontinuity. Stata J. 2018;18(1):234-261. doi:10.1177/1536867X1801800115
38. McCrary J. Manipulation of the running variable in the regression discontinuity design: a density test. J Econom. 2008;142(2):698-714. doi:10.1016/j.jeconom.2007.05.005
39. Cattaneo MD, Crump RK, Farrell MH, Feng Y. On Binscatter. arXiv. Preprint posted online February 25, 2019. Accessed April 19, 2021. https://arxiv.org/abs/1902.09608
40. Sheetz KH, Ryan A. Accuracy of quality measurement for the Hospital Acquired Conditions Reduction Program. BMJ Qual Saf. 2020;29(7):605-607. doi:10.1136/bmjqs-2019-009747
41. Ody C, Msall L, Dafny LS, Grabowski DC, Cutler DM. Decreases in readmissions credited to Medicare’s program to reduce hospital readmissions have been overstated. Health Aff (Millwood). 2019;38(1):36-43. doi:10.1377/hlthaff.2018.05178
42. Doran T, Maurer KA, Ryan AM. Impact of provider incentives on quality and value of health care. Annu Rev Public Health. 2017;38(1):449-465. doi:10.1146/annurev-publhealth-032315-021457
43. Donabedian A. Evaluating the quality of medical care. 1966. Milbank Q. 2005;83(4):691-729. doi:10.1111/j.1468-0009.2005.00397.x
44. Adler-Milstein J, Zhao W, Willard-Grace R, Knox M, Grumbach K. Electronic health records and burnout: time spent on the electronic health record after hours and message volume associated with exhaustion but not with cynicism among primary care clinicians. J Am Med Inform Assoc. 2020;27(4):531-538. doi:10.1093/jamia/ocz220
45. Scanlon DP, Christianson JB, Ford EW. Hospital responses to the Leapfrog Group in local markets. Med Care Res Rev. 2008;65(2):207-231. doi:10.1177/1077558707312499
46. Moran J, Scanlon D. Slow progress on meeting hospital safety standards: learning from the Leapfrog Group’s efforts. Health Aff (Millwood). 2013;32(1):27-35. doi:10.1377/hlthaff.2011.0056