Each hospital is represented by the point estimate for its adjusted odds ratio, along with 95% confidence intervals (error bars). High-quality outliers are highlighted in blue, low-quality outliers in red, and the identity of the hospital receiving the report in green. A, All-cause trauma; B, blunt trauma; C, gunshot wound trauma. A similar report card was sent to hospitals that used International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes for injury coding based on an ICD-9-CM–based version of the Trauma Mortality Probability Model (not shown).
Point estimates for hospital-adjusted odds ratio for all trauma cases vs specific injury mechanism. A, All-cause trauma vs blunt trauma; B, all-cause trauma vs gunshot wound (GSW) trauma; C, all-cause trauma vs pedestrian trauma; D, all-cause trauma vs motor vehicle accident trauma; and E, all-cause trauma vs high-risk trauma (mortality risk >25%). The 95% confidence intervals (error bars) around the odds ratio are shown when they do not include 1.
Glance LG, Mukamel DB, Meredith W, Dick AW. Hospital Performance in Caring for Injured Patients: Does the Type of Injury Make a Difference? Arch Surg. 2009;144(12):1121–1126. doi:10.1001/archsurg.2009.218
Objective
To determine whether quality measures based on injury-specific models provide a different perspective about relative hospital rankings compared with a single outcome measure based on all trauma patients.
Design
We customized the Trauma Mortality Probability Model to create separate injury-specific models for patients who sustained blunt trauma, gunshot wounds, pedestrian trauma, or motor vehicle accident trauma.
Setting
This analysis was conducted using the National Trauma Data Bank. We limited the study to hospitals with 250 or more trauma admissions per year, which coded more than 90% of patients.
Patients
The final data set included 54 859 patients admitted to 44 hospitals.
Main Outcome Measures
We performed hospital-level analyses to examine the correlation between hospital risk-adjusted mortality measures based on all trauma patients vs quality measures based on injury-specific measures.
Results
The analysis of the intraclass correlation coefficients suggests fair-to-substantial agreement (0.39-0.68) between the hospital-adjusted odds ratios based on all patients vs odds ratios based on specific injuries. κ Analysis demonstrated poor-to-fair agreement between hospital categorical quality measures (high, intermediate, and low quality) when hospital quality was based on outcomes for all trauma patients vs specific subgroups of patients (0.0-0.38). However, none of the hospitals classified as high quality, based on data from all trauma patients, was found to be low quality for any specific injury populations.
Conclusion
A single composite measure based on all injured patients may not capture all the differences in hospital quality across different populations of injured patients.
Even among level I trauma centers, there is considerable variability in outcomes.1 Improving patient population outcomes could be facilitated by a collaborative quality improvement model2 in which hospitals use feedback on their risk-adjusted outcomes, coupled with information on best practices, to improve their performance. Although selective referral to high-quality hospitals has been advocated for high-risk surgery,3 trauma care is already heavily regionalized: severely injured patients are typically triaged to level I and II centers that have the resources to care for such patients. Further redistribution of patients based on evidence-based selective referral of patients to high-quality centers may not be practical. Many trauma patients currently do not have timely access to a single level I or II trauma center,4 let alone the option of choosing among several different centers. Furthermore, a policy that seeks to divert patients to only the highest-quality centers is likely to be disruptive and impractical.5 Finally, many hospitals are already faced with the economic burden of caring for trauma patients6 and might like nothing better than to eliminate their trauma services. A policy of selective referral could further limit access to trauma centers for many patients and lead to worse outcomes by increasing travel distances for seriously injured patients.
We received funding from the Agency for Healthcare Research and Quality to create a hospital trauma report card based on risk-adjusted mortality measures for hospitals that contribute data to the National Trauma Data Bank (NTDB). The Survival Measurement and Reporting Trial for Trauma (SMARTT) study, in collaboration with the American College of Surgeons Committee on Trauma, will test whether providing hospitals with confidential feedback on their trauma outcomes leads to improvement of those outcomes. These nonpublic hospital reports are based on a new risk-adjustment model called the Trauma Mortality Probability Model (TMPM).7 This new model is based on empirical estimates of injury severity, in contrast to the injury severity score,8 which uses expert-based consensus to assign injury severities for each injury code. Besides reporting on all-cause trauma mortality, we also include in the report card risk-adjusted mortality measures for several different categories of trauma, including blunt trauma, gunshot wounds, pedestrian trauma, and motor vehicle accident trauma, and patients with a mortality risk greater than 25%. A priori, we decided that providing trauma hospitals with this additional information about outcomes would allow them to better target their quality improvement efforts than if they received information only on their overall risk-adjusted mortality. Previous studies9,10 have suggested that outcomes across high-risk surgery, such as coronary artery bypass grafting and valve surgery, are correlated because hospital structural variables and process measures that impact 1 procedure may also impact outcomes for related procedures. We hypothesize that this would also be true for trauma. However, we also believe that it is unlikely that a measure of global performance would have sufficient granularity to allow trauma centers to precisely assess their performance in specific types of trauma. 
In this article, we report the extent to which hospital performance for patients with specific injuries correlates with overall trauma outcomes.
We used the NTDB (version 7.0) to study patients admitted with traumatic injuries in 2006. The NTDB was created and is managed by the American College of Surgeons to serve as the “principal national repository for trauma center registry data.”11(p6) The data elements in the NTDB include patient demographics, hospital demographics, Abbreviated Injury Score (AIS) codes, mechanism of injury (based on International Classification of Diseases, Ninth Revision, Clinical Modification12 codes), encrypted hospital identifiers, physiology values, and outcomes.
We limited the study to hospitals with 250 or more trauma admissions per year, which coded more than 90% of patients: 147 163 patients in 106 hospitals. We excluded hospitals that were missing or had invalid data on patient outcome, Ecodes, demographic information, and physiologic information (motor component of the Glasgow Coma Scale and blood pressure) on more than 0.5%, 2%, 5%, and 20% of their patients, respectively, which resulted in a data set of 62 606 patients from 46 hospitals. We also eliminated hospitals that transferred more than 25% of their patients, which left us with 44 hospitals. Patients with burns or nontrauma diagnoses (eg, poisoning, drowning, suffocation) (n=3314), missing or invalid AIS codes (n=2405), missing or invalid data for age, sex, or outcome (n=98), or age younger than 1 year (n=271) were excluded. Patients who were dead on arrival (n=278) or transferred to another facility (n=1115) were also excluded.
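The hospital-level inclusion rules described above can be sketched as a simple filter. This is a minimal illustration, not the code used in the study: the thresholds are taken from the text, whereas the `HospitalSummary` record, its field names, and the `is_eligible` function are hypothetical. The requirement that hospitals "coded more than 90% of patients" is left out because the text does not specify what was coded.

```python
from dataclasses import dataclass

# Maximum tolerated fraction of patients with missing or invalid data,
# per data category (thresholds from the text; key names are hypothetical).
MAX_MISSING = {
    "outcome": 0.005,       # patient outcome: more than 0.5% excludes
    "ecode": 0.02,          # Ecodes: more than 2% excludes
    "demographics": 0.05,   # demographic information: more than 5% excludes
    "physiology": 0.20,     # motor GCS and blood pressure: more than 20% excludes
}
MIN_ANNUAL_ADMISSIONS = 250
MAX_TRANSFER_FRACTION = 0.25


@dataclass
class HospitalSummary:
    """Hypothetical per-hospital summary of data completeness."""
    hospital_id: str
    admissions: int
    missing_fraction: dict   # category name -> fraction of patients affected
    transfer_fraction: float


def is_eligible(h: HospitalSummary) -> bool:
    """Return True if the hospital passes every inclusion rule sketched here."""
    if h.admissions < MIN_ANNUAL_ADMISSIONS:
        return False
    if h.transfer_fraction > MAX_TRANSFER_FRACTION:
        return False
    # A hospital is excluded as soon as one category exceeds its threshold.
    return all(
        h.missing_fraction.get(category, 0.0) <= limit
        for category, limit in MAX_MISSING.items()
    )
```

Applying such a filter to every hospital before any patient-level exclusions mirrors the order of steps given in the text.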
Patients older than 89 years are coded using an indicator variable in the NTDB and were assigned the value of 95 years for their age. Ecodes were mapped to 1 of 6 injury mechanisms7: gunshot wound, stab wound, low fall, blunt injury, motor vehicle accident, and pedestrian injury. The final data set included 54 859 patients admitted to 44 hospitals (some patients were excluded from the analysis owing to more than 1 exclusion criterion).
We use hierarchical logistic regression modeling to estimate the effect of hospital quality on in-hospital mortality after adjusting for patient case mix and severity of disease. The development and validation of the TMPM have been previously described.7 The TMPM is based on empirical measures of injury severity estimated using the NTDB, in contrast to the injury severity score, which relies on AIS severity scores assigned by experts. We augmented the TMPM by including information on patient age, sex, mechanism of injury, the motor component of the Glasgow Coma Scale, systolic blood pressure, and transfer-in status. Multiple imputation was used to impute missing values of the motor component of the Glasgow Coma Scale and systolic blood pressure, using the STATA implementation13 of the method of multiple imputation described by van Buuren et al.14 Some of us15 had previously used Monte Carlo simulation to demonstrate that hospital quality measures based on the TMPM with imputed data are nearly identical to those based on a data set without missing values. The method of fractional polynomials was used to determine the optimal transformation for age.16 Model discrimination was evaluated using the C statistic, and calibration was evaluated by means of the Hosmer-Lemeshow statistic.17,18
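For readers less familiar with the C statistic used to evaluate discrimination, the following sketch shows one way it can be computed from predicted mortality risks and observed outcomes. It is a brute-force pairwise illustration with a hypothetical function name, not the STATA routine used in this study.

```python
def c_statistic(predicted_risk, died):
    """Probability that a randomly chosen patient who died was assigned a
    higher predicted risk than a randomly chosen survivor (ties count 1/2)."""
    deaths = [p for p, d in zip(predicted_risk, died) if d]
    survivors = [p for p, d in zip(predicted_risk, died) if not d]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")

    concordant = 0.0
    for risk_death in deaths:
        for risk_survivor in survivors:
            if risk_death > risk_survivor:
                concordant += 1.0
            elif risk_death == risk_survivor:
                concordant += 0.5
    return concordant / (len(deaths) * len(survivors))
```

A value of 0.5 corresponds to discrimination no better than chance, and a value of 1.0 to perfect separation of deaths from survivors. The pairwise loop is quadratic in the number of patients, so a rank-based formula would be preferable for a data set of this size.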
Hospital effects were obtained by exponentiation of the hospital shrinkage estimators to obtain an adjusted odds ratio to quantify hospital performance. Shrinkage estimators based on the imputed data sets were combined using Rubin's rule.18 The hospital odds ratio is a measure of the likelihood that a patient treated at a specific hospital will die compared with the likelihood that the patient would die if treated at an average hospital. Hierarchical modeling explicitly quantifies hospital performance, adjusts for the clustering of outcomes within hospitals, and avoids the problem of regression to the mean.19 In the context of health care, regression to the mean signifies that hospitals identified in the initial report as offering services of extreme quality are likely to be described as having less extreme quality in a subsequent report. Separate models were created for patients who sustained blunt trauma, gunshot wounds, pedestrian trauma, or trauma associated with a motor vehicle accident, as well as for a high-risk group whose risk of mortality was greater than 25%, by customizing the TMPM to these patient populations. Hospitals whose adjusted odds ratios were significantly greater than 1 (95% confidence interval did not include 1) were classified as low-quality outliers. Conversely, hospitals whose adjusted odds ratio was significantly less than 1 were classified as high-quality outliers.
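The steps in this paragraph (pooling the shrinkage estimates across imputed data sets, exponentiating, and flagging outliers) can be sketched as follows. This is an illustration under stated assumptions rather than the study's implementation: the function names are hypothetical, the hospital effect is assumed to be on the log-odds scale with a known variance in each imputed data set, and the 95% confidence interval uses a normal approximation (z = 1.96) instead of the t reference distribution that accompanies Rubin's rule.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine one hospital's log-odds estimates from m imputed data sets.

    Returns the pooled estimate and its total variance:
    total = within + (1 + 1/m) * between.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from >= 2 imputations")
    pooled = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total


def classify_hospital(log_odds, total_variance, z=1.96):
    """Exponentiate the pooled effect and label the hospital by its 95% CI."""
    se = math.sqrt(total_variance)
    odds_ratio = math.exp(log_odds)
    lower = math.exp(log_odds - z * se)
    upper = math.exp(log_odds + z * se)
    if lower > 1:
        label = "low-quality outlier"    # odds of death significantly above average
    elif upper < 1:
        label = "high-quality outlier"   # odds of death significantly below average
    else:
        label = "intermediate"
    return odds_ratio, (lower, upper), label
```

Under this sketch, a hospital is labeled an outlier only when its entire confidence interval lies on one side of 1, which matches the classification rule stated in the text.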
We performed 2 different hospital-level analyses to examine the level of agreement between the risk-adjusted measures of hospital quality based on all patients vs subgroups of injury patients. We assessed the level of agreement for (1) adjusted odds ratios using the intraclass correlation coefficient and (2) categorical measures of hospital quality (high, intermediate, and low quality) using the κ statistic. All statistical analyses were performed using STATA SE/MP statistical software, version 10.0 (StataCorp, College Station, Texas).
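As an illustration of the second analysis, the sketch below computes an unweighted Cohen κ for two sets of categorical quality labels assigned to the same hospitals. The text does not state whether a weighted or unweighted κ was used, so the unweighted form is an assumption here, and the function name is hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_all_trauma, labels_injury_specific):
    """Unweighted Cohen kappa for two categorical ratings of the same hospitals."""
    n = len(labels_all_trauma)
    if n == 0 or n != len(labels_injury_specific):
        raise ValueError("both ratings must cover the same non-empty set of hospitals")

    # Observed proportion of hospitals given the same label by both measures.
    observed = sum(
        a == b for a, b in zip(labels_all_trauma, labels_injury_specific)
    ) / n

    # Agreement expected by chance from the marginal label frequencies.
    counts_a = Counter(labels_all_trauma)
    counts_b = Counter(labels_injury_specific)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)
```

Because most hospitals fall into the intermediate category, chance agreement is high, and κ can be low even when few hospitals change category; this is worth keeping in mind when comparing κ with the intraclass correlation coefficient.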
The study cohort consisted of 54 859 patients in 44 hospitals. The median age was 40 years, and 65% were male. Thirty-seven percent of the patients had sustained blunt trauma, 29% had been in motor vehicle accidents, 13% had experienced low falls, 11% had experienced pedestrian trauma, 7% had sustained gunshot wounds, and 2% had experienced stab injuries. The overall mortality rate was 4%.
The discrimination (C statistic) and calibration (Hosmer-Lemeshow statistic) of the global injury model and of each of the injury-specific models are given in Table 1. All models had good-to-excellent discrimination. Model calibration was acceptable given the size of the data sets.
Part of a sample report card sent to 1 of the SMARTT hospitals that coded injuries using AIS codes is shown in Figure 1. A similar report card was sent to hospitals that used ICD-9-CM codes for injury coding on the basis of an ICD-9-CM–based version of the TMPM (not shown). Figure 2 shows the level of agreement between the hospital-adjusted odds ratios based on all injuries vs specific injury mechanisms. The 45° line represents the identity line; hospitals that fall on this line have identical odds ratios based on all trauma cases vs a specific injury mechanism. Each graph is divided into 4 quadrants. Hospitals that fall into the left-upper quadrant or right-lower quadrant have adjusted odds ratios that suggest discordant hospital performance in treatment of all patients vs a subgroup of trauma patients. In all 5 comparisons, only a few hospitals had discordant findings for quality. Most hospitals with odds ratios greater than 1 for all trauma tended to have odds ratios greater than 1 for injury-specific measures. Similarly, most hospitals with odds ratios less than 1 for all trauma also tended to have odds ratios less than 1 for injury-specific measures.
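The quadrant reading of Figure 2 amounts to checking whether the two odds ratios for a hospital lie on opposite sides of 1. A minimal sketch, with hypothetical function names and example values, follows; it uses point estimates only and ignores confidence intervals.

```python
def is_discordant(or_all_trauma, or_injury_specific):
    """True when one odds ratio is above 1 and the other is below 1,
    i.e. the hospital falls in the left-upper or right-lower quadrant."""
    return (or_all_trauma - 1) * (or_injury_specific - 1) < 0


def count_discordant(pairs):
    """Count discordant hospitals in a list of (all-trauma, injury-specific) pairs."""
    return sum(is_discordant(a, b) for a, b in pairs)
```

For example, `count_discordant([(0.8, 1.2), (1.3, 1.1), (0.7, 0.9)])` counts only the first hospital, whose all-trauma odds ratio is below 1 while its injury-specific odds ratio is above 1.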
Table 2 displays the results of these pairwise comparisons based on the intraclass correlation coefficient and the κ statistic. In these analyses, we excluded the lowest-quality hospital because of its extreme outlier status. However, we include this hospital in Figure 2. The analysis of the intraclass correlation coefficients suggests fair-to-substantial agreement in the point estimates for the hospital-adjusted odds ratios. The κ analysis demonstrated poor-to-fair agreement among hospital quality measures when hospital quality was based on outcomes for all trauma patients vs specific subgroups of patients. (The extent of agreement as measured by either the κ statistic or the intraclass correlation coefficient is interpreted as follows: less than 0.0, poor agreement; 0.0 to 0.20, slight agreement; 0.21 to 0.40, fair agreement; 0.41 to 0.60, moderate agreement; 0.61 to 0.80, substantial agreement; and 0.81 to 1.0, almost-perfect agreement.20) However, none of the hospitals assessed to be high quality for all patients were found to be low quality for specific injury populations. Similarly, none of the hospitals judged to be low quality for all trauma were found to be high quality for specific injury populations.
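The agreement bands quoted in parentheses above can be expressed as a small lookup. In this sketch the function name is hypothetical, and each band's upper bound is treated as inclusive, which is an assumption because the quoted ranges leave the exact boundaries ambiguous.

```python
def agreement_label(value):
    """Map a kappa or intraclass correlation value to a verbal agreement band."""
    if value > 1:
        raise ValueError("agreement statistics cannot exceed 1")
    if value < 0:
        return "poor"
    # Upper bounds are treated as inclusive (assumption).
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper_bound, label in bands:
        if value <= upper_bound:
            return label
```

With these cut points, the reported intraclass correlation range of 0.39 to 0.68 spans "fair" to "substantial", and the upper end of the reported κ range, 0.38, falls in "fair".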
We found that there is substantial variability in survival of patients hospitalized with traumatic injuries across hospitals. In this cohort of hospitals in the NTDB, the odds of patients dying in the worst-performing hospitals were at least 50% greater than their odds of dying in the average hospital. By contrast, the odds of patients dying at the highest-performing hospitals were 40% less than their odds of dying in an average hospital. Although performance measurement is a “transformative tool” in the quest to improve health care quality, it remains “in a relatively early stage of development and implementation.”21(p1800) However, even at this early stage, we know from the Veterans Affairs National Surgical Quality Improvement Program that nonpublic reporting of surgical outcomes has contributed to the 47% reduction in 30-day postoperative mortality in Veterans Affairs facilities.22 In light of the significant differences in outcomes across hospitals that care for injured patients, it is likely that there are substantial opportunities for improving overall trauma population outcomes.
Most current outcome report cards provide separate feedback for each surgical procedure,23-25 although there are efforts under way to design composite performance measures.26 In designing a report card based on the NTDB, we decided to create separate risk-adjusted measures for patients with blunt trauma, gunshot wounds, pedestrian trauma, motor vehicle accident trauma, and severe injury, in addition to a single composite measure based on all trauma patients. Providing hospitals with feedback on patients with specific injury mechanisms may allow hospitals to more effectively target their quality improvement efforts than if they only receive feedback on all injuries in a single composite measure. For example, patients who sustain blunt trauma are more likely than patients with gunshot wounds to be treated nonoperatively. As a result, best practices for patients with blunt trauma may not be the same as those for patients with gunshot wounds. For a specific hospital, average outcomes across all injuries may mask opportunities for improvement in patients with a specific injury mechanism. To test the usefulness of this reporting approach, we assessed the amount of overlap between hospital quality measures based on all trauma patients vs those that reported on only patients with specific injuries.
Our analysis demonstrates fair-to-substantial agreement in the point estimates of hospital performance based on all injuries vs those based on specific injury mechanisms. However, when hospitals were divided into high-performance, average-performance, and low-performance hospitals, the level of agreement was poor to fair. Even so, none of the hospitals classified as high quality, based on all trauma patients, was classified as low quality in any of the specific patient populations. Similarly, none of the hospitals classified as low quality based on data from all trauma patients was classified as high quality for specific patient populations. These findings suggest that a single composite measure based on all injured patients may not capture all the differences in hospital quality across different populations of injured patients. It is reasonable to assume that feedback on overall outcomes, combined with feedback on specific injury groups, is more likely to stimulate focused interventions than presentation of feedback only on all trauma cases. Composite measures, on the other hand, may be useful insofar as they present a summary measure of performance that may be easier to interpret than separate measures.
Other studies that involve patients who had undergone general, thoracic, oncologic, vascular,10 and cardiac surgery9 and patients with acute myocardial infarction, congestive heart failure, pneumonia, stroke, obstructive lung disease, or gastrointestinal hemorrhage27 also suggest that hospital performance is not uniform across all types of high-risk surgery or medical diagnoses. To our knowledge, these findings have not been previously reported for trauma. The observed correlation may result from the fact that, in some cases, different procedures and medical diagnoses share the same hospital structural characteristics (ie, staffing and infrastructure) and processes of care (ie, protocols and critical pathways). However, the lack of perfect correlation also suggests that some hospital characteristics that influence patient outcomes are procedure and disease specific. This factor argues against the sole use of composite outcome measures for surgical procedures and medical diagnoses in which condition-specific patient volumes are sufficient to provide meaningful estimates of hospital performance.
The primary limitation of this study is that it does not explore the association between providing hospitals with injury-specific information and improved outcomes. It is the primary aim of the SMARTT study to determine whether nonpublic reporting of trauma outcomes is associated with improved trauma population outcomes. However, our study design does not allow us to investigate whether providing hospitals with a single composite outcome measure has the same impact on outcomes as the use of a more comprehensive report that includes injury-specific reports. Instead, the goal of this study was limited to examination of whether injury-specific reports contained information not available in a single mortality measure based on all trauma patients. Another limitation of this analysis is that the observed correlation between the global outcome measure and the quality measure based on a specific injury mechanism will necessarily increase as the relative size of that subgroup increases. This bias could have been easily corrected by measurement of the extent of agreement between report measures based on specific injury mechanisms. However, such an analysis would not have allowed us to examine the correlation between a global measure of performance and performance based on subpopulations of injured patients.
Injury-specific outcomes reports provide information that is not available in a single outcomes measure based on all trauma patients. The added expense of production of injury-specific reports is small compared with the cost of data collection and is justified by the additional opportunities for quality improvement that result from providing hospitals with more specific information on their outcomes.
Correspondence: Laurent G. Glance, MD, Department of Anesthesiology, University of Rochester School of Medicine, 601 Elmwood Ave, Box 604, Rochester, NY 14642 (email@example.com).
Accepted for Publication: November 20, 2008.
Author Contributions: Study concept and design: Glance, Mukamel, Meredith, and Dick. Acquisition of data: Glance and Dick. Analysis and interpretation of data: Glance, Mukamel, and Dick. Drafting of the manuscript: Glance. Critical revision of the manuscript for important intellectual content: Glance, Mukamel, Meredith, and Dick. Statistical analysis: Glance, Mukamel, and Dick. Obtained funding: Glance and Dick. Administrative, technical, and material support: Glance and Meredith. Study supervision: Glance and Meredith.
Financial Disclosure: None reported.
Funding/Support: This project was supported by grant RO1 HS 16737 from the Agency for Healthcare Research and Quality.
Disclaimer: The views presented in this article are those of the authors and may not reflect those of the Agency for Healthcare Research and Quality or of the American College of Surgeons Committee on Trauma.
Additional Contributions: Melanie L. Neal, MS, at the NTDB provided invaluable assistance on this project.