Adjusted odds ratio (OR) of survival with 95% confidence intervals for each trauma center (TC), compared with the best-performing center (TC 142, indicated by baseline numeral 1). Trauma centers are ordered by rank.
Observed vs Expected survival ratios with 95% confidence intervals. Trauma centers are ordered by rank.
Shafi S, Stewart RM, Nathens AB, Friese RS, Frankel H, Gentilello LM. Significant Variations in Mortality Occur at Similarly Designated Trauma Centers. Arch Surg. 2009;144(1):64-68. doi:10.1001/archsurg.2008.509
Mortality rates vary across designated trauma centers (TC), even after controlling for injury severity.
Retrospective analysis of state trauma registry data.
Designated Level 1 and 2 TCs in 2003 in a large Southwestern state.
Adult trauma patients (n = 18 584) treated at 15 designated Level 1 and 2 TCs.
Main Outcome Measures
Risk-adjusted survival was calculated for each trauma center using logistic regression analysis to adjust for differences in age, sex, race, injury mechanism, and injury severity. The model was developed using half of the study population and validated in the remaining half. It was then applied to the entire study population, with inclusion of TC identification codes. Observed vs Expected survival ratios were then calculated for each TC. Adjusted odds ratios (OR) for survival at each TC were also calculated.
Adjusted OR of survival were significantly different from crude OR at 6 of the 14 TCs, underscoring the importance of risk adjustment when performing quality comparisons. One TC performed significantly worse than the others, 8 achieved significantly better survival, and 5 performed the same as the referent. Observed vs Expected ratios demonstrated that one trauma center had significantly worse severity-adjusted outcomes, some were marginal, some performed as well as expected, and none performed better than expectations.
Considerable variations in risk-adjusted mortality rates exist across similarly designated TCs. Such variability in outcomes may reflect variations in quality of care, and reasons for this discrepancy should be explored as the next step in the trauma care quality improvement process.
Institutional variations in care and outcomes are well documented after operations such as coronary artery bypass graft, carotid endarterectomy, and oncologic procedures.1-4 These variations in outcome have provided the impetus for public reporting of outcomes and protocol-driven care. The leading project in surgical specialties is the National Surgical Quality Improvement Program (NSQIP), which was initiated by the Department of Veterans Affairs in 1994.5-7 In NSQIP, clinical and patient outcome data are collected in a standardized format to compare observed mortality rates at each participating hospital with the risk-adjusted mortality rate predicted by its case mix. This information is used to identify areas of improvement at centers that do not perform as well as others. Such data are important for driving quality improvement. More recently, the Centers for Medicare & Medicaid Services has started the process of pay for performance, in which a certain portion of Medicare payments for physicians and hospitals depends on achieving specific quality benchmarks.8
Trauma centers (TCs) have historically played a leading role in quality improvement in surgical care by establishing a system of designating hospitals based on resources available to care for the injured patient.9 Mortality rates and functional outcomes are improved in patients treated at designated trauma centers compared with undesignated hospitals.10-13 This improvement is generally attributed to the availability of equipment, personnel, and processes necessary to adequately care for injured patients. The uniformity of the designation process implies that trauma centers with similar designation status provide similar quality of care, thus ensuring similar outcomes. However, this premise has not been validated. We previously demonstrated that the mortality rates of similarly injured patients treated in a nationwide sample of Level 1 trauma centers varied from center to center.14 However, in that study we were only able to control for injury severity using the Injury Severity Score (ISS) and systolic blood pressure (SBP) on presentation to the trauma center. In addition, owing to nationwide sampling, differences in patient populations and trauma systems such as access to trauma centers were not accounted for. We undertook the current study to measure mortality rates at designated Level 1 and 2 trauma centers in a more homogeneous population in a single state. Our hypothesis was that mortality rates vary across trauma centers with similar designation status, even after controlling for injury severity and other patient characteristics.
The state of Texas designates hospitals as trauma centers using criteria defined by the American College of Surgeons. A retrospective analysis of data from a statewide trauma registry for the year 2003 was undertaken (n = 73 439, obtained from the Texas Emergency Medical Services/Trauma Registry, Department of State Health Services, www.dshs.state.tx.us/injury). Inclusion criteria consisted of adult patients aged 15 to 99 years treated at designated Level 1 and 2 trauma centers with complete information on survival, age, sex, race, mechanism of injury, ISS, SBP, and Glasgow Coma Scale (GCS) score in the emergency department. These criteria identified 20 794 patients. Trauma centers that admitted fewer than 100 patients were excluded. The final study population consisted of 18 584 patients, of whom 10 620 came from 7 Level 1 centers and 7964 from 8 Level 2 centers. These patients constituted more than 70% of patients submitted to the state trauma registry by the centers included in the analysis. All deaths were included, including those of patients who may have been dead on arrival, as there is no universally agreed-on definition. In addition, a death occurring soon after the arrival of a patient may be related to the quality of care provided, not necessarily to high injury severity.
The outcome of interest was risk-adjusted survival until hospital discharge. Multivariate logistic regression analysis was done to adjust for differences in patient characteristics between trauma centers. Covariates in the regression models were age, sex, race (white vs not white), mechanism of injury (blunt, penetrating, or other), and injury severity. Age was used as a binary variable (≥65 vs <65 years). Three parameters of injury severity were used: (1) ISS as an anatomic measure of overall injury severity, (2) SBP measured in the emergency department as a physiologic measure of injury severity, and (3) GCS score measured in the emergency department as a measure of severity of head injuries. All 3 parameters were used as binary variables to compare severely injured patients at high risk of dying with less severely injured low-risk patients (ISS ≥ 25 vs ISS < 25; SBP ≤ 90 mm Hg vs SBP > 90 mm Hg; GCS score ≤8 vs GCS score >8). The cut-off values for creating these binary variables were chosen to identify severely injured patients who were at high risk of dying and were partly based on our previous analysis.14 We also attempted to use age, ISS, SBP, and GCS score as continuous variables in the risk-adjustment model, but this did not add to the model. Two-way interaction terms between age, ISS, SBP, and GCS score were included. The transfer status of patients was included in the multivariate regression model, but was found to be a statistically insignificant predictor of survival and was subsequently removed. All predictors and interaction terms were entered into the model using the backward elimination technique.
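The dichotomization and two-way interaction terms described above can be illustrated with a short sketch. This is not the authors' code (the analysis was run in SPSS); the names `Patient`, `encode_risk_flags`, and `interaction_terms` are hypothetical, and coding each flag as 1 for the high-risk group is an assumption made here for readability.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Patient:
    """Hypothetical record holding the four dichotomized predictors."""
    age: int   # years
    iss: int   # Injury Severity Score
    sbp: int   # systolic blood pressure in the emergency department, mm Hg
    gcs: int   # Glasgow Coma Scale score in the emergency department


def encode_risk_flags(p: Patient) -> dict:
    """Return 0/1 indicators using the cut-offs stated in the text.

    Assumption: 1 marks the high-risk category for every variable.
    """
    return {
        "age_ge_65": int(p.age >= 65),
        "iss_ge_25": int(p.iss >= 25),
        "sbp_le_90": int(p.sbp <= 90),
        "gcs_le_8": int(p.gcs <= 8),
    }


def interaction_terms(flags: dict) -> dict:
    """Build every two-way product of the binary flags (6 terms for 4 flags)."""
    return {
        f"{a}*{b}": flags[a] * flags[b]
        for a, b in combinations(sorted(flags), 2)
    }


if __name__ == "__main__":
    flags = encode_risk_flags(Patient(age=70, iss=30, sbp=85, gcs=6))
    print(flags)
    print(interaction_terms(flags))
```

In a full analysis these indicators, together with sex, race, and mechanism of injury, would be passed to a logistic regression routine; the sketch stops at the encoding step because the fitted coefficients are not reported in the text.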
Patients were randomly split into 2 equally sized data sets, a training set and a validation set. The training set was used to develop a predictive model. The model developed was a good predictor of survival (pseudo r² = 0.61; Hosmer-Lemeshow statistic P = .88; area under receiver operating characteristic [ROC] curve = 0.95; 95% confidence interval [CI], 0.94-0.96). Next, this model was applied to the data within the validation set. The model performed well in this group of patients as well (pseudo r² = 0.57; Hosmer-Lemeshow statistic P = .34; area under ROC curve = 0.94; 95% CI, 0.93-0.97). Finally, the model was applied to the entire study population with the addition of trauma center identification codes as covariates. The final model performed well in the entire study population (pseudo r² = 0.59; Hosmer-Lemeshow statistic P = .28; area under ROC curve = 0.95; 95% CI, 0.94-0.96).
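As a rough illustration of the split-sample validation described above, the following sketch randomly halves a data set and computes the area under the ROC curve as the probability that a randomly chosen survivor receives a higher predicted survival than a randomly chosen non-survivor. The function names `split_half` and `roc_auc` and the fixed seed are assumptions for the example; the pairwise loop is quadratic and is meant for clarity, not for a registry-sized data set.

```python
import random


def split_half(records, seed=0):
    """Randomly split records into two halves (training, validation).

    The seed is an illustrative choice so the split is reproducible.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def roc_auc(scores, outcomes):
    """Area under the ROC curve via pairwise comparison.

    scores   -- predicted probability of survival for each patient
    outcomes -- 1 if the patient survived, 0 otherwise
    Ties between a survivor and a non-survivor count as half a win.
    """
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one survivor and one non-survivor")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    train, valid = split_half(range(10))
    print(len(train), len(valid))
    print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

A model that discriminates no better than chance gives an area near 0.5, which is the yardstick against which the reported values of 0.94 to 0.95 can be read.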
The risk-adjusted model was used to undertake 2 separate analyses. In the first analysis, Observed vs Expected survival ratios (O:E) with 95% CI were calculated for each center. These were obtained by dividing the mean observed survival rate at each trauma center by the mean survival predicted by the risk-adjustment model. This approach compared each TC with itself. O:E ratios less than 1 indicate that a trauma center's performance was worse than predicted by its patient mix. In the second analysis, the best-performing trauma center (TC 142) was used as the referent. Odds of survival at each center were compared with the referent center with adjusted odds ratios (OR) using the risk-adjustment model described above. This approach compared each TC with the best-performing center. Odds ratios less than 1 indicated that risk-adjusted survival was worse than that of the referent center and vice versa. Statistical software SPSS for Windows was used for all analyses (SPSS Inc, Chicago, Illinois), with P < .05 considered significant.
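The two summary measures can be sketched in a few lines. The O:E function follows the definition given in the text (mean observed survival divided by mean predicted survival). The odds ratio function uses the standard conversion of a logistic regression coefficient and its standard error into an OR with a Wald 95% CI; the text does not state how the intervals were derived, so that part is an assumption, as are the names `oe_ratio` and `odds_ratio_ci`.

```python
import math


def oe_ratio(observed, predicted):
    """Observed vs Expected survival ratio for one trauma center.

    observed  -- 1 if the patient survived to discharge, 0 otherwise
    predicted -- model-predicted probability of survival per patient
    A ratio below 1 means fewer patients survived than the model expected.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    observed_rate = sum(observed) / len(observed)
    expected_rate = sum(predicted) / len(predicted)
    return observed_rate / expected_rate


def odds_ratio_ci(beta, se, z=1.96):
    """Convert a logistic coefficient into (OR, lower, upper).

    Assumption: a Wald interval, exp(beta +/- z * se), with z = 1.96
    for a 95% confidence level.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


if __name__ == "__main__":
    # Illustrative numbers only, not data from the study.
    print(oe_ratio([1, 1, 0, 1], [0.9, 0.8, 0.7, 0.6]))
    print(odds_ratio_ci(beta=-0.5, se=0.2))
```

With the referent center's coefficient fixed at 0 (OR = 1), a center whose upper confidence limit falls below 1 would be read as having significantly worse risk-adjusted survival than the referent.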
There were important differences in patient characteristics between the trauma centers for risk factors such as age, mechanism of injury, ISS, SBP, and GCS score (Table 1). Logistic regression revealed that age, mechanism of injury, ISS, initial SBP and GCS scores measured in the emergency department, and individual trauma centers were independent predictors of mortality (P < .001). Compared with TC 142, which achieved the best risk-adjusted survival, 8 centers demonstrated significantly worse odds of survival and 6 performed as well as TC 142 (Figure 1; Table 2). The O:E ratios demonstrated that 1 trauma center had significantly worse severity-adjusted outcomes, some TCs were marginal, others performed as well as expected, and no TC performed significantly better than expected given its patient mix (Figure 2).
Trauma centers form the backbone of our statewide trauma systems. Hundreds of individual hospitals and state-designating authorities spend considerable resources to ensure that comparable resources are available at all designated trauma centers to care for injured patients. Our results suggest that despite similar resources and designation status, not all trauma centers achieve similar outcomes. This is contrary to the expectation underlying the designation process that it ensures injured patients similar levels of care at similarly designated hospitals. Our results imply that the quality of trauma care is not consistent across similarly designated trauma centers despite the use of the designation process.
There are several important findings in this study that should be noted. We have demonstrated the importance of risk adjustment when comparing outcomes at different centers. The use of crude survival rates to measure the performance of trauma centers is flawed, as survival depends on each patient's individual characteristics, such as age, sex, race, injury severity, and other factors, as much as it depends on the quality of care provided at trauma centers.15,16 In this study, risk adjustment led to a change in the performance estimate of almost half of the trauma centers. It was interesting that the significant predictors of survival identified were similar to those included in the Trauma and Injury Severity Score (TRISS) methodology defined by Boyd and colleagues.17 However, even after adjusting for differences in their patient populations, the trauma centers where patients were treated were independent predictors of outcomes in our study.
We undertook 2 types of comparison. The O:E ratios were used to compare each center's actual performance with the one expected given their patients' unique characteristics, while OR were used to compare trauma centers with each other, adjusted for differences in their patient populations. The O:E ratios identified one trauma center as having significantly worse outcomes than expected based on its patient mix. This same trauma center was also the worst performer when compared with other trauma centers using OR. The O:E ratio did not identify any center that performed significantly better than expected. However, comparison of OR indicated that several centers performed better than others. Comparison of Observed vs Expected mortality is the approach used by NSQIP in characterizing the performance of Veterans Affairs hospitals.6 Our results indicate that this approach is somewhat limited, and a more comprehensive picture may be obtained by comparing each institution with similar institutions elsewhere.
The most important finding of our study was that similarly designated trauma centers do not achieve similar outcomes. We used a statewide complement of Level 1 and Level 2 trauma centers, as the designation requirements for the 2 types of centers are very similar. The designation process is designed to ensure that designated centers have similar personnel, supplies, equipment, and procedures. Our results indicate that despite the presumed availability of similar resources, not all designated trauma centers achieve similar results. These findings suggest a need for incorporating patient outcomes into the trauma center designation process.
There are 2 possible explanations why similarly designated trauma centers do not achieve similar results. First, the designation process may not include all factors that influence patient outcome. In other words, the current criteria used in the designation process may not measure all of the resources needed to achieve best possible survival rates. Alternatively, it is possible that the factors that do influence outcome are not measured by the designation process. Either way, it appears that the designation process could be improved if the factors associated with differences in outcome can be identified and included in the process. Finally, these variations in outcome at designated trauma centers suggest significant variations in quality of care. Our findings suggest that one potential means of crossing this quality chasm is to compare trauma centers that perform better than their peers with ones that do not, as shown in Figure 2, to identify trauma center characteristics and processes that are associated with better patient outcomes. Such characteristics can then be promoted to improve the quality of care at the worse-performing centers. This approach has been used successfully by NSQIP.6,18 It may also be useful to incorporate risk-adjusted performance measurements in the trauma center designation process.
This study has a few shortcomings that should be acknowledged. First are the limitations of a retrospective analysis of a statewide registry, such as inability to determine if the GCS measurement was altered by sedative or paralytic medications or endotracheal intubation. In particular, data on preexisting comorbid conditions, which have been shown to affect the outcomes of surgical and trauma patients, were limited and missing for most patients.16,19 Other potential confounders were prehospital times and interventions, insurance status, duration of trauma center designation, annual trauma center volume, and long-term postdischarge outcomes. We did not use other injury severity measures such as the New Injury Severity Score (NISS) or International Classification of Diseases–based Injury Severity Score (ICISS), as we did not have the raw data to calculate these scores and we wanted to use more commonly used measures of injury severity.20,21 Also, there was wide variation in the number of patients seen at trauma centers included in the study. Hence, our risk-adjustment model may be disproportionately weighted by the experiences of trauma centers with larger patient populations. We excluded trauma centers where fewer than 100 patients met our inclusion criteria. In addition, we identified centers where 1000 or more patients met the study inclusion criteria and labeled them high-volume centers. However, this variable was not statistically significant in our predictive model and was subsequently removed. It is likely that volume was not an important predictor of outcome in this study, as all patients were drawn from designated Level 1 and 2 trauma centers with similar volume criteria for designation. Exclusion of a large number of patients owing to incomplete information is a major shortcoming of this study. We only included patients with complete information on the critical variables of survival, age, GCS score, ISS, and SBP.
One approach to addressing this problem would be to impute missing data. However, we did not feel we had enough information on missing patients' characteristics to be able to undertake imputations in a statistically valid fashion. It is also important to point out that the purpose of this study was not to establish statewide norms for trauma center performance, as that would require obtaining a truly representative sample of trauma patients from around the state.
In conclusion, this study demonstrates that risk-adjusted outcomes of similarly designated trauma centers differ widely despite the presumed availability of similar resources. Prospective focused studies are needed to confirm these variations in quality of care. Once confirmed, reasons for this discrepancy should be explored to improve the quality of care at all trauma centers.
Correspondence: Shahid Shafi, MD, MPH, Department of Surgery, Division of Burn, Trauma, and Surgical Critical Care, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Mail Code 9158, Dallas, TX 75390-9158 (firstname.lastname@example.org).
Accepted for Publication: May 6, 2007.
Author Contributions: Study concept and design: Shafi, Stewart, Friese, Frankel, and Gentilello. Acquisition of data: Shafi. Analysis and interpretation of data: Shafi, Nathens, Friese, Frankel, and Gentilello. Drafting of the manuscript: Shafi and Friese. Critical revision of the manuscript for important intellectual content: Shafi, Stewart, Nathens, Friese, Frankel, and Gentilello. Statistical analysis: Shafi, Nathens, and Friese. Administrative, technical, and material support: Stewart. Study supervision: Frankel and Gentilello.
Financial Disclosure: None reported.
Previous Presentation: Presented at the 114th Annual Meeting of the Western Surgical Association; November 13, 2006; Los Cabos, Mexico.