Missing covariate data do not include participants who were missing Glasgow Coma Scale or pulse oximetry data.
Error bars represent 95% confidence intervals (CIs). Data points are staggered for clarity. If patients with a critical illness score of 6 or higher are collapsed into a single group, the probability of critical illness is 0.76 (95% CI, 0.69-0.82) in the development cohort and 0.78 (95% CI, 0.69-0.85) in the validation cohort.
Seymour CW, Kahn JM, Cooke CR, Watkins TR, Heckbert SR, Rea TD. Prediction of Critical Illness During Out-of-Hospital Emergency Care. JAMA. 2010;304(7):747–754. doi:10.1001/jama.2010.1140
Author Affiliations: Division of Pulmonary and Critical Care Medicine, Harborview Medical Center (Drs Seymour and Watkins), Research Division, Puget Sound Blood Center (Dr Watkins), and Department of Epidemiology (Dr Heckbert) and King County Medic One, Division of General Internal Medicine (Dr Rea), University of Washington, Seattle; Division of Pulmonary, Allergy, and Critical Care, Leonard Davis Institute for Health Economics and Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania School of Medicine, University of Pennsylvania Medical Center, Philadelphia (Dr Kahn); and Division of Pulmonary and Critical Care Medicine and Robert Wood Johnson Clinical Scholar Program, University of Michigan, Ann Arbor (Dr Cooke). Dr Kahn is now with the Clinical Research, Investigation, and Systems Modeling of Acute Illness Laboratory, Department of Critical Care Medicine, University of Pittsburgh School of Medicine, and the Department of Health Policy and Management, University of Pittsburgh Graduate School of Public Health, Pittsburgh, Pennsylvania.
Context Early identification of nontrauma patients in need of critical care services in the emergency setting may improve triage decisions and facilitate regionalization of critical care.
Objectives To determine the out-of-hospital clinical predictors of critical illness and to characterize the performance of a simple score for out-of-hospital prediction of development of critical illness during hospitalization.
Design and Setting Population-based cohort study of an emergency medical services (EMS) system in greater King County, Washington (excluding metropolitan Seattle), that transports to 16 receiving facilities.
Patients Nontrauma, non–cardiac arrest adult patients transported to a hospital by King County EMS from 2002 through 2006. Eligible records with complete data (N = 144 913) were linked to hospital discharge data and randomly split into development (n = 87 266 [60%]) and validation (n = 57 647 [40%]) cohorts.
Main Outcome Measure Development of critical illness, defined as severe sepsis, delivery of mechanical ventilation, or death during hospitalization.
Results Critical illness occurred during hospitalization in 5% of the development (n = 4835) and validation (n = 3121) cohorts. Multivariable predictors of critical illness included older age, lower systolic blood pressure, abnormal respiratory rate, lower Glasgow Coma Scale score, lower pulse oximetry, and nursing home residence during out-of-hospital care (P < .01 for all). When applying a summary critical illness prediction score to the validation cohort (range, 0-8), the area under the receiver operating characteristic curve was 0.77 (95% confidence interval [CI], 0.76-0.78), with satisfactory calibration slope (1.0). Using a score threshold of 4 or higher, sensitivity was 0.22 (95% CI, 0.20-0.23), specificity was 0.98 (95% CI, 0.98-0.98), positive likelihood ratio was 9.8 (95% CI, 8.9-10.6), and negative likelihood ratio was 0.80 (95% CI, 0.79-0.82). A threshold of 1 or greater for critical illness improved sensitivity (0.98; 95% CI, 0.97-0.98) but reduced specificity (0.17; 95% CI, 0.17-0.17).
Conclusions In a population-based cohort, the score on a prediction rule using out-of-hospital factors was significantly associated with the development of critical illness during hospitalization. This score requires external validation in an independent population.
Hospitals vary widely in quality of critical care.1 Consequently, the outcomes of critically ill patients may be improved by concentrating care at more experienced centers.1-3 By centralizing patients who are at greater risk of mortality in referral hospitals, regionalized care in critical illness may achieve improvements in outcome similar to trauma networks.4 In 2006, the Institute of Medicine called for a regionalized, coordinated system of emergency care for high-risk patients,5 one in which patients in most need of high-intensity acute care are distributed to centers with the greatest expertise in caring for the critically ill.
Current out-of-hospital triage of noninjured, critically ill patients uses dispatch criteria,6 subjective emergency medical services (EMS) assessments,7,8 coordination by medical command officers,9 and patient preference.10 In specific conditions such as coronary artery disease and stroke, out-of-hospital care providers use objective tools to triage and risk-stratify prehospital patients for early treatment and choice of destination.11-13 However, these subjective and disease-specific assessments alone may not be sufficient for triage in general populations at risk of critical illness.8,14-16 Future development of regionalized systems of acute care will require objective, routinely measured predictors that are associated with important clinical end points in a heterogeneous population. An objective triage tool may also identify patients for early treatment by out-of-hospital care providers.
We sought to develop a tool for prediction of critical illness during out-of-hospital care in noninjured, non–cardiac arrest patients. Using a population-based cohort of EMS records linked to hospital discharge data, we hypothesized that objective, out-of-hospital factors could discriminate between patients who were and were not likely to develop critical illness during hospitalization.
We conducted a retrospective cohort study among patients who activated EMS during the 5-year period from 2002 through 2006 in King County, Washington, excluding metropolitan Seattle. King County has a heterogeneous population of 1.7 million persons residing in rural, suburban, and urban areas. Residents are served by a 2-tier EMS system accessed by calling 911. First-tier response is provided by emergency medical technician–fire fighters who provide basic life support (BLS) care. The second tier is provided by paramedics who are trained in advanced life support (ALS) and respond to more severely ill patients based on protocols and assessments by both emergency medical dispatchers and BLS responders. Patients encountered by King County EMS may be transported to 1 of 16 hospitals.
We linked EMS records to the Washington State Comprehensive Hospital Abstract Reporting System (CHARS) database from 2002 through 2007. We excluded patients with traumatic injury and cardiac arrest as determined by EMS documentation. Patients with traumatic injury are already triaged under explicit clinical criteria,17 and patients with cardiac arrest have a near certain likelihood of requiring intensive care unit (ICU) admission.18 Among remaining patients, we included EMS encounters that met the following 3 criteria: patient age of at least 18 years; documentation of out-of-hospital vital signs/physical examination; and transport to a receiving facility. The final sample was randomly allocated into development (60%) and validation (40%) cohorts.
We defined critical illness as severe sepsis, delivery of mechanical ventilation, or death at any point during hospitalization. We use this definition of critical illness rather than simple admission to an ICU, which can be influenced by emergency department disposition, ICU bed availability, and local practice variation. We used a clinically validated, administrative definition for severe sepsis based on International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) diagnosis codes for a bacterial or fungal infectious process and the presence of acute organ dysfunction.19 Because of changes in coding for sepsis, we also included ICD-9-CM codes 995.91 (systemic inflammatory response syndrome due to infectious process without acute organ dysfunction), 995.92 (severe sepsis), and 785.52 (shock without mention of trauma; septic shock) in our definition. We used the ICD-9-CM procedure code (96.7x) to define the need for mechanical ventilation.20 We defined hospital death using discharge disposition in CHARS. We abstracted out-of-hospital clinical data from the King County EMS database, including dispatch, demographic, physical examination, procedure, and transport data. We evaluated only initial out-of-hospital vital signs, documented by first-arriving EMS personnel.
In the King County EMS database, each patient has a BLS record, yet additional responders (both BLS and ALS) may create duplicate records for the same patient incident. No unique identifier is present across these duplicate records. To identify the first responder, we used probabilistic matching to link ALS encounters (n = 106 694) and BLS encounters (n = 436 159) that represent the same patient (LinkPlus software, version 2.0, Centers for Disease Control and Prevention Cancer Division, Atlanta, Georgia).21,22 We successfully matched 90 191 ALS records (85%) to corresponding BLS records and removed unmatched ALS records (15%), which are likely to represent redundant ALS responders. Two King County epidemiologists assessed probabilistic match quality through manual review of 200 ALS-BLS pairs, blinded to probabilistic match outcome, of which 192 (96%) were correctly matched. We then applied exclusion criteria to the final matched data set (n = 436 159), then linked eligible encounters (n = 166 908) using direct identifiers (first/last name, age, sex, receiving hospital, transport/admission date) to hospital discharge data using a hierarchical, deterministic matching algorithm.
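The hierarchical, deterministic linkage described above can be illustrated with a minimal sketch. The article names the identifiers used but does not list the tiers of its algorithm, so the key hierarchy, the field names, and the rule of accepting only unique matches are all assumptions made for illustration.

```python
# Hypothetical key hierarchy, strictest tier first. The article does not
# publish its actual tiers; these are illustrative only.
KEY_TIERS = [
    ("last_name", "first_name", "age", "sex", "hospital", "date"),
    ("last_name", "age", "sex", "hospital", "date"),
    ("age", "sex", "hospital", "date"),
]


def build_index(records, fields):
    """Map each key tuple to the list of records that share it."""
    index = {}
    for rec in records:
        key = tuple(rec[f] for f in fields)
        index.setdefault(key, []).append(rec)
    return index


def link(ems_records, hospital_records):
    """Return {ems_id: hospital_id}, using the first tier with a unique match."""
    links = {}
    unmatched = list(ems_records)
    for fields in KEY_TIERS:
        index = build_index(hospital_records, fields)
        still_unmatched = []
        for rec in unmatched:
            candidates = index.get(tuple(rec[f] for f in fields), [])
            if len(candidates) == 1:  # assumption: ambiguous matches are not linked
                links[rec["id"]] = candidates[0]["id"]
            else:
                still_unmatched.append(rec)
        unmatched = still_unmatched  # only unlinked records fall to the next tier
    return links
```

In this sketch a record that fails the strictest tier (for example, because of a first-name variant) can still be linked on a looser tier, which is the general idea of a hierarchical approach. A production algorithm would also need to prevent one hospital record from being linked to several encounters.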
We developed a multivariable model for critical illness during hospitalization in 4 steps: (1) assessment of candidate variable quality and categorization of continuous predictors; (2) construction of a parsimonious model; (3) development of a point score; and (4) internal validation in a separate cohort of patients.23,24 When choosing candidate variables for the model, we considered clinical relevance, generalizability (inclusion in the National EMS Information System database),25 and timing of the exposure during out-of-hospital care, in that order. Continuous candidate predictors were categorized using a priori–determined cut points based on clinical relevance and natural distributions. We considered age (<45, 45-64, or ≥65 years), sex, initial systolic blood pressure (≤90, 91-140, 141-180, or >180 mm Hg), initial heart rate (≤60, 61-99, 100-119, or ≥120/min), initial respiratory rate (<12, 12-23, 24-35, or ≥36/min), initial Glasgow Coma Scale score (15, 12-14, 8-11, or <8), initial pulse oximetry (93%-100%, 88%-92%, 80%-87%, or <80%), and out-of-hospital location (nursing home, adult medical facility, home, public building, or street/highway). For simplicity, we did not consider multiplicative or additive interactions as candidate predictors or out-of-hospital procedures that may occur after initial EMS assessments. In our primary analysis, we used single imputation with normal value substitution for variables presumed to be clinically normal if not measured (eg, Glasgow Coma Scale, pulse oximetry), a method previously used for critically ill patients.26 We constructed a multivariable logistic regression model from candidate predictors and used backward selection with the Akaike information criterion to populate the model.27 This measure of model fit penalizes models with a large number of variables and attempts to reduce overfitting.
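As an illustration of the categorization step, the sketch below maps raw vital signs to the cut points listed above and applies normal value substitution when a measurement is missing. The function names are hypothetical, and substituting a Glasgow Coma Scale score of 15 and the 93%-100% pulse oximetry band for missing values is an assumption about what "normal" means here, since the article does not state the substituted values.

```python
from bisect import bisect_left, bisect_right
from typing import Optional


def sbp_category(sbp: float) -> str:
    """Systolic blood pressure bands: <=90, 91-140, 141-180, >180 mm Hg."""
    labels = ["<=90", "91-140", "141-180", ">180"]
    # bisect_left keeps the upper bound inside the lower band (90 -> "<=90").
    return labels[bisect_left([90, 140, 180], sbp)]


def resp_rate_category(rr: float) -> str:
    """Respiratory rate bands: <12, 12-23, 24-35, >=36 per minute."""
    labels = ["<12", "12-23", "24-35", ">=36"]
    # bisect_right puts each lower bound in the higher band (12 -> "12-23").
    return labels[bisect_right([12, 24, 36], rr)]


def gcs_category(gcs: Optional[int]) -> str:
    """Glasgow Coma Scale bands: 15, 12-14, 8-11, <8."""
    if gcs is None:
        gcs = 15  # assumption: a missing score is treated as normal
    if gcs == 15:
        return "15"
    if gcs >= 12:
        return "12-14"
    if gcs >= 8:
        return "8-11"
    return "<8"


def spo2_category(spo2: Optional[float]) -> str:
    """Pulse oximetry bands: 93-100, 88-92, 80-87, <80 percent."""
    if spo2 is None:
        return "93-100"  # assumption: a missing reading is treated as normal
    if spo2 >= 93:
        return "93-100"
    if spo2 >= 88:
        return "88-92"
    if spo2 >= 80:
        return "80-87"
    return "<80"
```

Treating a missing value as normal places the patient in the reference category, which is why the multiple-imputation sensitivity analysis is a useful check on this choice.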
We assigned a point score to each covariate by rounding regression coefficients in the final model to the nearest integer.28 We generated predicted probabilities of critical illness for each value of the point score using logistic regression with the Huber-White estimator to generate standard errors for regression coefficients. In the validation cohort, we determined overall performance using the Brier score29 and the McKelvey and Zavoina R² statistic.30 We assessed discrimination using the area under the receiver operating characteristic curve for the composite outcome, as well as each contributing component (death, mechanical ventilation, and severe sepsis).31 We assessed calibration using the Hosmer-Lemeshow statistic with P < .10 indicating that fit was inadequate.32 Since small, clinically insignificant differences in predicted and observed outcomes result in a significant Hosmer-Lemeshow statistic in large samples,33 we also calculated the calibration slope.34 Further details of model performance measures are available in the eAppendix. We calculated the sensitivity, specificity, and positive and negative likelihood ratios for each point score as a threshold and grouped patients into 3 categories by risk of critical illness: low (<10%), intermediate (10%-20%), and high (>20%).
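The point-score construction and the Brier score can be sketched as follows. The coefficients are hypothetical placeholders rather than the published estimates (which appear in Table 2 and eTable 2), and the variable names are invented for illustration. Rounding half up is also an assumption, because the article says only "nearest integer".

```python
import math

# Hypothetical log odds ratios for illustration only; NOT the published estimates.
EXAMPLE_COEFFICIENTS = {
    "age_65_or_older": 1.1,
    "sbp_90_or_lower": 0.9,
    "resp_rate_36_or_higher": 2.2,
    "male_sex": 0.2,
}


def round_half_up(x: float) -> int:
    """Round to the nearest integer, with ties going up (assumed convention)."""
    return math.floor(x + 0.5)


def to_points(coefficients: dict) -> dict:
    """Convert each regression coefficient to an integer point value."""
    return {name: round_half_up(beta) for name, beta in coefficients.items()}


def total_score(points: dict, present: list) -> int:
    """Sum the points for the risk factors present in one patient."""
    return sum(points[name] for name in present)


def brier_score(predicted: list, observed: list) -> float:
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(predicted, observed)) / len(predicted)


POINTS = to_points(EXAMPLE_COEFFICIENTS)
```

With these placeholder values, a coefficient of 0.2 rounds to 0 points, so that variable drops out of the score even though it remains in the regression model. A lower Brier score indicates better overall performance, with 0 being a perfect prediction.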
We performed several sensitivity analyses to assess the robustness of our findings. To determine if our handling of missing data introduced bias,35 we assumed data were missing at random and performed multiple imputation for all missing values using a regression switching approach (multiple imputation by chained equations).36 We then repeated both model development and validation steps on imputed data. To determine the performance of a more flexible but complex model, we reanalyzed model fit after rounding regression coefficients to the nearest half integer. Finally, we repeated our analysis using a different definition of critical illness that included diagnostic and procedure codes for cardiac and respiratory arrest, hypotension, shock, cardiopulmonary resuscitation, and acute respiratory failure.37 Additional details of sensitivity analyses are provided in the eAppendix. All analyses were performed with Stata software, version 10.0 (Stata Corp, College Station, Texas). All tests of significance used a 2-sided P ≤ .05. This study was approved by the institutional review boards for the Washington State Department of Health, King County Emergency Medical Services, and University of Washington.
Among the 436 159 unique prehospital patients (Figure 1), the primary reasons for exclusion were traumatic injury (26%), no transport to a hospital (14%), and absence of physical examination by EMS responders (15%). We evaluated 166 908 eligible patients for data quality of candidate predictors (eTable 1) and split complete cases (n = 144 913) into development (n = 87 266) and validation (n = 57 647) cohorts. Critical illness during hospitalization occurred in approximately 5% of both the development (n = 4835) and validation (n = 3121) cohorts. Patients experiencing critical illness were older and more likely to be transported from nursing homes, receive ALS care, and present with out-of-hospital respiratory symptoms (Table 1). In general, patients with critical illness during hospitalization presented with greater alterations in initial out-of-hospital vital signs. Among patients with and without critical illness, EMS response and transport intervals were similar, while total time from patient side to departure was greater for both BLS and ALS personnel responding to patients who developed critical illness.
In the development cohort, we identified 8 objective out-of-hospital predictors of critical illness in our final model (Table 2). We used point values generated from the rounded regression coefficients to develop a score. The coefficients for sex and nursing home residence rounded to 0 and did not contribute to our final score (eTable 2). We entered the point total for each patient in a logistic regression model to generate the individual predicted probability of critical illness. For scores ranging from 0 to 8, the mean predicted probabilities of critical illness were 1.2% (95% confidence interval [CI], 1.0%-1.4%), 2.9% (95% CI, 2.7%-3.0%), 6.7% (95% CI, 6.4%-7.1%), 15% (14%-16%), 30% (29%-33%), 52% (47%-56%), 72% (64%-79%), 88% (69%-98%), and 100% (64%-100%), respectively.
In our independent validation sample, the critical illness score demonstrated satisfactory discrimination (area under the receiver operating characteristic curve, 0.77; 95% CI, 0.76-0.78; Brier score, 0.04). The critical illness score had similar performance for each component end point of our primary outcome (hospital mortality, 0.78 [95% CI, 0.77-0.79]; severe sepsis, 0.76 [95% CI, 0.75-0.77]; delivery of mechanical ventilation during hospitalization, 0.81 [95% CI, 0.80-0.82]). Calibration of the model was acceptable at low and intermediate score values but decreased at higher score values (Figure 2). The Hosmer-Lemeshow goodness-of-fit test demonstrated statistical evidence of inadequate fit (χ²₇ = 47; P < .001), but the calibration slope (1.0) suggested little overfitting. We calculated sensitivity, specificity, and diagnostic likelihood ratios for each critical illness score grouped by low-, intermediate-, and high-risk categories (Table 3). Using a cut point of 4 or higher to identify patients who develop critical illness (>20% expected risk), we observed a sensitivity of 0.22 (95% CI, 0.20-0.23), a specificity of 0.98 (95% CI, 0.98-0.98), a positive likelihood ratio of 9.8 (95% CI, 8.9-10.6), and a negative likelihood ratio of 0.80 (95% CI, 0.79-0.82). If this cut point were used to triage patients to a regional referral center for critical care services, we would transport 1887 patients (3.2%) to the regional center, of whom 1211 (64%) would be mistriaged (ie, brought to the regional center but not subsequently develop critical illness). Of 55 760 patients (97%) brought to nonreferral centers, 2445 (4.4%) would be mistriaged (ie, brought to a nonreferral center but subsequently develop critical illness). Using a threshold of 1 or greater to identify critical illness (>2% expected risk) would increase sensitivity to 0.98 (95% CI, 0.97-0.98) but decrease specificity to 0.17 (95% CI, 0.17-0.17). Triage using this cut point would result in the transport of 48 286 patients (84% of the total) to the regional center, of whom 45 231 (94%) would be mistriaged. Of 9361 patients (16%) brought to nonreferral centers, 66 (1%) would be mistriaged.
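The triage counts reported above are enough to recover the sensitivity, specificity, and likelihood ratios at each threshold. The short check below is a sketch with a hypothetical function name and argument names; it reads patients sent to the regional center as test positives and those sent to nonreferral centers as test negatives, and uses only the counts given in the text.

```python
def triage_metrics(n_referred, n_referred_no_illness,
                   n_not_referred, n_not_referred_illness):
    """Derive diagnostic accuracy measures from the reported triage counts."""
    tp = n_referred - n_referred_no_illness        # referred, critical illness
    fp = n_referred_no_illness                     # referred, no critical illness
    fn = n_not_referred_illness                    # not referred, critical illness
    tn = n_not_referred - n_not_referred_illness   # not referred, no critical illness
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
    }


# Threshold of 4 or higher (validation cohort counts from the text).
high_threshold = triage_metrics(1887, 1211, 55760, 2445)

# Threshold of 1 or higher (validation cohort counts from the text).
low_threshold = triage_metrics(48286, 45231, 9361, 66)
```

After rounding, the first call reproduces the reported sensitivity of 0.22, specificity of 0.98, positive likelihood ratio of 9.8, and negative likelihood ratio of 0.80. The second reproduces the sensitivity of 0.98 and specificity of 0.17. In both cases the true and false negatives sum to the 3121 validation-cohort patients with critical illness.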
In our sensitivity analysis, multivariable logistic regression estimates derived after multiple imputation were similar to our primary analysis in the development cohort (eTable 3). Only a point score between 12 and 14 on the Glasgow Coma Scale decreased from 1 to 0 after rounding. Performance of the imputed model in imputed validation data sets was also similar (eTable 4). We observed comparable results when we derived a point score by assigning half-integers to regression coefficients and used an alternative definition of critical illness (eTable 4).37
We developed and internally validated an out-of-hospital model that predicts critical illness during hospitalization in a heterogeneous medical population. The critical illness score incorporates a small number of objective out-of-hospital variables, such as systolic blood pressure, heart rate, respiratory rate, Glasgow Coma Scale score, and pulse oximetry. We demonstrate the role that simple physiologic assessment can play in risk stratification in the prehospital period among noninjured patients. The model provides an important foundation for future efforts to identify patients at greatest risk of critical illness using information from the out-of-hospital phase of emergency care.
We developed the current model in direct response to recent calls for centralized, coordinated care in patients with acute illness.5,38 A system that regionalizes patients with critical illness to centers with greater hospital admission volume has the potential to significantly improve outcomes but is generally not feasible at this time.39 One major challenge to an optimal regionalized system is identifying which patients will benefit from admission to a regional referral center.4,40 This challenge is particularly salient in the out-of-hospital setting, where variable utilization of EMS,41 heterogeneous reasons for dispatch,42 incomplete information,43 and subjective assessments8 may limit accurate identification of critical illness. Disease-specific tools such as the prehospital 12-lead electrocardiogram44 and Los Angeles or Cincinnati stroke scales7,45 or trauma triage guidelines46 facilitate complex management decisions. Yet they are limited to their disease context and may lead to multiple, overlapping systems with less certain benefits to overall system efficiency. Objective tools that apply to a heterogeneous, noninjured population may improve discrimination of critical illness and potentially better match early treatment with patient needs.
The current model demonstrated good discriminative capacity. Yet the model tended to overidentify critical illness among those judged at high risk and underidentify critical illness among those judged at low risk. These errors in calibration have important implications for the development of regionalized systems of emergency care. Concerns about overwhelming referral centers are a major barrier to regionalization.47 Should the current tool be applied, a significant number of patients would be unnecessarily brought to regional referral centers. Likewise, many patients who subsequently develop critical illness would be brought to hospitals with fewer resources to manage critical illness, reducing the clinical effect of a regionalized system. More accurate models are needed before regionalization can be practically implemented. Although the current investigation provides a necessary framework for prehospital triage, future research should be directed to develop accurate, feasible predictive models, perhaps incorporating biomarker measurement (eg, prehospital lactate level).48 Because complex prediction models are difficult to implement during emergency care,49 information technology solutions may be needed to integrate real-time physiologic data, biomarkers, and clinical decision making. Additionally, novel collection methods may be required to assess some variables in our model.
The current model requires prospective, external validation. With future linking of national EMS databases to hospital outcomes,25 we recommend validation in temporally and geographically distinct populations using contemporary data. Since we used objective measures that are relatively easy to assess and included first vital sign measurements, we believe the model will perform well in other EMS system structures (eg, 1-tier systems). Model variables would also be unaffected by the presence of physicians during out-of-hospital care50 or the longer transport times found in rural settings.51 In addition to prospective validation, randomized trials are needed to evaluate the clinical and economic effect of the triage tool in a real-world setting. Even triage tools that identify critical illness with great accuracy might have unintended, adverse consequences when used in a regionalized system. Large, population-based trials that capture a broad range of outcomes for patients triaged to both regional referral and community hospitals are necessary.
We recognize several limitations of our study. We defined critical illness using a composite outcome derived from administrative diagnosis and procedure codes at any time during hospitalization. Although our definition encompasses a broad range of severely ill patients at high risk of death, it may misclassify some patients who either do or do not truly require critical care. For example, some low-risk sepsis patients may be included in our definition while some high-risk patients who do not meet our definition of severe sepsis may not be included. Missing data for Glasgow Coma Scale and pulse oximetry were common, either due to absent documentation or lack of assessment. We used single imputation with normal value substitution, a method commonly used in ICU severity scores.26 Because this method may introduce bias, we performed a sensitivity analysis using multiple imputation that showed minimal change in our model or performance in the validation cohort. We also observed statistically significant evidence of inadequate fit using the Hosmer-Lemeshow statistic, and differences between observed and expected probabilities of critical illness at greater score values.32 Yet the calibration slope analysis confirmed that overfitting was unlikely, and our plot of observed vs expected probabilities demonstrated adequate calibration for low and intermediate scores. We did not evaluate important predictors of critical illness, such as race/ethnicity52 and individual-level socioeconomic status,53 because these were not documented in our data source and could not be objectively assessed at the scene. Finally, we categorized continuous covariates to facilitate usability in future studies, a step that adds a conservative bias to our performance estimates. All predictive models must trade off accuracy for simplicity, and in this case, we favored a simple model for more practical application in the field.
In summary, we developed a prediction rule using best available out-of-hospital data, and the score on the prediction rule was significantly associated with development of critical illness during hospitalization in noninjured patients. Although improved accuracy and external validation are required, this model provides a foundation for future efforts to identify noninjured patients who may benefit from coordinated systems that regionalize emergency care.
Corresponding Author: Christopher W. Seymour, MD, MSc, Division of Pulmonary and Critical Care Medicine, University of Washington, Harborview Medical Center, PO Box 359762, Seattle, WA 98104.
Author Contributions: Dr Seymour had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Seymour, Kahn, Cooke, Watkins, Rea.
Acquisition of data: Seymour.
Analysis and interpretation of data: Seymour, Cooke, Watkins, Heckbert, Rea.
Drafting of the manuscript: Seymour.
Critical revision of the manuscript for important intellectual content: Kahn, Cooke, Watkins, Heckbert, Rea.
Statistical analysis: Seymour, Kahn, Cooke, Watkins.
Obtained funding: Seymour.
Study supervision: Kahn, Heckbert, Rea.
Financial Disclosures: None reported.
Funding/Support: Dr Seymour is supported in part by extramural training grant T32 NIH/HL07287 from the National Institutes of Health (NIH) and by National Center for Research Resources grant KL2 RR025015 from the NIH. Dr Kahn is supported by a career development award K23 HL082650 from the NIH. Dr Cooke is supported by the Robert Wood Johnson Foundation Clinical Scholars Program. Dr Watkins is supported by career development award K23 GM086729-01 from the NIH. Dr Heckbert is supported by grant 5UL1RR0205014 from the NIH. Dr Rea is supported by grants R01 HL088576-01A1 and R01 HL074098-01A1 from the NIH.
Role of the Sponsors: The funding agencies had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; or preparation, review, or approval of the manuscript.
Additional Contributions: We thank Bill O’Brien, Department of Epidemiology, University of Washington, for his assistance in data programming and record linkage. He did not receive compensation beyond his normal salary.