Very light blue boxes denote criteria for splitting patients in groups of different risk. Blue indicates predicted risk less than 1 per 1000; white, predicted risk 1 per 1000 or greater to less than 3 per 100; and light yellow, predicted risk 3 per 100 or greater.
eTable 1. Predictors Included in the Development of the Optimal Classification Trees
eTable 2. Patients With ciTBI Who Were Erroneously Classified as Very Low Risk With the PECARN Rules or the Optimal Classification Trees
Bertsimas D, Dunn J, Steele DW, Trikalinos TA, Wang Y. Comparison of Machine Learning Optimal Classification Trees With the Pediatric Emergency Care Applied Research Network Head Trauma Decision Rules. JAMA Pediatr. 2019;173(7):648–656. doi:10.1001/jamapediatrics.2019.1068
Can machine learning improve the Pediatric Emergency Care Applied Research Network (PECARN) rules’ predictive accuracy to identify children at very low, intermediate, and high risk of clinically important traumatic brain injury?
In this cohort study of 42 412 children with head trauma, reanalysis of data from the PECARN group empirically suggests that novel machine-learning (optimal classification tree)–based rules perform as well as or better than the PECARN rules in identifying more children at very low risk of clinically important traumatic brain injury without missing more patients with clinically important traumatic brain injury.
If implemented in the electronic health record, the new rules may help reduce the number of unnecessary computed tomographic imaging scans, without missing more patients with clinically important traumatic brain injury than the PECARN rules.
Computed tomographic (CT) scanning is the standard for the rapid diagnosis of intracranial injury, but it is costly and exposes patients to ionizing radiation. The Pediatric Emergency Care Applied Research Network (PECARN) rules for identifying children with minor head trauma who are at very low risk of clinically important traumatic brain injury (ciTBI) are widely used to triage CT imaging.
To examine whether optimal classification trees (OCTs), which are novel machine-learning classifiers, improve on PECARN rules’ predictive accuracy.
Design, Setting, and Participants
A secondary analysis of prospective, publicly available data on emergency department visits for head trauma used by the PECARN group to develop their tool was conducted to derive OCT-based prediction rules for ciTBI in a development cohort and compare their predictive performance vs the PECARN rules in a validation cohort among children who were younger than 2 years and 2 years or older. Data on 42 412 children with head trauma and without severely altered mental status who were examined between June 1, 2004, and September 30, 2006, were gathered from 25 emergency departments in North America participating in PECARN. Data analysis was conducted from September 15, 2016, to December 18, 2018.
Main Outcomes and Measures
The outcome was ciTBI, with predictive performance measured by estimating the sensitivity, specificity, positive predictive value, negative predictive value, positive likelihood ratio, and negative likelihood ratio for the OCT and the PECARN rules. The OCT and PECARN rules’ performance was compared by estimating ratios for each measure.
Of the 42 412 children (15 996 [37.7%] girls) included in the analysis, 10 718 were younger than 2 years (25.3%; mean [SD] age, 11.6 [0.6] months) and 31 694 were 2 years or older (74.7%; age, 9.1 [4.9] years). Compared with PECARN rules, OCTs misclassified 0 vs 1 child with ciTBI in the younger and 10 vs 9 children with ciTBI in the older cohort, and correctly identified more children with very low risk of ciTBI in the younger (7605 vs 5701) and older (20 594 vs 18 134) cohorts. In the validation cohorts, compared with the PECARN rules, the OCTs had statistically significantly better specificity (in the younger cohort: 69.3%; 95% CI, 67.4%-71.2% vs 52.8%; 95% CI, 50.8%-54.9%; in the older cohort: 65.6%; 95% CI, 64.5%-66.8% vs 57.6%; 95% CI, 56.4%-58.8%), positive predictive value (odds ratios, 1.54; 95% CI, 1.36-1.74 and 1.23; 95% CI, 1.17-1.30, in younger and older children, respectively), and positive likelihood ratio (risk ratios, 1.54; 95% CI, 1.36-1.74 and 1.23; 95% CI, 1.17-1.30, in younger and older children, respectively). There were no statistically significant differences in the sensitivity, negative predictive value, and negative likelihood ratio between the 2 sets of rules.
Conclusions and Relevance
If implemented, OCTs may help reduce the number of unnecessary CT scans, without missing more patients with ciTBI than the PECARN rules.
Between 2006 and 2010, an estimated 750 000 emergency department visits pertained to pediatric head trauma.1,2 Because most head trauma in children appears to be minor,2,3 it is challenging to identify clinically important intracranial injuries that necessitate immediate intervention or close observation. Currently, computed tomography (CT) is the standard for the rapid diagnosis of intracranial injury,4 but it is costly, may require sedation, and exposes patients to ionizing radiation that may increase the risk of cancer later in life.5-7 However, patients without substantially altered mental status, for example, with Glasgow Coma Scale (GCS) scores of 14 or 15, rarely have experienced clinically important traumatic brain injury (ciTBI) or have evidence of intracranial injury with CT imaging.8 Avoiding needless CT scans in such patients is desirable. To this end, the Pediatric Emergency Care Applied Research Network (PECARN) has developed and validated rules for identifying which children with head trauma but without substantially altered mental status are at very low risk of ciTBI and should not receive head CT imaging.8 The PECARN rules are widely used9,10 and have been independently validated.11,12
The PECARN rules are easy to memorize and apply, but their simplicity may come at a price in terms of maximum attainable predictive accuracy. Having easy-to-memorize predictive tools is desirable but hardly necessary, given the wide availability of health information technologies that make clinical decision support tools available in emergency departments in developed countries. We capitalized on cutting-edge developments in machine learning and mathematical optimization13,14 to develop and validate tools that can be implemented in electronic health record systems and perform at least as well as or better than the PECARN rules in identifying children at very low risk of ciTBI.
We analyzed a publicly available data set of a prospective cohort of 42 412 children with head trauma and without severely altered mental status who were examined between June 1, 2004, and September 30, 2006, in emergency departments in North America participating in PECARN.8 Data analysis was conducted from September 15, 2016, to December 18, 2018. The mean (SD) age was 7.1 (5.5) years (<2 years: 11.6 [0.6] months, ≥2 years: 9.1 [4.9] years), ranging from 0 to 18 years. A total of 6263 children (14.8%) sustained injuries with severe mechanisms, 37 961 children (89.5%) had isolated head trauma, and 41 071 children (96.8%) had a GCS score of 15 (unaltered mental status). This reanalysis of anonymized data was deemed exempt from review by the Brown University and Massachusetts Institute of Technology institutional review boards.
The PECARN group used this data set to develop and validate a tool to identify children at very low, low, and higher risk of ciTBI. Eligible patients presented to the emergency departments within 24 hours of a head trauma. Patients who underwent imaging before admission, had trivial injury mechanisms, or had conditions complicating assessment (eg, known brain tumors) were excluded. Patients with GCS scores of 13 or less, ventricular shunts, or bleeding disorders were also excluded. The data set is described by Kuppermann et al.8
We followed the original analysis and stratified the data set into 10 718 younger (25.3%) (<2 years and predominantly nonverbal) and 31 694 older (74.7%) (≥2 years and predominantly verbal) patient strata because the evaluation of preverbal and verbal children is fundamentally different.8 Because the data set was anonymized, we could not use the same development and validation cohorts as in the original analysis. Therefore, we randomly split patients into classifier development (younger, n = 8502; older, n = 25 283) and validation (younger, n = 2216 and older, n = 6411) cohorts. The outcome of interest was ciTBI, defined a priori as death from TBI, neurosurgery, intubation for more than 24 hours, or hospital admission for at least 2 nights in patients with TBI-related CT scan findings.
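The random split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_cohort`, the use of integer patient identifiers, and the seed are assumptions; only the cohort sizes come from the text.

```python
import random


def split_cohort(patient_ids, n_development, seed=0):
    """Randomly assign patients to development and validation cohorts.

    The seed is an arbitrary choice made here for reproducibility.
    """
    ids = list(patient_ids)
    if not 0 <= n_development <= len(ids):
        raise ValueError("n_development must be between 0 and the cohort size")
    rng = random.Random(seed)
    rng.shuffle(ids)
    return ids[:n_development], ids[n_development:]


# Stratum sizes reported in the text; the identifiers are placeholders.
younger_dev, younger_val = split_cohort(range(10718), n_development=8502)
older_dev, older_val = split_cohort(range(31694), n_development=25283)
print(len(younger_dev), len(younger_val))
print(len(older_dev), len(older_val))
```

Splitting within each age stratum, rather than across the pooled data set, keeps the younger and older cohorts separate for both rule development and validation.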
We considered as predictors age, sex, injury severity, loss of consciousness, seizure onset and duration, headache intensity, number of vomiting episodes, altered mental status, skull fracture, and hematoma size and location, according to the definitions in eTable 1 in the Supplement. All predictors are evaluable at presentation. We excluded assessments of dizziness because they have insufficient interobserver agreement.15 Following the PECARN rules, we considered children who had a GCS score of 14 and agitation, somnolence, repetitive questioning, or slow responses to verbal communication as having altered mental status.
In the original data set, 980 of 6465 children (15.2%) with an altered mental status received such a designation for what were considered other reasons. Because we could not operationalize this description, we did not consider the 980 children as exhibiting altered mental status, which should adversely affect our predictive instruments. We developed optimal classification trees (OCTs) using the method developed by Bertsimas and Dunn.13 Optimal classification trees are classification trees analogous to the classification and regression trees (CARTs)16 that were used to derive the original PECARN rules but that are fit with a novel method (mixed integer optimization) that provably outperforms the classical CART-fitting algorithms.13 Optimal classification trees were tuned to be approximately as likely to miss a ciTBI case as the original PECARN rules are. We weighted true-positives 500 times more than false-positives.
The PECARN rules group children in 3 risk categories (very low, low, and higher) (Table 1) for which PECARN offers decision rules. For the very low-risk category, which corresponds to mean predicted risks less than 0.02% for children younger than 2 years and less than 0.05% for children 2 years or older, PECARN recommends management without CT imaging. For children in the higher-risk category, where the mean predicted risk of ciTBI is approximately 4.3% to 4.4% in both strata, PECARN recommends CT imaging. PECARN recommends that evaluation of the remaining children, whose average risk is 0.8% to 0.9% in the 2 groups, be managed with observation vs CT imaging depending on clinician experience, parental preferences, and isolated vs multiple findings.8 Optimal classification trees can give predicted ciTBI risk for each child. We categorized OCT risk predictions in 3 groups of very low, low, and higher risk using as cutoffs predicted risks of 1 in 1000 and 3 in 100 in both age strata. These cutoffs are concordant with the predicted risk levels at which the PECARN rules switch between risk categories. Because the discriminatory ability of the OCTs and the original PECARN rules differs, the average OCT-predicted risks in each category differ. The average OCT-predicted risks were less than 0.05% in the very low-risk, 1.1% to 1.5% in the low-risk, and greater than 4% in the higher-risk categories.
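The grouping of OCT risk predictions can be sketched as a simple threshold function. The cutoffs are those stated above (1 in 1000 and 3 in 100), with the boundary handling taken from the figure legend (less than 1 per 1000; 1 per 1000 or greater to less than 3 per 100; 3 per 100 or greater). The function and constant names are assumptions made for illustration.

```python
VERY_LOW_CUTOFF = 1 / 1000  # below this: very low risk
HIGHER_CUTOFF = 3 / 100     # at or above this: higher risk


def risk_category(predicted_risk):
    """Map a predicted ciTBI probability to one of 3 risk groups."""
    if not 0.0 <= predicted_risk <= 1.0:
        raise ValueError("predicted_risk must be a probability in [0, 1]")
    if predicted_risk < VERY_LOW_CUTOFF:
        return "very low"
    if predicted_risk < HIGHER_CUTOFF:
        return "low"
    return "higher"


# Illustrative probabilities only; these are not predictions from the OCTs.
for risk in (0.0004, 0.012, 0.045):
    print(risk, risk_category(risk))
```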
The primary goal of the tool is to correctly identify children at very low risk. For each classifier, we estimated the sensitivity, specificity, positive and negative predictive values, and positive and negative likelihood ratios in the validation cohorts using the very low-risk cutoff. For each metric, we compared OCTs and the PECARN rules by estimating odds ratios (ORs) of probability metrics and ratios of likelihood ratio metrics accounting for the paired nature of the observations. All 95% CIs are 2-sided. Statistical significance was assessed at the α = .05 level.
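The six accuracy measures can be computed from a 2 × 2 table, as in the sketch below. A positive prediction here means that the child was not classified as very low risk. The example counts are an assumption: they are pooled over the development and validation cohorts of the older stratum, derived from the totals reported in this article (31 694 children, 278 with ciTBI, 10 missed by the OCTs, and 20 594 correctly classified as very low risk), so the output is not expected to reproduce the validation-cohort estimates in Table 3. The paired comparison between the 2 sets of rules and the 95% CIs are not shown.

```python
def accuracy_metrics(tp, fp, fn, tn):
    """Return six accuracy measures from the cells of a 2x2 table.

    Assumes every denominator is nonzero; no guard is included here.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


# Assumed pooled counts for the OCTs in the older stratum (see lead-in).
total, citbi, missed, very_low_without_citbi = 31694, 278, 10, 20594
tn = very_low_without_citbi      # no ciTBI, classified very low risk
fn = missed                      # ciTBI, classified very low risk
tp = citbi - missed              # ciTBI, classified low or higher risk
fp = (total - citbi) - tn        # no ciTBI, classified low or higher risk

metrics = accuracy_metrics(tp, fp, fn, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```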
All 42 412 patients had outcome data. For each predictor listed in eTable 1 in the Supplement, we examined whether missing a value was associated with the outcome using logistic regressions. There was no evidence of strong associations between the outcome and missingness in various predictors, suggesting that data are missing completely at random.17 For variables that could not be assessed by the clinician at presentation (eg, duration of seizures or loss of consciousness), we assigned missing values to a separate category of unknown or unknowable at presentation. For variables that could be assessed by the clinician at presentation, we imputed missing values using a flexible framework that applies formal optimization to impute missing values.14 This framework has resulted in substantial improvement in imputation accuracy over typical approaches, such as the expectation-maximization algorithm and predictive mean matching, in many real-world data sets.14
In sensitivity analyses, we examined whether results changed when only complete cases (children with no missing values) were used. We also compared OCTs vs CARTs that we developed de novo in the development data set using the same approach as in the Kuppermann et al study.8 All results from sensitivity analyses were congruent with those from the main analysis and are not shown. Analyses were programmed in Julia, version 0.5.0 (reference 18); R, version 3.3.1 (reference 19); JAGS, version 4.4.0 (reference 20); and Stata, version 15 (reference 21).
Compared with the PECARN rules, the corresponding OCT in Figure 1 identified 33% more children younger than 2 years and predominantly nonverbal as being at very low risk (OCT, n = 7605 vs PECARN, n = 5702) (Table 2). For example, a 12-month-old child who fell from a height of 1 m (a high-severity injury mechanism) and has an isolated finding of a frontal hematoma would be considered at very low risk by the OCTs but would be recommended for observation vs CT scans at the discretion of the clinician with the PECARN rules. However, had this child been younger than 6 months, the OCT prediction would be of low risk and would agree with the PECARN classification. Conversely, as reported in Table 2, the OCTs place 32% fewer patients than the PECARN rules in the higher-risk category (1013 vs 1492 in the whole data set), in which CT scanning is recommended, while correctly including a similar number of patients with ciTBI in the higher-risk category (67 vs 66).
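The 2 percentages in this paragraph follow from the reported counts, as the short check below shows. It takes the OCT very low-risk count as 7605 and the PECARN count as 5702, which matches the direction reported in the abstract; the helper name `relative_change` is an assumption.

```python
def relative_change(new, reference):
    """Relative difference of `new` vs `reference`, as a fraction."""
    return (new - reference) / reference


# Very low-risk group, children younger than 2 years: OCT vs PECARN.
more_very_low = relative_change(7605, 5702)

# Higher-risk group, whole data set: OCT vs PECARN (a negative value
# means that the OCTs place fewer children in this category).
fewer_higher = relative_change(1013, 1492)

print(f"{more_very_low:+.1%}")
print(f"{fewer_higher:+.1%}")
```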
The OCTs correctly identified all patients with ciTBI in both the development and validation cohorts as not being in the very low-risk group. The PECARN rules misclassified as very low risk a newborn who was hospitalized for at least 2 nights and had evidence of TBI on CT imaging. The child had no recorded suggestive signs but was reported to have been injured with a moderate severity injury mechanism (patient A in eTable 2 in the Supplement).
Table 3 compares the predictive performance of OCTs vs the PECARN rules’ classification of children in very low-risk vs low- or higher-risk groups. Compared with the PECARN rules, OCT had statistically significantly higher specificity (difference of 18.3%; OR, 2.22; 95% CI, 2.13-2.31 in the development cohort and 16.5%; OR, 2.02; 95% CI, 1.87-2.18 in the validation cohort), and significantly more favorable positive predictive value (ORs of 1.68; 95% CI, 1.59-1.78 and 1.54; 95% CI, 1.36-1.74, in the development and validation cohorts, respectively) and positive likelihood ratios (risk ratios of 1.68; 95% CI, 1.59-1.78 and 1.54; 95% CI, 1.36-1.74, in the development and validation cohorts, respectively). As reported in Table 3, OCTs and PECARN were not statistically significantly different in terms of sensitivity, negative predictive values, and negative likelihood ratios, as the 95% CIs for their relative effects include 1; however, the point estimates for the differences favored the OCTs.
Among children who were 2 years or older and predominantly verbal, the OCTs (Figure 2) identified 14% more children without ciTBI in the very low-risk stratum compared with the PECARN rules (20 604 vs 18 143, respectively) (Table 2). Conversely, the OCTs identified 8% fewer patients than the PECARN rules in the higher-risk category, where CT imaging is recommended.
In the older stratum, the PECARN rules missed 9 and the OCT rules missed 10 of 278 patients with ciTBI (proportionally split in the development and validation cohorts) (Table 2). A detailed description of the misclassified cases with each predictive instrument is reported in eTable 2 in the Supplement.
The OCTs had statistically significantly higher specificity (difference of 7.7%; OR, 1.39; 95% CI, 1.37-1.42, in the development and 8.0%; OR, 1.41; 95% CI, 1.36-1.46, in the validation cohort) and significantly more favorable positive predictive value (ORs of 1.22; 95% CI, 1.18-1.26 and 1.23; 95% CI, 1.17-1.30, in the development and validation cohorts, respectively) and positive likelihood ratios (risk ratios of 1.22; 95% CI, 1.18-1.26 and 1.23; 95% CI, 1.17-1.30, in the development and validation cohorts, respectively) (Table 3). Optimal classification trees and PECARN were not statistically significantly different in terms of sensitivity, negative predictive values, and negative likelihood ratios because the 95% CIs for the relative effects in Table 3 include 1. The point estimates of the relative effects did not consistently favor either tool.
Traumatic brain injury has been deemed a serious public health concern by the Centers for Disease Control and Prevention.3 In the United States, public awareness about the importance and long-term implications of head trauma has increased, and parents and guardians are more likely to bring children with head trauma to the emergency department for evaluation.2 In children who have very low risk of ciTBI, avoiding unnecessary CT imaging reduces costs and the risk of long-term radiation-induced cancer.5-7 The PECARN rules were fine tuned to identify children at very low risk of ciTBI in whom CT imaging is unlikely to change patient management22 and is not recommended.8 We developed novel predictive algorithms with the aim to improve on the success of the PECARN rules.
To examine the clinical utility of the OCTs, we first comment on their predictive accuracy vs that of the PECARN rules. However, the sensitivity, specificity, and other measures of predictive accuracy are only surrogate outcomes; improvements in predictive accuracy will not benefit patients unless the new tools are used in practice and lead to changes in diagnostic thinking and patient management.23-25 We turn to these considerations next.
Prediction instruments trade off sensitivity, which is the ability to correctly identify patients with a condition (herein, ciTBI), and specificity, which is the ability to correctly identify patients who do not have that condition. We presented OCTs that appear to have a better predictive accuracy than PECARN in that they have similar sensitivity but improved specificity. The improvement in specificity was statistically significant and sizable. We also believe that the improvement is clinically important and economically consequential.
The improvements in specificity were more pronounced (at least 16% or by an OR of 2) among children younger than 2 years, who are predominantly preverbal and for whom concerns about radiation exposure may be greater compared with concerns in older children. The improvements in specificity among older children were approximately half of those (at least 7% or by an OR of 1.4), yet they still resulted in 8% more children correctly identified in the very low-risk group, potentially decreasing use of CT imaging.
Comparisons between OCTs and the PECARN rules with respect to other measures are congruent with the notion that OCTs have better predictive accuracy and are complementary to the comparisons of sensitivity and specificity. However, interpreting the magnitude of the differences in other measures is not always straightforward. For example, both the negative predictive value (the probability that a child predicted to be at very low risk does not have ciTBI) and positive predictive value (the probability that a child predicted to be at low or higher risk has ciTBI) depend on the prevalence of ciTBI, which was less than 1% in this sample. In the extreme, a prediction rule that considers all children to be at very low risk would have a negative predictive value and positive predictive value within 2% of that of the PECARN rules and the OCTs. Comparing predictive rules on the basis of different negative or positive likelihood ratios, which express the relative change in the initial odds of ciTBI conferred by a prediction of very low risk (or of low or higher risk for positive likelihood ratio), can also be challenging.26
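The dependence of the predictive values on prevalence can be illustrated with Bayes' rule. In the sketch below, the prevalence of 1% and the sensitivity of 0.97 are assumptions chosen for illustration; the 2 specificities are rounded from the validation-cohort estimates for younger children reported in the abstract (52.8% and 69.3%).

```python
def ppv(sensitivity, specificity, prevalence):
    """Probability of ciTBI given a low- or higher-risk prediction."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sensitivity, specificity, prevalence):
    """Probability of no ciTBI given a very low-risk prediction."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


PREVALENCE = 0.01   # assumed; the text reports a prevalence below 1%
SENSITIVITY = 0.97  # assumed for illustration only

for label, spec in (("lower specificity", 0.53), ("higher specificity", 0.69)):
    print(label,
          f"PPV={ppv(SENSITIVITY, spec, PREVALENCE):.4f}",
          f"NPV={npv(SENSITIVITY, spec, PREVALENCE):.4f}")

# A rule that labels every child very low risk (sensitivity 0,
# specificity 1) has NPV = 1 - prevalence. Its PPV has a zero
# denominator because the rule never predicts low or higher risk,
# so ppv() is not called for this case.
print(f"NPV of the trivial rule: {npv(0.0, 1.0, PREVALENCE):.4f}")
```

At this low prevalence, the negative predictive value stays close to 1 for every rule considered, and the positive predictive value stays small, which is why differences in these measures are harder to interpret than differences in specificity.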
Predictive tools are more likely to be adopted in clinical practice if they are simpler, have face validity, and are easy to apply. The PECARN rules given in Table 1 are a translation of a prediction instrument (a CART tool8) into clinical decision rules that are simple and easy to memorize.27 The price for this simplicity is that the PECARN rules have measurably lower predictive accuracy than the OCTs, as discussed above, and provide only a coarse risk stratification, which is not sufficiently granular for the downstream management of a substantial proportion of children. Specifically, the PECARN rules invite clinicians to further stratify children in the low-risk category into those with risk substantially lower than 1% vs not, based on the presence of isolated loss of consciousness,28 isolated headache,29 isolated vomiting,30 or certain types of scalp hematoma for children older than 3 months.31,32 Relying on isolated findings for risk stratification is an inefficient heuristic because it amounts to considering only the main effects of some predictors, which are obtained from separate analyses, and ignoring interactions. By contrast, the OCTs consider all predictors and their interactions and give more-granular risk predictions (Figure 1 and Figure 2). For these reasons, it is debatable whether the PECARN rules’ apparent simplicity is always worth its price.
Clinicians trust and use predictive rules that have face validity.27 The OCTs shown in Figure 1 and Figure 2 use the same factors as the PECARN rules and make splits that are congruent with previous analyses28-36 and with clinical intuition. With respect to ease of application, although the OCTs use the same clinical information as the PECARN rules, the OCTs are not easy to memorize and should be implemented as a tool in the electronic health record. This lower level of simplicity may limit use of OCTs in general emergency departments, where most children are seen. We provide a web tool to demonstrate that, once implemented, OCTs are easy to use.37
The clinical utility of predictive instruments for minor head trauma has been examined in various cost-effectiveness analyses, which generally favor the implementation of predictive instruments.38-40 In a randomized trial, guardians who used the decision aid based on the PECARN rules had greater knowledge, less decisional conflict, and reported increased participation in decision-making compared with guardians who were not exposed to the tool.41 However, to our knowledge, there is no evidence of an overall reduction in CT imaging since the introduction of the PECARN rules.42 Similarly, it is not clear whether implementing the OCTs would change clinical and economic outcomes.
There are limitations to the study. We developed the OCTs using state-of-the-art approaches, but we have not validated their predictive performance outside the PECARN cohort. Internal validation efforts are not foolproof substitutes for completely independent empirical validation of the tool in clinical practice, as has been done with the original PECARN rules.11,12 The fact that the OCTs’ predictive performance did not degrade appreciably in the validation set may portend a successful validation in an external data set. The OCTs should be empirically evaluated in a prospective implementation to measure whether they can actually improve outcomes.
Optimal classification tree–based rules may have better predictive performance and provide personalized and more-granular risk predictions than the PECARN rules. However, OCTs are inherently more complicated than the PECARN rules because they include more predictors (eg, age), encode predictors in several levels instead of dichotomizing them, and examine interactions between predictors. In practice, OCTs would have to be integrated into the electronic health record to provide real-time personalized risk predictions. The OCTs are an alternative to the PECARN rules. Clinicians who are partial to using decision rules might value the simplicity of the PECARN rules and the accompanying treatment recommendations.8 Clinicians who are not willing to adopt the risk-benefit tradeoffs implied by the PECARN rules may prefer to obtain risk predictions and make decisions according to their own and the guardians’ preferences and risk attitudes.27,43 We surmise that large health systems that aim to optimize operations by capitalizing on better predictive performance would consider easy-to-use implementations of the OCTs in their systems.
Accepted for Publication: January 29, 2019.
Corresponding Author: Thomas A. Trikalinos, MD, Center for Evidence Synthesis in Health, Brown University, 121 S Main St, Providence, RI 02912 (email@example.com).
Published Online: May 13, 2019. doi:10.1001/jamapediatrics.2019.1068
Author Contributions: Dr Trikalinos and Mr Wang had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: All authors.
Acquisition, analysis, or interpretation of data: Dunn, Steele, Trikalinos, Wang.
Drafting of the manuscript: All authors.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Bertsimas, Dunn, Trikalinos, Wang.
Obtained funding: Bertsimas.
Administrative, technical, or material support: Bertsimas, Steele, Trikalinos, Wang.
Supervision: Bertsimas, Trikalinos.
Conflict of Interest Disclosures: None reported.
Disclaimer: We analyzed the Public Use Data Set prepared by the staff at the Data Coordinating Center, University of Utah School of Medicine on behalf of the Pediatric Emergency Care Applied Research Network (PECARN). These data are from the Identification of Children at Very Low Risk of Clinically Important Brain Injuries After Head Trauma: A Prospective Cohort Study (TBI Study). Our analysis and conclusions do not necessarily reflect the opinions or views of the TBI Study investigators or the Health Resources & Services Administration, Maternal Child Health Bureau, or the Emergency Medical Services for Children, which funded the original research.