Objective
To develop a comorbidity model for children that can be used with hospital discharge administrative databases.
Design
Retrospective study using administrative data obtained from the Canadian Institute for Health Information Discharge Abstract Database and the Deaths File to develop a logistic regression model. The Hosmer-Lemeshow χ2 test was used to examine model fit, the C statistic to assess model discrimination, and bootstrapping to determine the stability of the regression coefficients.
We used linked administrative databases to compile 339 077 hospital discharge abstracts from April 1, 1991, through March 31, 2002.
Patients
Children between ages 1 and 14 years in Ontario, Canada.
Main Outcome Measure
Death within 1 year of hospital discharge.
Results
The 27-variable pediatric comorbidity model predicted 1-year mortality with a C statistic of 0.83 in the Ontario data set from which it was derived. The presence of brain cancer (odds ratio, 76.38 [95% confidence interval, 53.40-109.27]) at hospital admission was the strongest predictor, followed by diabetes insipidus (odds ratio, 39.23 [95% confidence interval, 20.75-74.17]).
Conclusions
Using clinical judgment and empirical modeling strategies, we identified 27 diagnoses highly predictive of death within 1 year of hospital discharge for children between 1 and 14 years of age.
Comparison of health care outcomes across treatments, institutions, or health care professionals requires adjustment for risks, such as comorbidity, that inherently increase or decrease the likelihood of poor health outcomes, thereby allowing more valid comparisons. Comorbidities are defined as acute or chronic conditions unrelated to the principal diagnosis responsible for hospital admission.1-3 Patients with comorbidities have higher risks of complications and death, are less responsive and tolerant to conventional therapies, and have longer hospital stays and higher resource use.1-6 Although studied extensively in adults, comorbidity has received little attention in children. Because low proportions of children require long-term care,4,7,8 pediatric risk adjustments have focused mainly on acute care settings that use physiologic indicators, such as the Pediatric Risk of Mortality Score,9 the Clinical Risk Index for Babies,10 and the Score for Neonatal Acute Physiology,11 to predict imminent in-hospital death of gravely ill patients. While acute clinical factors are central to evaluating risk for imminent (ie, 30 days) deaths and complications, their usefulness for predicting long-term outcomes (ie, 1 year) is probably limited. The purpose of this study was to identify comorbid conditions related to death within 1 year of hospital discharge for children and adolescents between ages 1 and 14 years. Infants younger than 1 year were omitted because mortality and morbidity in this age group are mostly due to congenital conditions. Children 14 years and older were excluded because beyond this age disease epidemiology is more akin to that of young adults.
Hospital discharges for children between ages 1 and 14 years in Ontario, Canada, from April 1, 1991, through March 31, 2002, were compiled from the Canadian Institute for Health Information (CIHI) Discharge Abstract Database. The CIHI is a federally chartered institution established to provide health information pertinent for health services research. Abstracts are completed by hospital-based coders after patients leave the hospital, recording diagnoses noted during the entire hospital admission. The Ontario discharge data contain 16 diagnoses and 10 surgical procedures coded using the International Classification of Diseases, Ninth Revision12 (ICD-9) codes. Corresponding diagnosis-type indicators specify whether each diagnosis is a preexisting comorbidity or a complication. Only diagnoses coded as comorbidities were used to develop our prediction model. In addition, demographic (eg, sex and date of birth) and administrative (eg, hospital admission and discharge dates) information were obtained. According to the Institute for Clinical Evaluative Sciences Practice Atlas13 chapter entitled “A Summary of Studies on the Quality of Health Care in Administrative Databases in Canada,” the documentation of patient demographic variables in the health administrative databases is complete and reliable. Recording of comorbid conditions is accurate but may be incomplete. Undercoding is an inherent limitation of all administrative databases.
To determine patient status, we used the Deaths File created by the Institute for Clinical Evaluative Sciences in North York, Ontario. The Deaths File has 1 observation for every resident in Ontario with a valid health card number who is registered dead in either the Registered Persons Database or CIHI record of in-hospital deaths. The Registered Persons Database is an administrative database maintained by the Ministry of Health and Long-Term Care to keep track of all Ontario Health Insurance Plan users. The Registered Persons Database records patient information including birth date, sex, postal code, and death date. The Deaths File is updated whenever new Registered Persons Database or CIHI data become available. Each year between 2% and 3% of deaths are reported only in CIHI. In case of discrepancy, the CIHI death date supersedes.
Using the Institute for Clinical Evaluative Sciences key number, an encrypted form of health card number (a unique patient identifier), we linked the CIHI Discharge Abstract Database with the Deaths File to identify patients' vital status (ie, alive or dead) at 1 year from the date of hospital discharge. Twenty-seven records with death dates preceding those of hospital discharge were deleted. Our decision to study mortality at 1 year was based on the need to extend our observation window to capture more deaths because childhood mortality is a rare outcome. Although factors other than comorbidities may affect vital status in a longer window, we chose 1 year so that our model would have greater utility than one predicting 30- or 60-day mortality.
From this linked database, we identified 541 825 first and recurring hospital discharges from Ontario hospitals for children between 1 and 14 years of age within the 8-year accrual window, from April 1, 1993, to April 1, 2001. To make certain that we included only 1 abstract per inpatient, we used only the index or first hospital discharge in 2 years, enumerating backward from the date of hospital discharge. Based on this criterion, there were 339 077 index hospital discharges. A comparison of the numbers of total and index hospital discharges indicates that the volume of readmissions was low.
The 16 diagnosis fields in the linked database were collapsed, yielding 1294 distinct ICD-9 codes recorded for those who died subsequent to their index hospital discharge. Codes were ranked from most to least frequent. The top 200 represented 90.8% of 1-year deaths, whereas the next 200 represented only an additional 4.4% (Figure). If we had compiled diagnoses irrespective of the outcome, we would have identified prevalent conditions, such as common cold, that are almost certainly not associated with mortality. Including such diagnoses would leave the model with little or no discriminative power.
Cumulative percentage of 1-year deaths represented by the top 400 of 1294 most frequently recorded International Classification of Diseases, Ninth Revision12 codes for the cohort of inpatients (n = 1988) who died following their index hospital discharges.
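The code-ranking step described above can be sketched as follows. This is an illustration in Python (the study used SAS), and the diagnosis codes and counts are invented for demonstration, not taken from the study data:

```python
from collections import Counter

def rank_codes(diagnoses):
    """Rank ICD-9 codes recorded for decedents from most to least frequent
    and return each code's count and cumulative percentage of all mentions."""
    counts = Counter(diagnoses)
    total = sum(counts.values())
    cumulative, running = [], 0
    for code, n in counts.most_common():   # descending frequency
        running += n
        cumulative.append((code, n, 100.0 * running / total))
    return cumulative

# Hypothetical codes recorded for inpatients who later died
codes = ["191.9", "191.9", "191.9", "253.5", "253.5", "345.9", "486"]
for code, n, pct in rank_codes(codes):
    print(code, n, round(pct, 1))
```

In the study, the same cumulative tally over 1294 codes showed that the top 200 captured 90.8% of 1-year deaths, motivating the cutoff.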
We aimed to include the fewest variables that would still adequately adjust for risk of 1-year mortality, not only because a smaller model is easier to use but also because it avoids the peril of overfitting, which arises when there are fewer than 10 outcome events per risk factor.14,15 Both empirical modeling strategies and clinical judgment were used to develop our prediction model. First, a panel of 2 pediatric clinicians (ie, a surgeon and a pediatrician) independently aggregated the 200 ICD-9 codes into homogeneous diagnostic categories (eg, seizures and epilepsy). The clinicians convened to reconcile the minor differences between diagnostic groupings. Second, we used the Pearson χ2 test to examine the association between each diagnostic group and death within 1 year of index hospital discharge, using a threshold of P<.20. This threshold was chosen because a more lenient cutoff would include too many variables, whereas a more stringent one would fail to pick up important ones. Third, we omitted procedure codes that were not diagnoses (eg, kidney donor, V594). Fourth, we deleted diagnosis codes that were not clinically plausible predictors of 1-year mortality (eg, cardiac arrest, 4275). Fifth, diagnostic groups associated with fewer than 15 deaths were removed from further analysis because our aim was to capture diagnoses that were both relatively frequent and significantly associated with 1-year mortality. The remaining variables were entered into a multivariable logistic regression model, and forward stepwise regression was used to select variables, so that only those significant at P<.20 remained in the final model. A more stringent P value, such as P=.05, is not recommended because of the risk of excluding important variables from the prediction model.16
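The univariable χ2 screening step can be sketched as below. The function names (`chi2_2x2`, `screen`) and the 2 × 2 counts are hypothetical, chosen only to illustrate how a diagnostic group passes or fails the P<.20 threshold:

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic and P value (1 df) for a 2x2 table:
           died   survived
    dx+      a       b
    dx-      c       d
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of chi-square with 1 df via the complementary
    # error function: P(X > x) = erfc(sqrt(x / 2))
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

def screen(groups, alpha=0.20):
    """Keep diagnostic groups whose univariable P value is below alpha."""
    return [name for name, table in groups.items()
            if chi2_2x2(*table)[1] < alpha]

# Hypothetical counts: (died with dx, survived with dx,
#                       died without dx, survived without dx)
groups = {"brain cancer": (40, 60, 840, 99060),
          "common cold":  (9, 1000, 871, 98120)}
print(screen(groups))
```

A strongly associated diagnosis survives the screen while a prevalent but outcome-neutral one (eg, common cold) is dropped, mirroring the rationale for outcome-conditioned code selection described above.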
The goodness of fit of the final model was examined using the Hosmer-Lemeshow χ2 test. The Hosmer-Lemeshow test measures deviations between the observed and predicted numbers of patients who were alive or dead within subgroups of similar risk. Under the null hypothesis that prediction within each subgroup is correct, a large P value indicates that the model is calibrated to the data. The discrimination of the model was calculated by measuring the area under the receiver operating characteristic curve, which is equivalent to the C statistic, ranging from 0 to 1 with 1 indicating perfect prediction. A prediction model with excellent discrimination generally has a C statistic between 0.8 and 0.9.16 If the model has no ability to discriminate patients who died less than 1 year after hospital discharge from those who did not, the expected value of the C statistic is 0.5.
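The C statistic admits a direct interpretation: it is the probability that a randomly chosen patient who died was assigned a higher predicted risk than a randomly chosen survivor. A minimal all-pairs sketch, using invented risk scores rather than the study's fitted probabilities:

```python
def c_statistic(risks, died):
    """Concordance (C statistic): the fraction of (death, survivor) pairs
    in which the patient who died has the higher predicted risk; tied
    risks count one half.  Equivalent to the area under the ROC curve."""
    cases = [r for r, d in zip(risks, died) if d]
    controls = [r for r, d in zip(risks, died) if not d]
    concordant = sum((rc > rs) + 0.5 * (rc == rs)
                     for rc in cases for rs in controls)
    return concordant / (len(cases) * len(controls))

# Hypothetical predicted 1-year mortality risks and observed outcomes
risks = [0.90, 0.80, 0.30, 0.20, 0.10]
died  = [True, False, True, False, False]
print(c_statistic(risks, died))
```

With 2 deaths and 3 survivors there are 6 pairs; 5 are concordant here, giving C ≈ 0.83, coincidentally the value the study reports for its model. This pairwise form is O(n²); on large databases the equivalent ROC-area computation is used instead.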
Bootstrap estimates of standard errors, a technique developed by Efron17 in 1979, were calculated to assess the accuracy of parameter estimates of diagnostic groups included in the final model. Bootstrapping relies on drawing a large number of independent bootstrap samples.17 Each bootstrap replicate is obtained by randomly sampling n times, with replacement, from the original database of size n. Because a bootstrap replicate is drawn with replacement, each replicate might contain multiple copies of an observation or none at all. Our bootstrap algorithm generated 150 replicates. Parameter estimates were calculated for each, and the standard deviation of these point estimates was the bootstrap estimate of standard errors. The standard errors were used to compute 95% bootstrap confidence intervals (CIs) for parameter estimates, which were exponentiated to yield 95% bootstrap CIs of odds ratios (ORs). SAS statistical software (SAS Institute Inc, Cary, NC) was used to conduct all statistical analyses in this study.
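The resampling scheme described above can be sketched as follows. To keep the example self-contained, it bootstraps the log odds ratio from a 2 × 2 table rather than refitting a full logistic regression on each replicate; the patient counts and function names are invented for illustration:

```python
import math
import random

def table_counts(rows):
    """Collapse patient rows (has_dx, died) into 2x2 cells a, b, c, d."""
    a = sum(1 for e, d in rows if e and d)          # dx present, died
    b = sum(1 for e, d in rows if e and not d)      # dx present, survived
    c = sum(1 for e, d in rows if not e and d)      # dx absent, died
    d = sum(1 for e, d in rows if not e and not d)  # dx absent, survived
    return a, b, c, d

def log_odds_ratio(a, b, c, d):
    return math.log((a * d) / (b * c))

def bootstrap_or_ci(rows, n_reps=150, seed=1):
    """Bootstrap SE and 95% CI for an odds ratio, as in the study's
    150-replicate algorithm: each replicate resamples the n patient rows
    with replacement, so a patient may appear several times or not at all."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_reps):
        sample = [rng.choice(rows) for _ in rows]
        cells = table_counts(sample)
        if min(cells) > 0:                     # skip degenerate replicates
            estimates.append(log_odds_ratio(*cells))
    mean = sum(estimates) / len(estimates)
    se = math.sqrt(sum((e - mean) ** 2 for e in estimates)
                   / (len(estimates) - 1))     # SD of replicate estimates
    point = log_odds_ratio(*table_counts(rows))
    # Exponentiate the log-scale limits to obtain the CI for the OR itself
    return math.exp(point), (math.exp(point - 1.96 * se),
                             math.exp(point + 1.96 * se))

# Hypothetical patient rows: (has diagnosis, died within 1 year)
rows = ([(True, True)] * 30 + [(True, False)] * 70 +
        [(False, True)] * 50 + [(False, False)] * 850)
odds_ratio, (lo, hi) = bootstrap_or_ci(rows)
print(round(odds_ratio, 2), round(lo, 2), round(hi, 2))
```

The standard deviation of the 150 replicate estimates serves as the bootstrap standard error, and exponentiating the log-scale limits yields the 95% CI for the OR, exactly as described for the regression coefficients in the study.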
In total, 595 905 ICD-9 codes were compiled for 339 077 index hospital discharges, identified by unique Institute for Clinical Evaluative Sciences key numbers. Each hospital discharge was associated with 1 or more diagnosis codes. The minimum, maximum, and median numbers of ICD-9 codes for index hospital discharges were 1.0, 16.0, and 3.0, respectively. One thousand nine hundred eighty-eight deaths were identified subsequent to index hospital discharges, of which 880 occurred within 1 year of discharge and 1108 after 1 year. Descriptive statistics for age and length of stay for index hospital discharges are presented in Table 1.
A list of the 37 diagnostic groups entered into multivariable logistic regression is presented in Table 2 along with the ICD-9 codes used to generate them. The frequency of each group was recorded during index hospital admission, along with the number of inpatients who had the diagnosis and died within 1 year of the date of hospital discharge. Table 3 presents the results of the logistic regression along with the bootstrap results. Six diagnostic groups did not enter the model based on the preset P value of .20: cerebral palsy, urinary tract infection, pleural effusion, anemia, esophagitis, and acute renal failure. Four diagnostic groups (ie, gastroenteritis, otitis media, dehydration, and asthma) that had negative associations with 1-year mortality were excluded. Hence, the final pediatric comorbidity prediction model consists of 27 variables. Table 3 presents the logistic regression coefficients and associated ORs with 95% CIs. The presence of brain cancer at hospital admission was the strongest predictor of 1-year post–hospital discharge mortality (OR, 76.38 [95% CI, 53.40-109.27]), followed by diabetes insipidus (OR, 39.23 [95% CI, 20.75-74.17]).
The Hosmer-Lemeshow χ2 test was performed to examine goodness of fit. Owing in part to the large sample size, the test indicated poor calibration (P<.001). However, the model's discrimination was excellent, with a C statistic of 0.83. A bootstrap algorithm was executed to draw 150 replicates with replacement from the database to estimate bootstrap standard errors, which in turn were used to generate 95% bootstrap CIs. Table 3 lists the adjusted ORs, in descending order, with their respective 95% bootstrap CIs.
Sicker patients, even with optimal care, are likely to experience worse outcomes than their healthier counterparts. The ability to make valid comparisons and thereby draw unbiased conclusions about quality of care hinges on the availability of appropriate risk-adjustment models to account for differences in patients' characteristics. Our goal was to create a statistical model that can be generated from an administrative hospital discharge database. Although clinical data are the gold standard for risk adjustment, administrative data are less expensive to acquire and less cumbersome to use because the data are already collected and computer readable. For these reasons, it is likely that administrative data will be used more regularly in the future for risk adjustment purposes. Administrative data have been used successfully in adults to identify potential comorbidity risk factors for outcomes, such as death.18-25 The CIHI database used in our study encompassed an entire population, the province of Ontario (with negligible care provided to residents outside the province), permitting stable estimates of rare events, such as childhood mortality.
Validity is a crucial attribute that encompasses both clinical and statistical considerations. For health care professionals to accept and ultimately use our mortality prediction rule, it must be clinically valid. Health care professionals appraise the validity of a model by examining whether it includes the major variables that they judge to be important. To reduce the model to a working number of diagnoses and not rely exclusively on statistical modeling, in our first step we used clinicians to cluster homogeneous codes into independent diagnostic groups. This was performed in part recognizing that ICD-9 codes lack diagnostic criteria and many diagnostic labels are used interchangeably. The 2 clinicians involved are experienced health services researchers who have a solid understanding of the health administrative databases used in this study and the use of ICD-9 and International Statistical Classification of Diseases, 10th Revision26 codes in documenting diagnoses. Instead of relying solely on their own clinical experience to prognosticate mortality, their judgments on grouping were guided by data. Each diagnostic group almost certainly has a differing mortality risk. Even within each diagnosis, severity of disease affects outcome. However, the purpose of this study was to develop a model that provided the ability to adjust for risk of mortality, not to determine mortality risk for every diagnosis. Other extensively used comorbidity indexes have been constructed using similar principles. For example, the Charlson Index,27 the most commonly used comorbidity risk adjustment tool for adults, lumps all cancers into 1 group, although different cancers have strikingly different mortality rates.
One standard model-performance indicator is how well risk-adjusted predictions fit the actual outcomes. A small P value suggests poor fit. In this study, because the sample size was large, even small discrepancies between the model's prediction and observed counts were anticipated to be significant. The C statistic was calculated to evaluate the model's predictive performance. In absolute terms, the ability of our model to predict 1-year mortality was excellent, as indicated by a C statistic of 0.83. In relative terms, Wang and colleagues22 adapted the Charlson Index27 to a Medicare claims database to predict 1-year mortality and reported a C statistic of 0.72.
In theory, it would be optimal to establish a model that handles more diagnostic details as well as being thoroughly calibrated. However, a tool that is too detailed may be too laborious to use. While we might be able to produce a better-calibrated model by including more diagnostic information, it could be difficult to validate such a model with other data sets. Moreover, we could be calibrating for many rare diagnoses that have limited usefulness.
Discrimination is more important than calibration because the objective is to create a general risk-adjustment model with the intent to adjust for baseline differences in average background comorbidity. If the study population has specific high-risk comorbidities, the investigator should identify the specific and relevant covariates to be incorporated in the adjustment model. For example, when studying a specific population of children with airway obstruction who underwent 2 different modalities of tracheostomy, additional clinical comorbidities associated with tracheostomy, such as congenital airway anomalies, should be included in the risk-adjustment model. We believe it is not possible to produce a parsimonious general risk-adjustment tool that is well calibrated for all conceivable pediatric populations.
Bleeker et al28 suggested that both internal and external validations be used to ascertain the performance of a new regression model. In our study, rather than using the split-half method, we used the entire data set to develop the model and performed bootstrapping as the measure of internal validation. Bootstrapping to assess the stability of the individual coefficients is sensitive to the destabilizing effect of a few highly influential observations within a diagnostic group.
Because our model was developed using an entire population, external validation might not be as critical. However, to evaluate whether we developed a model of pediatric comorbidity that predicts well in diverse settings, retesting of the model in an independent database to evaluate the coefficients derived in an Ontario population is important. We expect that slight differences in patterns of care and coding practices could affect the type of clinical data available for risk adjustment and could potentially change the relationship between the risk factors and outcomes. Future studies should also examine whether the pediatric comorbidity model has utility in predicting other outcomes, such as length of hospital stay.
The model has several strengths. First, the model has a relatively small number of variables that can be easily generated and replicated in other administrative databases using ICD-9 codes. Second, to our knowledge, it was the first population study that used administrative data to quantify the relationship between individual pediatric comorbidities and 1-year mortality. Third, the variables in the model are clinically sensible because pediatric clinicians were involved at all stages of model building.
The study has 2 potential limitations. First, we were not able to compare the model's performance against that derived from health records. Second, the model needs to be externally validated. Nonetheless, our research offers the first Charlson-like tool for pediatrics to adjust for comorbidity using administrative data. We anticipate the pediatric comorbidity risk-adjustment model will prove to be useful to outcome researchers working with administrative data in assessing the existence and direction of confounding by comorbidity.
Correspondence: Derek Tai, MSc, University of Toronto, 133 Wild Orchid Crescent, Markham, Ontario L6C 1V6, Canada (email@example.com).
Accepted for Publication: October 13, 2005.
Funding/Support: This study was supported by the Research Institute, The Hospital for Sick Children, Toronto, Ontario.
Additional Information: Dr Wright is the RB Salter Chair of Pediatric Surgical Research at The Hospital for Sick Children.
Acknowledgment: We thank all biostatisticians at the Institute for Clinical Evaluative Sciences (North York, Ontario) for help with SAS programming.
Tai D, Dick P, To T, Wright JG. Development of Pediatric Comorbidity Prediction Model. Arch Pediatr Adolesc Med. 2006;160(3):293–299. doi:10.1001/archpedi.160.3.293