Author Affiliations: Division of General Pediatrics, Center for Pediatric Clinical Effectiveness, Children's Hospital of Philadelphia (Drs Keren, Localio, McLeod, and Dai and Mr Luan) and Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania School of Medicine (Drs Keren and Localio), Philadelphia; Children's Hospital Association, Overland Park, Kansas (Dr Hall); and Division of Inpatient Medicine, Department of Pediatrics, University of Utah Health Sciences Center, and Primary Children's Medical Center and Institute for Healthcare Delivery Research, Intermountain Healthcare, Salt Lake City (Dr Srivastava).
Objective: To use information about prevalence, cost, and variation in resource utilization to prioritize comparative effectiveness research topics in hospital pediatrics.
Design: Retrospective analysis of administrative and billing data for hospital encounters.
Setting: Thirty-eight freestanding US children's hospitals from January 1, 2004, through December 31, 2009.
Participants: Children hospitalized with conditions that accounted for either 80% of all encounters or 80% of all charges.
Main Outcome Measures: Condition-specific prevalence, total standardized cost, and interhospital variation in mean standardized cost per encounter, measured in 2 ways: (1) intraclass correlation coefficient, which represents the fraction of total variation in standardized costs per encounter due to variation between hospitals; and (2) number of outlier hospitals, defined as having more than 30% of encounters with standardized costs in either the lowest or highest quintile across all encounters.
Results: Among 495 conditions accounting for 80% of all charges, the 10 most expensive conditions accounted for 36% of all standardized costs. Among the 50 most prevalent and 50 most costly conditions (77 in total), 26 had intraclass correlation coefficients higher than 0.10 and 5 had intraclass correlation coefficients higher than 0.30. For 10 conditions, more than half of the hospitals met outlier hospital criteria. Surgical procedures for hypertrophy of tonsils and adenoids, otitis media, and acute appendicitis without peritonitis were high cost, were high prevalence, and displayed significant variation in interhospital cost per encounter.
Conclusions: Detailed administrative and billing data can be used to standardize hospital costs and identify high-priority conditions for comparative effectiveness research: those that are high cost, are high prevalence, and demonstrate high variation in resource utilization.
Patients, health care providers, and payers need better evidence about the comparative effectiveness of medical therapies and health care delivery models. The American Recovery and Reinvestment Act of 2009 acknowledged this need and allocated hundreds of millions of dollars to the National Institutes of Health, the Agency for Healthcare Research and Quality, and the newly established Patient-Centered Outcomes Research Institute to fund comparative effectiveness research (CER).1 The Patient-Centered Outcomes Research Institute is a public-private partnership whose principal focus will be to formulate and fund a portfolio of CER projects to compare drugs, medical devices, tests, surgical procedures, or health care delivery models. To achieve the highest return on this investment, it is essential that scientific agencies and institutes such as the Patient-Centered Outcomes Research Institute have a means of prioritizing CER projects. To that end, Congress charged 2 groups—the Institute of Medicine (IOM) Committee on Comparative Effectiveness Research Prioritization2 and the Federal Coordinating Council for Comparative Effectiveness Research3—to solicit extensive public input and provide recommendations on priorities for the allocation of these new CER funds.
The Federal Coordinating Council for Comparative Effectiveness Research report focused its recommendations largely on the infrastructure needed to support a national CER effort. However, the IOM committee identified 100 specific research topics that should be prioritized for CER, one-fifth of which had relevance to children.4 The IOM recommendations were based on tables summarizing prevalence, morbidity, cost, mortality, and variations in treatment for each topic considered—important condition-related criteria that Dubois and Graff5 have incorporated into a larger framework for prioritizing CER research topics, which also includes research-related criteria such as the feasibility (cost and time), likelihood of success, and potential impact of the research.
While it is an important step toward establishing a research agenda for CER, the IOM list of priority conditions falls short of what is needed to prioritize CER in pediatrics. Most of the included pediatric priority topics were nonspecific in content and target population (eg, “care coordination programs for children and adults with chronic disease”), and all but one focused on care provided in the outpatient setting. The list of pediatric conditions and questions lacking a strong CER evidence base is vast, and many of the pediatric topics that have the greatest impact on overall childhood morbidity, mortality, and health care spending are managed in the hospital setting, which is the locus of rapidly emerging technologies and life-saving or life-extending therapies, especially for medically complex children.6 To provide a more detailed and comprehensive view of the pediatric conditions that should be prioritized for CER, focusing specifically on the inpatient setting, we used detailed administrative and billing data from 38 of the largest freestanding children's hospitals in the United States. From these data we generated inputs for several of the key condition-specific prioritization criteria included in the frameworks of the IOM committee and Dubois and Graff: prevalence, cost, and variation in care, the last measured as variation in resource utilization for children hospitalized with specific conditions.
This study was a retrospective cohort study. The Pediatric Health Information System (PHIS) database contains detailed hospital administrative and billing data from 43 freestanding children's hospitals affiliated with the Children's Hospital Association (formerly known as the Child Health Corporation of America). Details about the PHIS database have been reported previously.7 The Institutional Review Board of the Children's Hospital of Philadelphia deemed this study exempt from review under 45 CFR 46.102(f), as the participants were not readily identifiable.
We queried the PHIS database (January 1, 2004, through December 31, 2009) to identify the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) primary discharge diagnosis codes associated with encounters (inpatient, ambulatory surgery, and observation unit) that accounted for either 80% of all encounters or 80% of all charges.
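The selection step can be illustrated with a minimal sketch. It assumes (the text does not say) that codes are ranked in descending order and added until their cumulative share reaches 80%, once by encounter count and once by total charges, and that the two sets are then combined. The function names and the `(code, charge)` input shape are hypothetical and are not part of the PHIS database.

```python
from collections import defaultdict


def codes_covering(totals, share=0.80):
    """Return the top-ranked codes whose cumulative total first reaches `share`."""
    grand_total = sum(totals.values())
    selected, running = set(), 0.0
    # Rank codes from largest to smallest contribution (assumed ranking rule).
    for code, value in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        if running >= share * grand_total:
            break
        selected.add(code)
        running += value
    return selected


def select_codes(encounters, share=0.80):
    """encounters: iterable of (primary_diagnosis_code, charge) pairs (hypothetical shape)."""
    counts = defaultdict(int)
    charges = defaultdict(float)
    for code, charge in encounters:
        counts[code] += 1
        charges[code] += charge
    # Keep a code if it is needed to reach the threshold on either measure.
    return codes_covering(counts, share) | codes_covering(charges, share)
```

Taking the union is what allows a rare but very expensive diagnosis to enter the list alongside common, inexpensive ones.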
Two of us (R.K. and R.S.) independently reviewed this list of 701 ICD-9-CM primary discharge diagnosis codes and grouped them into distinct “conditions” based on whether the initial evaluation and management of the diagnoses are the same across individual codes. Differences in group assignments were resolved by consensus. We further divided the conditions into medical, surgical, or medical/surgical based on whether fewer than 20%, more than 80%, or between 20% and 80%, respectively, of encounters for a particular condition had an ICD-9-CM principal procedure code for a surgery related to the condition. This process resulted in 255 medical, 231 surgical, and 16 medical/surgical conditions and produced more granular classification of discharge diagnoses than would have been possible using existing classification systems such as diagnosis related groups or related classification systems.
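The medical, surgical, and medical/surgical split described above can be sketched as a simple threshold rule. The function name is hypothetical, and the handling of values exactly equal to 20% or 80% (assigned here to medical/surgical) is an assumption, since the text only says "between".

```python
def classify_condition(n_with_related_surgery, n_encounters):
    """Label a condition by the share of its encounters with a related surgical procedure code."""
    if n_encounters <= 0:
        raise ValueError("a condition needs at least one encounter")
    fraction = n_with_related_surgery / n_encounters
    if fraction < 0.20:
        return "medical"
    if fraction > 0.80:
        return "surgical"
    # Assumption: boundary values (exactly 20% or 80%) fall in the mixed group.
    return "medical/surgical"
```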
We used the ICD-9-CM procedure codes that were reported for each encounter to further restrict the cohort of children within each condition. For medical and medical/surgical conditions, we excluded children who had procedures (surgical or nonsurgical) that were unlikely to be related to the medical condition (eg, laparoscopic appendectomy for a child with a primary diagnosis of asthma). For surgical conditions, we included only children who had procedures that were likely to be related to the condition (eg, laparoscopic or open appendectomy for a child with a primary diagnosis of appendicitis). A team of 8 pediatric hospital medicine researchers decided which procedure codes would be used to include or exclude children from the condition-specific cohorts. The medical and surgical conditions were divided into 4 lists, and 2 researchers were assigned to independently review each list and identify procedure codes that should serve as inclusion or exclusion criteria for a particular condition. The 4 groups reviewed a total of 14 534 procedure codes and agreed on inclusion or exclusion of 88% of codes overall. The agreement rates for individual groups were 88%, 90%, 78%, and 96%. Disagreements were resolved through consensus.
One of our key prioritization factors was variation in the volume of resources used for a particular condition. The idea to focus on variation in resource utilization, rather than variation in adherence to specific quality measures, mirrors the approach adopted by the Dartmouth Atlas of Health Care, which has shown repeatedly that high-intensity care is not always associated with better quality or outcomes and is often a product of excess local capacity (supply-sensitive care) and/or inadequate evidence or efforts to support lower-intensity care (preference-sensitive care).8 The variation identified in Dartmouth Atlas of Health Care studies is used to signal conditions or procedures for which there may be opportunities to improve the quality and/or value derived from health care spending,9 a fundamental goal of CER. Focusing on variation in resource utilization also provided us with a single common metric to study variation across all pediatric inpatient conditions and compare the magnitude of that variation across conditions.
Because we sought to use hospitalization costs as a surrogate for volume of resources expended, we needed to standardize the cost of individual items to remove the high interhospital variation in item costs. To calculate standardized costs per item, we first tabulated the line-item charges and number of billed units for every Clinical Transaction Classification (CTC) code in every hospital billing record. We then computed the cost per CTC for each line item using hospital- and department-specific ratios of cost to charges (RCCs), the CTC charge, and the number of billed CTC units. The charges listed in the PHIS database already adjust for the wage and price index (published annually in the Federal Register). All costs were then inflated to 2009 dollars using the medical care services component of the Consumer Price Index.10 Next, we calculated the median cost for every CTC code within each hospital. Finally, we defined the standardized unit cost for each CTC code as the median of hospital median unit costs. The standardized unit costs for a total of 20 903 CTC codes were tabulated in a cost master index (CMI). Using the standardized unit costs from the CMI, we recalculated the total hospitalization cost for every admission (n = 3 482 709) by multiplying the CMI cost by the number of units for each CTC appearing in the hospital bill and then summing the standardized costs of each line item in every hospital bill.
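The standardization steps can be summarized in a short sketch: convert each line item to a unit cost, take the median within each hospital, take the median of those hospital medians as the CMI unit cost, and re-price every bill with the CMI. The field names are hypothetical. The sketch also assumes that `charge` is the total charge for the line item and that the inflation adjustment to 2009 dollars has already been applied; neither detail is confirmed by the text.

```python
from collections import defaultdict
from statistics import median


def build_cost_master_index(line_items):
    """line_items: iterable of dicts with hypothetical keys
    'hospital', 'ctc', 'charge', 'units', and 'rcc' (ratio of cost to charges)."""
    unit_costs = defaultdict(list)  # (ctc, hospital) -> unit costs
    for item in line_items:
        if item["units"] <= 0:
            continue  # a unit cost cannot be derived without a positive unit count
        unit_cost = item["charge"] * item["rcc"] / item["units"]
        unit_costs[(item["ctc"], item["hospital"])].append(unit_cost)

    hospital_medians = defaultdict(list)  # ctc -> one median per hospital
    for (ctc, _hospital), costs in unit_costs.items():
        hospital_medians[ctc].append(median(costs))

    # Standardized unit cost = median of the hospital-level medians.
    return {ctc: median(meds) for ctc, meds in hospital_medians.items()}


def standardized_encounter_cost(bill, cmi):
    """bill: iterable of (ctc, units) pairs for one encounter."""
    return sum(cmi[ctc] * units for ctc, units in bill)
```

Because every hospital is priced with the same unit costs, differences in the resulting totals reflect differences in the volume and mix of items billed rather than in local prices.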
A total of 43 hospitals contributed data to the PHIS database during the 6-year study period (January 1, 2004, through December 31, 2009). Several issues affected the final data included in the analysis.
Two hospitals were excluded from the analysis because they did not submit any billing data and 1 hospital was excluded because it submitted only 1 year of billing data, leaving a total of 40 hospitals that submitted billing data during the study period: 34 that submitted billing data for the entire 6-year study period, 4 that started submitting billing data in 2005, and 2 that started in 2006. On further review of the billing data from these hospitals, we discovered 13 quarters (distributed across individual hospitals) when 10% or more of the records were missing billing data, and we excluded these from the analysis as well.
The PHIS billing data are divided into 29 charge departments, each with a hospital- and year-specific RCC that we used to convert charges into costs. Two hospitals submitted no RCC information and were excluded from the analysis, leaving us with a total of 38 hospitals and 221 hospital-years' worth of data for the calculation of standardized costs for the CMI and the analysis of interhospital variation in costs for individual conditions. We expected 6409 department-specific RCCs (29 department-specific RCCs × 221 hospital-years), but 454 (7%) were missing. We used a formal imputation strategy (chained equations) to impute these missing department-specific RCCs.11
Based on manual review, we determined that noninteger units or unit counts that were unreasonably high or low for CTC codes in some bills reflected hospital-specific coding errors or idiosyncrasies. For these line items, we compared the standardized cost (the number of units multiplied by the CMI unit cost) with a replacement cost (the reported charge multiplied by the RCC) and substituted the replacement cost whenever the standardized cost was either more than 3 times or less than one-third of it. This substitution was necessary in fewer than 5% of all the billed items.
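The substitution rule for suspect unit counts can be written as a small function. This is a sketch only: the function and parameter names are hypothetical, and keeping the standardized cost when the replacement cost is zero or negative is an assumption made to avoid a meaningless comparison.

```python
def line_item_cost(units, cmi_unit_cost, charge, rcc, ratio=3.0):
    """Return the line-item cost, falling back to charge * RCC for suspect unit counts."""
    standardized = units * cmi_unit_cost  # cost implied by the billed units
    replacement = charge * rcc            # cost implied by the reported charge
    if replacement > 0 and (
        standardized > ratio * replacement or standardized < replacement / ratio
    ):
        return replacement
    return standardized
```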
We calculated the standardized costs for each encounter of a child with one of the conditions (medical, surgical, and medical/surgical) included in our ICD-9-CM grouper. For each condition, we evaluated the overall distribution of costs per encounter. To reduce the influence of costs for children with complex chronic conditions, who often use more resources for reasons unrelated to their reason for admission, we identified and excluded extreme cost outliers (top 1% of charges within each condition). For some analyses, we adjusted the standardized costs for patient age (<30 days; ≥30 days and <1 year; ≥1 year and <5 years; ≥5 years and <13 years; ≥13 years and <18 years; ≥18 years), sex, race (white, black, other), presence of a complex chronic condition,12 all patient refined diagnosis related group severity level, and patient type (inpatient, ambulatory surgery, and observation status).
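Trimming the top 1% within a condition can be sketched as follows. The percentile method (the default of Python's `statistics.quantiles`) and the function name are assumptions; the text does not specify how the cut point was computed.

```python
from statistics import quantiles


def drop_top_one_percent(values):
    """Return the values at or below the 99th percentile for one condition."""
    if len(values) < 2:
        return list(values)  # too few observations to estimate a percentile
    p99 = quantiles(values, n=100)[98]  # 99th of the 99 percentile cut points
    return [v for v in values if v <= p99]
```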
For each condition, we calculated the mean, median, and quintiles of standardized (as well as adjusted standardized) cost per encounter. We generated bin plots to show the number and proportion of patients at a particular hospital whose standardized (and adjusted standardized) cost per encounter was within 1 of the 5 quintiles. We also generated box plots to demonstrate the within- and across-hospital variation in mean standardized cost per encounter for each condition. For a simple and easily interpretable estimation of the degree of variation in standardized cost per encounter across hospitals, for each condition we counted the number of hospitals with more than 30% of their encounters in either the highest or lowest quintile of overall standardized cost per encounter (defined as outlier hospitals). As another measure of variation, we estimated for each condition the intraclass correlation coefficient (ICC), which in the context of our analysis equals the amount of variation in outcome (costs) across hospitals as a fraction of total variation in costs, where the total variation includes within-hospital and across-hospital dispersion of costs.13 As such, it presents a natural measure: the ICC approaches 0 if variation across hospitals is small, and the ICC approaches 1 as hospitals begin to account for all variation of costs. For different clinical conditions, the ICC therefore measures across-hospital variation on a common metric. The ICCs were calculated using a mixed-effects model with hospital as a random intercept and the patient characteristics listed earlier as fixed effects. All data management and analyses were conducted using SAS version 9.2 statistical software (SAS Institute, Inc) and Stata version 11.1 statistical software (StataCorp LP).
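The outlier-hospital count can be illustrated with a minimal sketch for a single condition. It assumes quintile cut points from Python's `statistics.quantiles` with its default method and assigns encounters lying exactly on the lowest cut point to the lowest quintile; the function name and the `(hospital, cost)` input shape are hypothetical.

```python
from collections import defaultdict
from statistics import quantiles


def outlier_hospitals(encounters, threshold=0.30):
    """encounters: list of (hospital, standardized_cost) pairs for one condition.
    Returns (low_outliers, high_outliers) as two sets of hospital identifiers."""
    cuts = quantiles([cost for _, cost in encounters], n=5)  # 4 overall cut points
    low_cut, high_cut = cuts[0], cuts[-1]

    totals, low, high = defaultdict(int), defaultdict(int), defaultdict(int)
    for hospital, cost in encounters:
        totals[hospital] += 1
        if cost <= low_cut:
            low[hospital] += 1
        elif cost > high_cut:
            high[hospital] += 1

    # A hospital is an outlier if more than `threshold` of its encounters
    # fall in the lowest or the highest overall quintile.
    low_outliers = {h for h, n in totals.items() if low[h] / n > threshold}
    high_outliers = {h for h, n in totals.items() if high[h] / n > threshold}
    return low_outliers, high_outliers
```

The number of outlier hospitals for the condition is then the size of the union of the two sets.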
The Table shows the 50 most prevalent and 50 most costly conditions (total of 77 altogether) sorted by cumulative standardized cost across hospitals. Conditions ranked high in cumulative cost because they were either very prevalent (eg, pneumonia) or very expensive on a cost per encounter basis (eg, respiratory distress syndrome in the newborn). The 10 most expensive conditions accounted for 36% of all costs among the 495 conditions included in our original sample.
Of the 77 most prevalent and/or costly conditions, 26 had ICCs higher than 0.10 and 5 had ICCs higher than 0.30 after adjusting costs for patient demographic characteristics, presence of a complex chronic condition, all patient refined diagnosis related group severity level, and admission type (Table). Our outlier analysis identified many conditions for which a large number of hospitals had a high proportion of high- or low-cost hospitalizations, including 10 conditions for which more than half of the hospitals had more than 30% of encounters with costs in either the lowest or highest quintile of overall costs. Conditions that met all the prioritization criteria (prevalent, high cost, and high variation in interhospital cost per encounter, even after adjusting for patient-level factors) included hypertrophy of the tonsils and adenoids requiring tonsillectomy and/or adenoidectomy, otitis media requiring tympanostomy tube placement, and acute appendicitis without peritonitis requiring appendectomy.
Looking more closely at acute appendicitis without peritonitis as an example of a condition that meets multiple high-priority criteria, we can see that it was indeed common (n = 40 142 over 6 years; prevalence rank = 16), was costly (approximately $11.5 billion over 6 years; standardized cost rank = 24), and had a high degree of interhospital variability in standardized costs per encounter (Figure 1). The ICC for this condition was 0.19, which means that 19% of all the variation in standardized cost per encounter could be attributed solely to the hospital in which these children happened to receive care. The outlier analysis for acute appendicitis without peritonitis (Figure 2) also demonstrated a high degree of interhospital variation in standardized costs. There was a 1.7-fold difference in mean cost per encounter between the 20th and 80th percentiles, and a total of 15 hospitals met outlier criteria—7 with more than 30% of encounters in the lowest cost quintile and 8 with more than 30% of encounters in the highest cost quintile.
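For reference, the ICC reported here can be written in terms of variance components. The symbols below are notation introduced for illustration (between-hospital and within-hospital variance of cost per encounter) and do not appear in the original text.

```latex
\[
\mathrm{ICC} = \frac{\sigma^2_{\text{between}}}{\sigma^2_{\text{between}} + \sigma^2_{\text{within}}},
\qquad
\mathrm{ICC} = 0.19 \;\Longrightarrow\;
\frac{\sigma^2_{\text{between}}}{\sigma^2_{\text{within}}} = \frac{0.19}{1 - 0.19} \approx 0.23 .
\]
```

Under this reading, an ICC of 0.19 corresponds to between-hospital variance that is roughly one-quarter the size of the within-hospital variance.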
Figure 1. Within- and across-hospital standardized costs per encounter for appendicitis without peritonitis (all included admissions had a procedure code for appendectomy). Boxes indicate interquartile range (25th-75th percentiles); center hatches, median (50th percentile); diamonds, mean; whiskers, minimum and maximum values within a range defined by the 25th percentile minus 1.5 × the interquartile range and the 75th percentile plus 1.5 × the interquartile range; and circles, values outside this range.
Figure 2. Distribution of hospitals' standardized costs per encounter for appendicitis without peritonitis (all included encounters had a procedure code for appendectomy) according to overall quintiles of standardized costs. Overall quintiles are defined by standardized cost per encounter for all patients with appendicitis without peritonitis across all hospitals. Hospitals toward the top had a higher proportion of encounters with standardized costs in the lowest quintiles, while hospitals toward the bottom had a higher proportion of encounters with standardized costs in the highest quintiles. For example, hospital 7 had approximately 56% of encounters with standardized costs in the fifth quintile and approximately 20% of encounters with standardized costs in the fourth quintile.
Using detailed billing and administrative data from a consortium of 38 large freestanding children's hospitals, we generated inputs for several of the important condition-related criteria that both the IOM2 and Dubois and Graff5 included in their CER prioritization frameworks. The approach required development of a grouper for combining encounters with similar principal discharge diagnosis and procedure codes and the creation of a CMI to standardize the unit costs for more than 22 000 CTC codes in the PHIS database. We identified several surgical conditions that were prevalent and/or costly and displayed high variation in standardized cost per encounter even after adjustment for patient demographic characteristics and markers of severity of illness. Ten of our 77 prevalent and/or costly conditions appeared on the IOM's list of priority topics, including prematurity, dental caries, and asthma, but these conditions did not always display high interhospital variability in resource utilization. Although our analysis was limited to pediatric encounters, researchers can apply our methods and prioritization strategy to detailed administrative data collected in other care settings and in adult populations. Both researchers and funding agencies can use the results of our analysis to decide which pediatric conditions should be prioritized for CER.
While differences in patient characteristics such as severity of illness, comorbidities, or disease stage at presentation might account for some of the high variation in resource utilization that we observed for specific conditions, we know from previous studies done using the PHIS database (for example, on osteomyelitis,14 complicated pneumonia,15 urinary tract infections,16 and antireflux surgery17) that hospital-specific differences in approaches to clinical management are important drivers of variation in resource utilization. Decisions about length of stay, location of care, use of specific medications, surgical vs medical approaches, and early vs late intervention often have major implications for resource utilization and patient outcomes. As others have found in hospital care for adults, some of this variation will reflect overuse of health care resources without clear benefits to patients, while some will reflect underuse of resources resulting in suboptimal patient outcomes. The implications of this unwarranted variation are 2-fold. In situations where a strong evidence base exists to support one management strategy over another, the variation represents an opportunity to standardize care across hospitals through quality improvement collaboratives, similar to those espoused by the Institute for Healthcare Improvement and Intermountain Healthcare.18 However, more often (and especially in pediatrics), this variation is a symptom of major evidence gaps regarding best practices and thus signals a need for more CER.
Our study has a few important limitations. First, our data and analyses were limited to discrete hospital encounters and did not include longitudinal costs of outpatient care or rehospitalization. Some of the prioritization rankings might have been quite different if we had incorporated longitudinal costs. Second, the hospitalization costs that we report do not reflect the true costs of providing care at each of the hospitals. We standardized unit costs and used total standardized costs as a surrogate measure for summarizing and comparing resource utilization across hospitals. Third, the PHIS database includes only freestanding tertiary care children's hospitals. It is possible that variation in costs across hospitals would have been even greater if smaller, general, and community hospitals, where the intensity of care is often lower, had been included.
How can these results be used to inform prioritization of CER topics? First, our methods can be applied to other administrative data sets to generate inputs for prioritization in adult medical care and in the outpatient setting. Second, our estimates of prevalence, cost, and variability in resource utilization for pediatric hospital conditions can be used as inputs for funding agencies such as the Patient-Centered Outcomes Research Institute in their prioritization of pediatric conditions for CER. It must be recognized, of course, that other condition-related criteria need to be considered in any prioritization effort, including variation in outcomes and the level of current evidence for any specific condition. As outlined in Dubois and Graff's prioritization framework, there are also research-related criteria that must be considered, including the cost and time required to complete the research, the probability of research success, and the likelihood that the research findings would be adopted into practice. After high-priority conditions are selected, we must identify the specific CER questions related to those conditions most in need of answers.
Comparative effectiveness research aims to produce generalizable knowledge about the benefits and harms of alternative methods to prevent, diagnose, treat, and monitor a clinical condition or to improve the delivery of care. Pediatric populations have special potential to benefit from this type of research as there is a paucity of evidence to support the safety and efficacy of medications, procedures, and medical devices in children.4 The Patient-Centered Outcomes Research Institute and other organizations can use our methods and the results of our pediatric-focused analysis to generate inputs for the prioritization frameworks they will use to assemble their CER portfolios.
Correspondence: Ron Keren, MD, MPH, Division of General Pediatrics, Children's Hospital of Philadelphia, 3535 Market St, Room 1524, Philadelphia, PA 19104 (firstname.lastname@example.org).
Accepted for Publication: May 3, 2012.
Published Online: October 1, 2012. doi:10.1001/archpediatrics.2012.1266
Author Contributions: All authors had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Study concept and design: Keren, Luan, Localio, Hall, McLeod, and Srivastava. Acquisition of data: Keren, Luan, Hall, and Dai. Analysis and interpretation of data: Keren, Luan, Localio, Hall, McLeod, and Dai. Drafting of the manuscript: Keren, Localio, and Hall. Critical revision of the manuscript for important intellectual content: Keren, Luan, Localio, McLeod, Dai, and Srivastava. Statistical analysis: Keren, Luan, Localio, Hall, and Dai. Obtained funding: Keren and Srivastava. Administrative, technical, and material support: Keren, Luan, and McLeod. Study supervision: Keren, Luan, Localio, and Srivastava.
Pediatric Research in Inpatient Settings (PRIS) Network Investigators: Sanjay Mahant, MD, MSc, Patrick Conway, MD, MSCE, Christopher Landrigan, MD, MPH, Tamara Simon, MD, MSPH, Samir Shah, MD, MSCE, Karen Wilson, MD, MPH, and Joel Tieder, MD, MPH.
Financial Disclosure: Dr Keren has served as a paid member of the Data Safety and Monitoring Board for Tengion Inc and has provided consultation in medical malpractice cases. Dr Localio is an associate editor of the Annals of Internal Medicine, which receives and considers similar manuscripts regularly. Dr Srivastava has provided consultation in medical malpractice cases.
Funding/Support: This work was supported by Health Research Formula Grant 4100050891 from the Pennsylvania Department of Public Health Commonwealth Universal Research Enhancement Program and a grant from the Child Health Corporation of America.
Additional Contributions: Debra Hillman, MS, and Jaime Blank, MSHS, provided overall project management for this research effort. The Pediatric Research in Inpatient Settings (PRIS) Executive Council (Sanjay Mahant, MD, MSc, Patrick Conway, MD, MSCE, Christopher Landrigan, MD, MPH, Tamara Simon, MD, MSPH, Samir Shah, MD, MSCE, Karen Wilson, MD, MPH, and Joel Tieder, MD, MPH) reviewed procedure codes to further restrict the cohort of children within the medical and surgical conditions.
Keren R, Luan X, Localio R, Hall M, McLeod L, Dai D, Srivastava R; for the Pediatric Research in Inpatient Settings (PRIS) Network. Prioritization of Comparative Effectiveness Research Topics in Hospital Pediatrics. Arch Pediatr Adolesc Med. 2012;166(12):1155-1164. doi:10.1001/archpediatrics.2012.1266