Figure 1.
A Description of Template Matching

Each geometric shape represents a different type of patient (eg, patient with asthma admitted with different physiologic severity), and the varying sizes of each shape represent different characteristics of that patient type (eg, different ages). A template is formed from the national sample and each hospital is matched to the template using a 1:1, 2:1, or 3:1 ratio, depending on the available sample size at that hospital.

Figure 2.
Cost by Length of Stay (LOS) by Intensive Care Unit (ICU) Utilization

The Spearman correlation coefficient r between median cost (A) and trimmed LOS (B) was highly significant (r = 0.57; P < .001), the correlation between ICU utilization (C) and trimmed LOS was significant (r = 0.41; P = .01), and the correlation between median cost and ICU utilization was not significant (r = 0.03; P = .87). The straight solid lines within the scatterplots were constructed using robust linear model M-estimation.37,38,40

Figure 3.
In-Hospital Cost by Patient Risk Level

The x-axis of each graph represents the risk, estimated by predicted length of stay, for each template patient stratum. The y-axis represents the difference in cost (focal minus control) inside each matched pair. A point falling on the horizontal line at 0 represents no difference in cost between the 2 patients in the matched pair, and a point falling below the line suggests a lower cost for the focal vs control patient. The solid lines represent the locally weighted scatterplot smoothing (LOWESS) line.41 LOWESS 95% CI bands (shaded areas) for the central tendency line were produced using the bootstrap method. A box plot at the bottom of each graph denotes the 5%, 25%, 50%, 75%, and 95% values of predicted risk over all strata. Each graph illustrates an individual hospital.

Table 1.  
Comparison of Hospital Template–Matched Samples With Random Hospital Samplesa
Table 2.  
Practice Style Results Across 37 PHIS Hospitalsa
Original Investigation
September 2016

Auditing Practice Style Variation in Pediatric Inpatient Asthma Care

Author Affiliations
  • 1Center for Outcomes Research, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania
  • 2Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia
  • 3Department of Anesthesiology and Critical Care, Perelman School of Medicine, University of Pennsylvania, Philadelphia
  • 4Department of Health Care Management, Wharton School, University of Pennsylvania, Philadelphia
  • 5Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia
  • 6Department of Statistics, Wharton School, University of Pennsylvania, Philadelphia
  • 7Division of General Pediatrics, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania
  • 8Division of Emergency Medicine, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania
JAMA Pediatr. 2016;170(9):878-886. doi:10.1001/jamapediatrics.2016.0911
Abstract

Importance  Asthma is the most prevalent chronic illness among children, remaining a leading cause of pediatric hospitalizations and representing a major financial burden to many health care systems.

Objective  To implement a new auditing process examining whether differences in hospital practice style may be associated with potential resource savings or inefficiencies in treating pediatric asthma admissions.

Design, Setting, and Participants  A retrospective matched-cohort study, matched for asthma severity, compared practice patterns for patients admitted to Children’s Hospital Association hospitals contributing data to the Pediatric Hospital Information System (PHIS) database. With 3 years of PHIS data on 48 887 children, an asthma template was constructed consisting of representative children hospitalized for asthma between April 1, 2011, and March 31, 2014. The template was matched with either a 1:1, 2:1, or 3:1 ratio at each of 37 tertiary care children’s hospitals, depending on available sample size.

Exposure  Treatment at each PHIS hospital.

Main Outcomes and Measures  Cost, length of stay, and intensive care unit (ICU) utilization.

Results  After matching patients (n = 9100; mean [SD] age, 7.1 [3.6] years; 3418 [37.6%] females) to the template (n = 100, mean [SD] age, 7.2 [3.7] years; 37 [37.0%] females), there was no significant difference in observable patient characteristics at the 37 hospitals meeting the matching criteria. Despite similar characteristics of the patients, we observed large and significant variation in use of the ICUs as well as in length of stay and cost. For the same template-matched populations, comparing utilization between the 12.5th percentile (lower eighth) and 87.5th percentile (upper eighth) of hospitals, median cost varied by 87% ($3157 vs $5912 per patient; P < .001); total hospital length of stay varied by 47% (1.5 vs 2.2 days; P < .001); and ICU utilization was 254% higher (6.5% vs 23.0%; P < .001). Furthermore, the patterns of resource utilization by patient risk differed significantly across hospitals. For example, as patient risk increased one hospital displayed significantly increasing costs compared with their matched controls (comparative cost difference: lowest risk, −34.21%; highest risk, 53.27%; P < .001). In contrast, another hospital displayed significantly decreasing costs relative to their matched controls as patient risk increased (comparative cost difference: lowest risk, −10.12%; highest risk, −16.85%; P = .01).

Conclusions and Relevance  For children with asthma who had similar characteristics, we observed different hospital resource utilization; some values differed greatly, with important differences by initial patient risk. Through the template matching audit, hospitals and stakeholders can better understand where this excess variation occurs and can help to pinpoint practice styles that should be emulated or avoided.

Introduction

Asthma is the most prevalent chronic childhood illness and remains a leading cause of hospitalizations in children aged 1 to 15 years in the United States.1 Inpatient and emergency department treatment accounts for approximately one-third of all pediatric asthma-related health care costs.2 Since admissions for asthma are common and there are well-established clinical treatment pathways,3 one would expect similar best practices leading to equivalent use of resources, especially among hospitals that treat only children. However, considerable variation in resource utilization has been observed.4-6 In part, this variation may be owing to many potential choices in the management of inpatient asthma that reflect a myriad of clinical decisions, such as the use of diagnostic tests, bronchodilators, rescue inhalers, and inhaled and systemic corticosteroids, and in part it may be owing to differences in patient characteristics.7,8

We questioned whether resource utilization varies among hospitals caring for similar patients. To accomplish this, we used a new auditing methodology termed template matching9,10 to select a group of closely matched patients across hospitals. Template matching compares hospitals by selecting a reference template of patients and “stamping out copies” of template patients at each hospital through the use of multivariate matching.9-13 Determining whether resource utilization differs, even when examining similar patients, should help inform individual hospitals as to whether they may need to change their approach to care, and using a matching framework rather than regression modeling allows hospitals to identify specific groups of patients for whom care may be in need of change.

Box Section Ref ID

Key Points

  • Question How much does practice style vary in the management of asthma admissions across children’s hospitals?

  • Findings A template-matched audit identified large, significant variation across hospitals. For the same template-matched patients, comparing practice style between the lower and upper eighth of hospitals, median cost varied by 87%, length of stay by 47%, and intensive care unit utilization by 254%.

  • Meaning Through the audit, hospitals can better understand which patient types are associated with increased or decreased resource utilization compared with other hospitals matched to the template, therefore helping to identify practice styles that should be emulated or avoided.

Methods
Patient Population and Definitions

Data for this study were obtained from the Pediatric Hospital Information System (PHIS), an administrative database that contains inpatient, emergency department, ambulatory surgery, and observational data from 41 not-for-profit, tertiary care pediatric hospitals in the United States. These hospitals, all members of the Children’s Hospital Association (Overland Park, Kansas), represent some of the most technologically sophisticated facilities in the country. This study was approved by the institutional review board of The Children’s Hospital of Philadelphia; the need for informed consent was waived.

All nontransfer inpatient and observational unit nonresearch discharges for asthma occurring between April 1, 2011, and March 31, 2014, were considered if the patient was between ages 3 and 18 years. Asthma was defined using specific International Classification of Diseases, Ninth Revision, codes as reported in eTable 1 in the Supplement. Variables we matched on included age in days; sex; Medicaid status; common chronic conditions; asthma-affecting diagnoses; National Heart, Lung, and Blood Institute diagnoses of concern14; predicted length of stay (a risk score); a propensity score to be in the template; and asthma severity at admission. We included only the first asthma admission in the data set for each patient.

Constructing the Template

We constructed 500 templates, each consisting of 100 patients with asthma randomly sampled from the entire PHIS database. Of these, we selected 1 template with the smallest Mahalanobis distance15,16 to the mean of the entire database. Figure 1 describes the process of template creation and matching with the template. A description of the template is found in eTable 2 and the distance algorithm is described in the eMethods in the Supplement.
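A minimal sketch (not the authors' code) of this selection step: draw 500 candidate templates of 100 patients each, compute the Mahalanobis distance from each candidate's covariate means to the full-cohort means, and keep the closest candidate. The data frame and covariate names below are simulated placeholders, not the PHIS variables.

```r
set.seed(1)
# Placeholder cohort standing in for the PHIS asthma discharges
phis <- data.frame(age_days = rnorm(5000, 2500, 1200),
                   pred_los = rexp(5000, 1 / 2),
                   medicaid = rbinom(5000, 1, 0.5))

mu_all    <- colMeans(phis)   # covariate means of the entire database
sigma_all <- cov(phis)

# 500 candidate templates of 100 randomly sampled patients
candidates <- replicate(500, phis[sample(nrow(phis), 100), ], simplify = FALSE)
d2 <- vapply(candidates,
             function(tpl) mahalanobis(colMeans(tpl), center = mu_all, cov = sigma_all),
             numeric(1))
template <- candidates[[which.min(d2)]]   # the single template retained for matching
```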

Statistical Analysis
Matching

We desired a minimum of a 1.5:1 pool of patients at each hospital to be used to select matches to the template. This ratio would help to produce good quality matches. As shown in Figure 1, whenever possible we sought 300 patients matched to the template (using a 3:1 matching ratio). When this was not possible, we used a 2:1 ratio selecting 200 matched controls. When this was not possible, we used a 1:1 matching ratio producing a matched sample of 100 controls. All reported statistics account for these different matching ratios. The shared template of 100 patients permits hospitals of varied size to contribute as many patients as they can to the comparison of different hospitals.

We performed our matches using the R package MIPMatch.17-19 We required exact matches on asthma severity status (moderate, severe, or critical). For other variables we chose a balanced match that minimized the medical distance9,10 to template patients, defined using the Mahalanobis distance.15,16 Medical distance indicates the level of difference between 2 patients in terms of medical covariates such as age, chronic illnesses, and presentation severity (eMethods in the Supplement).9,10,13,20
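A hedged sketch of the "medical distance" piece only: a Mahalanobis distance matrix between template patients and one hospital's candidate pool, computed separately within each exact asthma-severity category. The study itself solved the assignment with the MIPMatch integer-programming package (with the balance constraints described below), which is not reproduced here; `template`, `pool`, `severity`, and `covs` are illustrative placeholders.

```r
covs <- c("age_days", "pred_los", "medicaid")   # placeholder covariate names
dist_by_severity <- lapply(c("moderate", "severe", "critical"), function(sev) {
  t_sub <- template[template$severity == sev, covs]   # template patients in this stratum
  p_sub <- pool[pool$severity == sev, covs]           # hospital pool in this stratum
  S     <- cov(rbind(t_sub, p_sub))                   # pooled covariance for the metric
  # rows = template patients, columns = hospital pool patients
  t(sapply(seq_len(nrow(t_sub)), function(i)
    mahalanobis(as.matrix(p_sub), center = unlist(t_sub[i, ]), cov = S)))
})
```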

To improve the quality of the matches between the template and the specific hospital, we used “near fine balance”21-25; this method ensured that if the template had, for example, a 15% rate of upper respiratory tract infections for its 100 cases, each hospital provided a 15% rate of upper respiratory tract infections for its matched controls whenever possible, without requiring exact matches on that diagnosis for all patients in the hospital with respect to the template patient. A mean constraint was introduced on age at admission and a propensity score for being in the template. We also added a penalty value to the Mahalanobis distance for differences in predicted length of stay.
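The predicted-LOS penalty can be illustrated with a small sketch: inflate the distance for template-pool pairs whose predicted LOS differs by more than a caliper. The caliper width and penalty size are assumptions for illustration, not the study's values, and `d_mat`, `t_sub`, and `p_sub` are placeholders for one severity stratum's distance matrix and patient subsets from the previous sketch.

```r
caliper <- 0.5     # days (hypothetical)
penalty <- 1000    # large additive penalty (hypothetical)
los_gap <- abs(outer(t_sub$pred_los, p_sub$pred_los, "-"))  # template-by-pool LOS gaps
d_mat_penalized <- d_mat + penalty * (los_gap > caliper)    # penalize pairs outside the caliper
```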

We examined the degree of similarity among the hospital-matched samples and tested the overall differences in patient characteristics and practice style variables across hospitals using the Kruskal-Wallis test, a nonparametric version of the 1-way analysis of variance test26 for each continuous variable of interest, and the Pearson χ2 test for each binary variable in a 2 × k table (with k indicating number of hospitals). Values of χ2/df below 1 indicate better balance than expected by random assignment, while values of χ2/df above 1 indicate worse balance than expected. These tests compare the balance achieved by matching with the balance that would have been expected had patients been randomly assigned to hospitals. When ranks of hospitals were tied, the mean rank is presented.
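A sketch of these balance diagnostics, with `matched` a placeholder data frame of all matched patients: a Kruskal-Wallis test for a continuous covariate across the hospital samples, and a Pearson chi-square for a binary covariate in a 2 × k table, reported as chi-square divided by its degrees of freedom (values below 1 indicate balance better than random assignment).

```r
# Continuous covariate: Kruskal-Wallis across the matched hospital samples
kruskal.test(age_days ~ hospital, data = matched)

# Binary covariate: Pearson chi-square in a 2 x k table, reported as chi-square / df
tab <- table(matched$medicaid, matched$hospital)
ct  <- chisq.test(tab)
unname(ct$statistic / ct$parameter)
```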

All matching was completed without knowledge of outcomes (in this study, the practice style variables), as suggested by Rubin.27,28 Unlike modeling, matching without knowledge of outcomes prevents researchers from selecting the most attractive of multiple analyses.

Ideally, for a fair comparison, every hospital would have treated the same 300 patients. Of course this is not possible; however, we can evaluate the fairness of each hospital comparison by determining whether the group of matched patients in each facility can be distinguished from the patients in the template or, alternatively, whether these 2 groups of patients appear to be a random split of 1 group of patients. We excluded hospitals that could not be matched closely, which is illustrated in eFigure 1 in the Supplement. We compared 16 baseline attributes by applying the Fisher exact test 16 times using the Simes29 method in an effort to control the false discovery rate in the 16 tests at a significance level of P = .05, as suggested by Benjamini and Hochberg.30
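A sketch of the match-quality check for a single hospital, under placeholder names (`attrs`, `template`, `matched_h`): Fisher exact tests on 16 binary baseline attributes, template vs matched sample, with the Simes/Benjamini-Hochberg step applied to the 16 P values at the .05 level.

```r
attrs <- c("medicaid", "upper_resp_infection")   # ...the 16 binary attribute names (placeholders)

p_vals <- sapply(attrs, function(v) {
  tab <- rbind(template = table(factor(template[[v]],  levels = 0:1)),
               hospital = table(factor(matched_h[[v]], levels = 0:1)))
  fisher.test(tab)$p.value
})
poor_match <- any(p.adjust(p_vals, method = "BH") < 0.05)   # TRUE would flag the hospital
```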

Practice Style Variables

Once hospital matches were complete and of good quality, the facilities were compared on the following primary variables: total cost, length of stay, and the percentage of patients admitted to the intensive care unit (ICU), which was determined using PHIS service line codes. For this study, we defined ICU care using the PHIS service line codes for the pediatric ICU; ICU, unspecified; pulmonary care unit; and pediatric pulmonary care unit. The unit costs for each billing code were determined using a method similar to that of Keren et al,31 with modifications as described in detail in the eMethods in the Supplement, and were year specific. Each hospital’s costs were based on its resource use. To compare resource use not influenced by local charges, each billing code (the basis of counting the resources) was assigned a dollar value by applying a uniform formula across all hospitals. Costs were adjusted to 2014 prices using Bureau of Labor Statistics Consumer Price Index values for medical care items.32
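The price standardization amounts to restating each discharge's cost in 2014 dollars using the ratio of the 2014 medical-care CPI to the CPI of the discharge year; a minimal sketch follows. The CPI values shown are made-up placeholders, not the published BLS figures.

```r
cpi <- c("2011" = 400, "2012" = 414, "2013" = 425, "2014" = 435)   # hypothetical index values
costs$cost_2014 <- costs$raw_cost * cpi["2014"] / cpi[as.character(costs$year)]
```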

Testing for Specific Advantage and Risk Synergy

We first tested whether each matched sample at a hospital differed significantly from the matched controls at the other hospitals. For continuous outcomes, we used quantile tests33,34 that determined whether each patient exceeded his or her own hospital’s median or 90th percentile value and then, in effect, used the Mantel-Haenszel statistic34,35 to test the equality of each hospital with the others in exceeding this value. To adjust for multiple testing, we determined whether the P value met the criteria for the Bonferroni correction using the cutoff of P < .05/k, with k indicating the number of tests (in this case, 37).
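A greatly simplified stand-in for the quantile/Mantel-Haenszel comparison and the Bonferroni cutoff: for each hospital, compare a binary "exceeds the chosen quantile" indicator between that hospital's matched patients and all other hospitals' matched patients, flagging hospitals at P < .05/37. The data frame `matched` and the column `exceeds_q` are placeholders; the study's exact test statistics are not reproduced here.

```r
hospitals <- unique(matched$hospital)
cutoff    <- 0.05 / length(hospitals)          # Bonferroni cutoff, here 0.05 / 37
flagged   <- sapply(hospitals, function(h) {
  tab <- table(focal = matched$hospital == h, exceeds = matched$exceeds_q)  # 2 x 2 table
  chisq.test(tab)$p.value < cutoff
})
```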

Specific advantage was defined as observing better patient outcomes in a specific or focal hospital rather than those of matched control patients from other facilities within the same matched set.13 Risk synergy describes a situation in which, as admission risk increases or decreases, the specific advantage changes in a systematic way.13,20 For example, as admission risk increases, the focal hospital’s patients may have increasingly better outcomes than matched controls from other hospitals. When studying the specific advantage and risk synergy of the focal hospital compared with the matched control population, we tested for an interaction between patient-predicted risk strata and the main effect of the focal hospital. For discrete variables, such as use of the ICU, we tested synergy using a Mantel test for trend,36 and continuous variables were tested with robust regression,37,38 evaluating the interaction between an indicator for admission to the focal hospital and a linear term for average patient risk in the matched set (while also adjusting for the hospital main effect). Each of the patients within each stratum was assigned a risk defined by the mean predicted length of stay in the stratum as defined by the external predicted length of stay model used for matching (eTable 3 in the Supplement). If the Mantel test found a significant trend in a discrete variable or robust regression identified a significant interaction between escalating patient-predicted risk and the hospital’s performance on a continuous variable compared with matched controls despite controlling for the hospital’s main effect, the hospital was considered to demonstrate risk synergy for that variable.
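A sketch of the risk-synergy tests under assumed variable names: for a continuous outcome, a Huber robust regression (MASS::rlm) of cost on a focal-hospital indicator, the matched-set risk (mean predicted LOS), and their interaction, where a nonzero interaction coefficient is the risk-synergy signal; for a binary outcome such as ICU use, a simple trend test across ordered risk strata stands in for the Mantel trend test.

```r
library(MASS)

# Continuous outcome: focal main effect, risk main effect, and focal:risk interaction
fit <- rlm(cost ~ focal * risk, data = matched_sets)   # matched_sets, focal, risk: placeholders
summary(fit)                                           # inspect the focal:risk coefficient

# Binary outcome: ICU events and patient totals per ordered risk stratum at the focal hospital
prop.trend.test(x = icu_events_by_stratum, n = patients_by_stratum)
```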

Results

Of 41 hospitals in the PHIS, 40 institutions were available with complete patient records for the full study period, with 859 997 patients in the PHIS data set meeting reporting requirements and therefore available for analysis; of these, there were 64 466 patients (7.5%) admitted with an asthma diagnosis. After excluding patients transferred from other hospitals, there were 48 903 patients. Further exclusions for illogical departmental billing costs yielded a final sample of 48 887 patients eligible for study.

Quality of the Matches

All 40 hospitals met the minimum volume requirement for matching (>150 patients), and exact matching on asthma severity category was possible for 37 facilities, all of which also passed the match quality criteria. There were 24 hospitals with a 3:1 match, 6 hospitals with a 2:1 match, and 7 hospitals with a 1:1 match. Table 1 describes the 37 hospitals in the data set that were successfully matched to the template and available for analysis. After matching (n = 9100 patients; mean [SD] age, 7.1 [3.6] years; 3418 [37.6%] females) to the template (n = 100 patients; mean [SD] age, 7.2 [3.7] years; 37 [37.0%] females), there was no significant difference in observable patient characteristics at the 37 hospitals meeting the matching criteria.

Patients differed across hospitals before matching, as shown by the significant variation in patient characteristics observed in the random samples, but were similar after matching on the variables controlled by the match.

Table 1 also lists significant differences in practice style variables across hospitals. For the template-matched population, comparing utilization, median cost was 87% higher ($3157 vs $5912 per patient; P < .001) between the lower eighth39 (12.5th percentile) and upper eighth (87.5th percentile) of hospitals. Total hospital length of stay was 47% higher (1.5 vs 2.2 days; P < .001) between the 12.5th and 87.5th percentiles, and ICU utilization was 254% higher (6.5% vs 23.0%; P < .001).

Variation in Practice Style Across Hospitals

Table 2 presents the variation in practice style across hospitals, ranking on median and 90th percentile cost, 90th percentile length of stay, and ICU utilization rate (eTables 4-6 in the Supplement include details). There was minimal variation in each hospital’s median length of stay (all were either 1 or 2 days). However, there was significant variation in each hospital’s 90th percentile for length of stay and ICU utilization rate. Although some hospitals, such as hospital JJ, performed relatively poorly across all practice style variables and some did consistently well, such as hospitals C and N, others displayed inconsistent results. Hospital CC was expensive with very long lengths of stay but was below average with respect to the percentage of patients using the ICU. Similarly, the most expensive hospital (KK) also was below average in ICU utilization.

Across all 37 hospitals, there was a significant correlation between median cost and hospital length of stay (Spearman correlation coefficient, 0.57; P < .001), with a similar partial correlation between median cost and length of stay controlling for ICU utilization (controlled r = 0.62; P < .001). The Spearman correlation coefficient between ICU utilization and median cost was nonsignificant (r = 0.03; P = .87), and the partial correlation controlling for hospital length of stay was also nonsignificant (controlled r = −0.28; P = .09). In other words, median cost was more related to hospital length of stay than to ICU utilization. The correlation between ICU utilization and trimmed length of stay was significant (r = 0.41; P = .01). Figure 2 illustrates the associations between all 3 practice style variables as well as the distribution of each practice style variable across all hospitals (eFigures 2-6 in the Supplement).
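A sketch of these hospital-level correlations, with `hosp` a placeholder data frame of one row per hospital (median cost, trimmed LOS, ICU utilization rate); the partial correlation is computed from the pairwise Spearman coefficients.

```r
cor.test(hosp$median_cost, hosp$trim_los, method = "spearman")
cor.test(hosp$median_cost, hosp$icu_rate, method = "spearman")

r_cl <- cor(hosp$median_cost, hosp$trim_los, method = "spearman")
r_ci <- cor(hosp$median_cost, hosp$icu_rate, method = "spearman")
r_li <- cor(hosp$trim_los,    hosp$icu_rate, method = "spearman")
# First-order partial correlation of cost with LOS, controlling for ICU utilization
(r_cl - r_ci * r_li) / sqrt((1 - r_ci^2) * (1 - r_li^2))
```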

Observing Risk Synergy

Figure 3 illustrates how hospitals differ in their ability to care for patients of varied levels of risk. We defined risk in this example as predicted length of stay from a model fit to PHIS patients not included in this study (eTable 3 in the Supplement). The longer the predicted stay, the greater the risk. Each patient received a predicted length of stay based on the model that we developed. We used the model to order the template patients (eTable 7 in the Supplement provides results for all 37 hospitals).

Cost results for the smoothed plots of 4 hospitals using locally weighted scatterplot smoothing41 are displayed in Figure 3. Hospital C had lower cost and no risk synergy—it uniformly showed a specific advantage across all levels of risk. Hospital CC had a specific disadvantage, with uniformly higher costs across all levels of risk and no risk synergy. Hospital N displayed typical costs for the patients at the lowest risk, but compared with matched controls, costs declined significantly as risk increased (comparative cost difference: lowest risk, −10.12%; highest risk, −16.85%; P = .01). Finally, compared with matched controls, hospital L showed relatively low costs for low-risk patients but high costs for high-risk patients (comparative cost difference: lowest risk, −34.21%; highest risk, 53.27%; P < .001).
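One panel of Figure 3 can be sketched as follows, under assumed variable names: within-pair cost differences (focal minus control) plotted against matched-set risk, with a LOWESS line and a simple bootstrap band; `pairs_h` is a placeholder data frame for one hospital's matched pairs.

```r
sm <- lowess(pairs_h$risk, pairs_h$cost_diff)
plot(pairs_h$risk, pairs_h$cost_diff, xlab = "Predicted LOS (risk)", ylab = "Cost difference")
lines(sm, lwd = 2)
abline(h = 0, lty = 2)   # no cost difference between the 2 patients in a pair

# Bootstrap band for the LOWESS central tendency line
grid <- seq(min(pairs_h$risk), max(pairs_h$risk), length.out = 100)
boot <- replicate(500, {
  b <- pairs_h[sample(nrow(pairs_h), replace = TRUE), ]
  approx(lowess(b$risk, b$cost_diff), xout = grid)$y
})
lines(grid, apply(boot, 1, quantile, probs = 0.025, na.rm = TRUE), lty = 3)
lines(grid, apply(boot, 1, quantile, probs = 0.975, na.rm = TRUE), lty = 3)
```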

Discussion

Despite asthma being a disease with established protocols for treatment3 and our focus on a group of children’s hospitals, we found substantial variation among PHIS hospitals in ICU utilization, length of stay, and treatment costs for similar template-matched patients with asthma. Cost was associated more with length of stay than with ICU utilization. The weaker association with ICU utilization may reflect care delivered on the pediatric floor that substitutes for ICU care, or variation in how ICU care is defined.

Template matching has not been used previously to control for asthma patient characteristics in this manner; thus, the results of this study are new. Moreover, we found that practice style variables among hospitals often differ systematically by patient risk. Some hospitals appear to consistently spend less than their matched controls on the strata comprising patients with less severe asthma but spend more than controls within more difficult patient strata (eg, hospital L). Furthermore, hospital L displayed overall costs that were no different from those of control facilities, so such a hospital might incorrectly stop auditing there and assume it is doing an adequate job. However, hospital L performed very differently from the controls depending on the patient level of risk, highlighting the need to examine facilities across different levels of patient risk when studying variation in practice style and allowing for identification of the types of patients for whom a hospital’s care can be improved. What could hospital L do next? Because template matching is based on patients rather than regression coefficients, hospital L’s quality improvement officer can identify and closely examine patients whose care seems to be worse than that of matched controls. In attempting to understand the hospital’s consumption of resources, the officer could investigate, at the patient level, actual practice differences between their institution and the 36 other hospitals matched to the template. Did their hospital utilize the ICU as often as the matched controls did? Were there great differences in the choice of medications prescribed for similar patients? This deeper examination of the data depends on which data are collected and what pattern of risk synergy and specific advantage is detected.

One limitation of template matching rests in sample size requirements needed to produce good matches. Beyond expanding the time frame of the template analysis or the types of patients in the template, some hospitals may still be too small to be able to match the template or see only patients who are different from the template. For these hospitals, we suggest 2 alternative methods: hospital-specific template matching10 and indirect standardization matching.13 In these approaches, all or a sample of a hospital’s own patients compose an initial “boutique” template that is used to match patients from other hospitals to benchmark the specific institution’s results.

The template can be created to present different questions, as desired by policymakers or stakeholders. Templates can be representative, as was developed in this analysis, or targeted to specific types of patients who may be of interest. Developing the template provides unique opportunities for auditors to concentrate on challenging groups of patients.

Conclusions

Template matching allows hospitals to audit their utilization patterns in a new manner that uses closely matched patients from other hospitals. We observed large variation among children’s hospitals in resource utilization costs, length of stay, and ICU utilization relative to matched controls. The reasons for these differences varied across hospitals. Cost differences were better explained by length of stay than a hospital’s relative use of the ICU. Finally, reporting only whether a hospital is more or less costly in aggregate may overlook important comparative differences in resource use across patients’ predicted risk and therefore may miss opportunities to improve practice.

Back to top
Article Information

Accepted for Publication: March 30, 2016.

Corresponding Author: Jeffrey H. Silber, MD, PhD, Center for Outcomes Research, The Children’s Hospital of Philadelphia, 3535 Market St, Ste 1029, Philadelphia, PA 19104 (silber@email.chop.edu).

Published Online: July 11, 2016. doi:10.1001/jamapediatrics.2016.0911.

Author Contributions: Dr Silber had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study concept and design: Silber, Rosenbaum, Wang.

Acquisition, analysis, or interpretation of data: All authors.

Drafting of the manuscript: Silber, Rosenbaum, Wang, Calhoun Zeigler.

Critical revision of the manuscript for important intellectual content: All authors.

Statistical analysis: Silber, Rosenbaum, Wang, Ludwig, Calhoun, Guevara.

Obtained funding: Silber, Even-Shoshan.

Administrative, technical, or material support: Silber, Calhoun, Guevara, Zorc, Zeigler, Even-Shoshan.

Study supervision: Silber, Rosenbaum.

Conflict of Interest Disclosures: None reported.

Funding/Support: This research was funded by grant U18-HS020508 from the Agency for Healthcare Research and Quality (AHRQ).

Role of the Funder/Sponsor: The AHRQ had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication.

Disclaimer: The findings and conclusions of this report are those of the authors and do not necessarily represent the official position of AHRQ.

Additional Contributions: Data for this project were supplied by the Children’s Hospital Association Pediatric Hospital Information System (PHIS). The PHIS hospitals are some of the largest and most advanced children’s hospitals in the United States and maintain some of the most demanding standards of pediatric service. Traci Frank, AA (Center for Outcomes Research, The Children’s Hospital of Philadelphia), assisted with this research. There was no financial compensation.

References
1.
Friedman  B, Berdahl  T, Simpson  LA,  et al.  Annual report on health care for children and youth in the United States: focus on trends in hospital use and quality.  Acad Pediatr. 2011;11(4):263-279.PubMedGoogle ScholarCrossref
2.
Kamble  S, Bharmal  M.  Incremental direct expenditure of treating asthma in the United States.  J Asthma. 2009;46(1):73-80.PubMedGoogle ScholarCrossref
3.
Chen  KH, Chen  CC, Liu  HE, Tzeng  PC, Glasziou  PP.  Effectiveness of paediatric asthma clinical pathways: a narrative systematic review.  J Asthma. 2014;51(5):480-492.PubMedGoogle ScholarCrossref
4.
Shanley  LA, Lin  H, Flores  G.  Factors associated with length of stay for pediatric asthma hospitalizations.  J Asthma. 2015;52(5):471-477.PubMedGoogle ScholarCrossref
5.
Bratton  SL, Newth  CJ, Zuppa  AF,  et al; Eunice Kennedy Shriver National Institute of Child Health and Human Development Collaborative Pediatric Critical Care Research Network.  Critical care for pediatric asthma: wide care variability and challenges for study.  Pediatr Crit Care Med. 2012;13(4):407-414.PubMedGoogle ScholarCrossref
6.
Parikh  K, Hall  M, Mittal  V,  et al.  Establishing benchmarks for the hospitalized care of children with asthma, bronchiolitis, and pneumonia.  Pediatrics. 2014;134(3):555-562.PubMedGoogle ScholarCrossref
7.
Morse  RB, Hall  M, Fieldston  ES,  et al.  Hospital-level compliance with asthma care quality measures at children’s hospitals and subsequent asthma-related outcomes.  JAMA. 2011;306(13):1454-1460.PubMedGoogle ScholarCrossref
8.
Chamberlain  JM, Teach  SJ, Hayes  KL, Badolato  G, Goyal  MK.  Practice pattern variation in the care of children with acute asthma.  Acad Emerg Med. 2016;23(2):166-170.PubMedGoogle ScholarCrossref
9.
Silber  JH, Rosenbaum  PR, Ross  RN,  et al.  Template matching for auditing hospital cost and quality.  Health Serv Res. 2014;49(5):1446-1474.PubMedGoogle ScholarCrossref
10.
Silber  JH, Rosenbaum  PR, Ross  RN,  et al.  A hospital-specific template for benchmarking its cost and quality.  Health Serv Res. 2014;49(5):1475-1497.PubMedGoogle ScholarCrossref
11.
Silber  JH, Rosenbaum  PR, Trudeau  ME,  et al.  Multivariate matching and bias reduction in the surgical outcomes study.  Med Care. 2001;39(10):1048-1064.PubMedGoogle ScholarCrossref
12.
Rosenbaum  PR.  Part II: matching. In:  Design of Observational Studies. New York, NY: Springer; 2010:153-253.
13.
Silber  JH, Rosenbaum  PR, Ross  RN,  et al.  Indirect standardization matching: assessing specific advantage and risk synergy [published online February 29, 2016].  Health Serv Res.PubMedGoogle Scholar
14.
National Heart, Lung, and Blood Institute, National Asthma Education and Prevention Program. Expert Panel Report 3: guidelines for the diagnosis and management of asthma: full report, 2007. NIH Publication No. 07-4051. Bethesda, MD: US Dept of Health and Human Services. http://www.nhlbi.nih.gov/files/docs/guidelines/asthgdln.pdf. Published August 28, 2007. Accessed July 22, 2015.
15.
Rosenbaum  PR. Basic tools of multivariate matching, section 8.3: distance matrices. In:  Design of Observational Studies. New York, NY: Springer; 2010:168-172.
16.
Rubin  DB.  Bias reduction using Mahalanobis metric matching.  Biometrics. 1980;36(2):293-298.Google ScholarCrossref
17.
R Development Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. http://www.R-project.org. Accessed April 15, 2014.
18.
Zubizarreta  JR.  Using mixed integer programming for matching in an observational study of kidney failure after surgery.  J Am Stat Assoc. 2012;107(500):1360-1371.Google ScholarCrossref
19.
Zubizarreta  JR, Cerdá  M, Rosenbaum  PR.  Effect of the 2010 Chilean earthquake on posttraumatic stress: reducing sensitivity to unmeasured bias through study design.  Epidemiology. 2013;24(1):79-87.PubMedGoogle ScholarCrossref
20.
Silber  JH, Rosenbaum  PR, McHugh  MD,  et al.  Comparing the value of better nursing work environments across different levels of patient risk [published online January 20, 2016].  JAMA Surg.PubMedGoogle Scholar
21.
Rosenbaum  PR.  Fine balance. In:  Design of Observational Studies. New York, NY: Springer; 2010:197-206.
22.
Rosenbaum  PR, Ross  RN, Silber  JH.  Minimum distance matched sampling with fine balance in an observational study of treatment for ovarian cancer.  J Am Stat Assoc. 2007;102(477):75-83.Google ScholarCrossref
23.
Silber  JH, Rosenbaum  PR, Polsky  D,  et al.  Does ovarian cancer treatment and survival differ by the specialty providing chemotherapy?  J Clin Oncol. 2007;25(10):1169-1175.PubMedGoogle ScholarCrossref
24.
Silber  JH, Rosenbaum  PR, Kelz  RR,  et al.  Medical and financial risks associated with surgery in the elderly obese.  Ann Surg. 2012;256(1):79-86.PubMedGoogle ScholarCrossref
25.
Yang  D, Small  DS, Silber  JH, Rosenbaum  PR.  Optimal matching with minimal deviation from fine balance in a study of obesity and surgical outcomes.  Biometrics. 2012;68(2):628-636.PubMedGoogle ScholarCrossref
26.
Kruskal  W, Wallis  WA.  Use of ranks in one-criterion variance analysis.  J Am Stat Assoc. 1952;47(260):583-621.Google ScholarCrossref
27.
Rubin  DB.  The design versus the analysis of observational studies for causal effects: parallels with the design of randomized trials.  Stat Med. 2007;26(1):20-36.PubMedGoogle ScholarCrossref
28.
Rubin  DB.  For objective causal inference, design trumps analysis.  Ann Appl Stat. 2008;2(3):808-840.Google ScholarCrossref
29.
Simes  RJ.  An improved Bonferroni procedure for multiple tests of significance.  Biometrika. 1986;73(3):751-754.Google ScholarCrossref
30.
Benjamini  Y, Hochberg  Y.  Controlling the false discovery rate: A practical and powerful approach to multiple testing.  J R Stat Soc B. 1995;57(1):289-300.Google Scholar
31.
Keren  R, Luan  X, Localio  R,  et al; Pediatric Research in Inpatient Settings (PRIS) Network.  Prioritization of comparative effectiveness research topics in hospital pediatrics.  Arch Pediatr Adolesc Med. 2012;166(12):1155-1164.PubMedGoogle ScholarCrossref
32.
Bureau of Labor Statistics. Consumer Price Index. http://www.bls.gov/cpi/home.htm. Accessed March 10, 2014.
33.
Rosenbaum  PR.  Reduced sensitivity to hidden bias at upper quantiles in observational studies with dilated treatment effects.  Biometrics. 1999;55(2):560-564.PubMedGoogle ScholarCrossref
34.
Rosenbaum  PR. Models for treatment effects, section 5.3: dilated effects. In:  Observational Studies. 2nd ed. New York, NY: Springer-Verlag; 2002:173-179.
35.
Bishop  YMM, Fienberg  SE, Holland  PW. Analysis of Square Tables: Symmetry and Marginal Homogeneity. In:  Discrete Multivariate Analysis: Theory and Practice. Cambridge, MA: MIT Press; 1975:281-286.
36.
Mantel  N.  Chi-square tests with one degree of freedom: extensions of the Mantel-Haenszel procedure.  J Am Stat Assoc. 1963;58(303):690-700.Google Scholar
37.
Hampel  FR, Ronchett  EM, Rousseeuw  PJ,  et al. Linear models: robust estimation. In:  Robust Statistics: The Approach Based on Influence Functions. New York, NY: John Wiley & Sons; 1986:307-341.
38.
Huber  PJ.  Robust Statistics. Hoboken, NJ: John Wiley & Sons; 1981.
39.
Tukey  JW. Schematic summaries (pictures and numbers): eighths, sixteenths, etc. In:  Exploratory Data Analysis. Reading, MA: Addison-Wesley Publishing Co Inc; 1977:53-54.
40.
Ripley  B, Bates  DM, Hornik  K,  et al. Package “MASS.” https://cran.r-project.org/web/packages/MASS/MASS.pdf. Updated April 22, 2016. Accessed March 17, 2016.
41.
Cleveland  WS.  Robust locally weighted regression and smoothing scatterplots.  J Am Stat Assoc. 1979;74(368):829-836.Google ScholarCrossref