Objective
To compare outcomes of patients undergoing bariatric procedures in hospitals designated as centers of excellence with those of patients in nondesignated hospitals.
Design
The 2005 Nationwide Inpatient Sample was used to compare outcomes at designated vs nondesignated hospitals. In addition to conventional null-hypothesis statistical testing to assess differences, effect sizes were calculated to estimate the clinical significance of observed differences.
Results
Centers of excellence performed substantially more operations than nondesignated centers. Despite this, outcomes were equivalent at centers of excellence and hospitals without this designation. Volume-outcome modeling attempting to identify the optimal number for a minimum volume threshold for bariatric operations revealed that annual procedure volume has a weak effect on outcomes. Similarly, many variables that were statistically significantly different between centers and noncenters proved to be clinically unimportant by effect size analysis. Risk adjustment was effectively achieved by using the Agency for Healthcare Research and Quality–supplied variables all-payer severity-adjusted diagnostic related group expected charges and deaths.
Conclusions
Designation as a bariatric surgery center of excellence does not ensure better outcomes. Neither does high annual procedure volume. Extra expenses associated with center of excellence designation may not be warranted.
Bariatric surgery centers of excellence (COE) were promoted by the Centers for Medicare & Medicaid Services (CMS) in 2006. A new Bariatric Surgery National Coverage Determination included the requirement that CMS-insured patients undergo bariatric procedures in a COE designated by either the American College of Surgeons or the American Society for Metabolic and Bariatric Surgery.^{1} This requirement resulted from concern about the safety of bariatric procedures, especially for high-risk patients such as those typically covered by the CMS.
The CMS did not specify what would constitute a COE. The American College of Surgeons and the American Society for Metabolic and Bariatric Surgery established criteria for COE status, with each organization issuing nearly identical guidelines.^{2-4} Meeting both sets of criteria requires a certain amount of program support and coordination, and both require entry of outcomes data into proprietary databases and the performance of at least 125 operations per year. These criteria make intuitive sense but lack an evidence base for their application.^{5} Program characteristics thought to be related to optimal outcomes for bariatric surgery were delineated in a 2004 Betsy Lehman Center report on patient safety.^{6} Because there have been few studies in this area, this report primarily relied on expert opinion as evidence for recommendations. The relationship between the recommended structural elements for a bariatric surgery program and its outcomes is not known. I hypothesize that bariatric surgery programs fulfilling the requirements necessary to become a designated COE will have better outcomes than nondesignated centers.
Database acquisition and case identification
Data from the 2005 Nationwide Inpatient Sample (NIS) were obtained from the Agency for Healthcare Research and Quality.^{7} Most programs with COE status obtained it in 2006 based on their performance for the prior year. The NIS is a population-representative sampling of hospital discharge data that includes 20% of all the hospitalizations in the United States in any given year. In contrast to the other large hospital discharge database (the National Hospital Discharge Survey), which samples a fraction of discharges from any given hospital, the NIS obtains information about all discharges from a select number of facilities in the United States. This method has the advantage of including the full spectrum of activity from hospitals in various regions. Although the NIS is statistically corrected to be population representative, data are only collected from hospitals in 29 states, and ethnicity data are often incomplete. Therefore, there are regions and populations that may not be well represented in the NIS. The major advantage of the NIS is the completeness of information from individual hospitals, which facilitates volume-outcome analysis. The NIS also includes discharge information from a variety of hospital types, including small community facilities and large academic teaching hospitals.
To best reflect the primary reason a patient was hospitalized, patients are assigned to a diagnosis related group (DRG) at the time of discharge based on diagnostic and procedure codes, as well as information from the patient's medical record. Bariatric procedures were identified as discharges encoded with DRG 288. The code DRG 288 (operating room procedures for obesity) is only used when the primary reason for hospitalization is to undergo procedures related to morbid obesity. Therefore, any patient encoded with DRG 288 was admitted with the intent to perform an operation for a problem related to a patient's morbid obesity. Procedures associated with plastic surgery, ie, those with ICD-9-CM procedure codes ranging from 85.XX to 86.XX, were excluded.^{8} Similar to a prior approach for case identification,^{9} operations identified with ICD-9-CM codes from 43.80 to 44.98 and DRG 288 were defined as bariatric surgical procedures.
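The case-identification rules above can be sketched as a simple filter. This is a minimal illustration, not the author's SAS code: the record layout and the field names `drg` and `procedures` are hypothetical, the sample records are invented, and the plastic surgery rule is read here as excluding the whole discharge, which is an assumption.

```python
# Hypothetical record layout: "drg" and "procedures" are illustrative field
# names, not actual NIS variable names.

def is_plastic_surgery(code: str) -> bool:
    """True for ICD-9-CM procedure codes in the 85.XX to 86.XX range."""
    return code.split(".")[0] in {"85", "86"}


def is_bariatric_code(code: str) -> bool:
    """True for ICD-9-CM procedure codes from 43.80 through 44.98."""
    return 43.80 <= float(code) <= 44.98


def is_bariatric_case(record: dict) -> bool:
    """Apply the stated rules: DRG 288, a qualifying code, no plastic surgery code."""
    codes = record["procedures"]
    if record["drg"] != 288:
        return False
    # Assumption: a plastic surgery code excludes the entire discharge.
    if any(is_plastic_surgery(c) for c in codes):
        return False
    return any(is_bariatric_code(c) for c in codes)


# Invented example discharges, for illustration only.
records = [
    {"drg": 288, "procedures": ["44.31"]},           # kept
    {"drg": 288, "procedures": ["44.31", "86.83"]},  # dropped: plastic surgery code
    {"drg": 288, "procedures": ["45.91"]},           # dropped: code outside the range
    {"drg": 494, "procedures": ["44.31"]},           # dropped: different DRG
]
cases = [r for r in records if is_bariatric_case(r)]
```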
Poverty was defined as living in a zip code region with a median annual income of less than $35 000, which since 2003 has been the Agency for Healthcare Research and Quality income threshold for defining poverty. Elderly was defined as being older than 65 years.
The bariatric procedure case volume for each facility was counted for the year 2005. Deaths were identified from the NIS variable DIED (ie, died while in the hospital). Complications were defined by NIS clinical classification software diagnostic variable 238, “complications of surgical procedures or medical care.”^{10}
Risk adjustment was achieved by use of the all-payer severity-adjusted DRGs (APS-DRGs) normalized charge, length of stay, and mortality weights file version 22, as supplied by the NIS.^{11,12} An APS-DRG combines information about a patient's principal diagnosis and age, as well as secondary diagnoses and procedures to estimate the magnitude of a patient's illness and risk of complications and death. Hospital charges are used as a proxy for a patient's risk for adverse outcomes because charges will be somewhat proportional to the totality of care delivered to an individual patient. The NIS provides a weighting scheme for observed hospital charges that combines the APS-DRG category, which predicts the magnitude of a patient's disease burden, with the observed charge that has been corrected for variability of charges between hospitals for similar conditions and the wage index for each hospital in the database. The resultant index predicts a patient's anticipated resource utilization given his or her overall state of health and the procedures performed.
Expected mortality was calculated by multiplying the APS-DRG variable APSDRG_Mortality_Weight by 0.02239 for each record. Expected hospital charges were calculated by multiplying the APS-DRG variable APSDRG_Charge_Weight by $16 995.17. Because expected charges include the cost of care that would be incurred were treatment complications to develop, expected charges were used to risk-adjust bariatric procedure complication calculations.
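The two multiplications described above are simple enough to write out directly. In this sketch the constants are the ones quoted in the text; the function names and the example weights are hypothetical.

```python
BASE_MORTALITY = 0.02239   # multiplier for APSDRG_Mortality_Weight, from the text
BASE_CHARGE = 16995.17     # multiplier for APSDRG_Charge_Weight in US dollars, from the text


def expected_mortality(mortality_weight: float) -> float:
    """Expected probability of in-hospital death for one record."""
    return mortality_weight * BASE_MORTALITY


def expected_charge(charge_weight: float) -> float:
    """Expected hospital charge (US dollars) for one record."""
    return charge_weight * BASE_CHARGE


# Hypothetical weights for a single discharge record.
example_mortality = expected_mortality(0.05)
example_charge = expected_charge(1.4)
```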
The COEs were as identified on the CMS Web site (http://www.cms.hhs.gov/MedicareApprovedFacilitie/BSF/list.asp). Hospital names were hand-matched to the list of hospitals included in the 2005 NIS. Any NIS records that did not have an associated hospital name were excluded from the analysis.
Database management and statistical analysis were performed using SAS statistical software, version 9.1 (SAS Inc, Cary, North Carolina). Proportions were tested by χ^{2} analysis, and the statistical significance for the difference between mean values was tested with t tests. When variances were unequal, differences between mean values were analyzed using t tests for unequal variance.
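The text does not name the unequal-variance t test it used. A common choice is the Welch form with Satterthwaite degrees of freedom, and the sketch below assumes that variant. It computes only the test statistic and degrees of freedom, and the function name is hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch t statistic and Satterthwaite degrees of freedom for two samples."""
    na, nb = len(a), len(b)
    # Squared standard error contributed by each group.
    va = variance(a) / na
    vb = variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Invented samples with clearly different spreads.
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

The degrees of freedom are generally not an integer, which is expected for this approximation.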
Effect sizes for comparing mean values were calculated by dividing the difference between means by the standard deviation^{13}:
$$d = \frac{M_1 - M_2}{\sigma_{pooled}}$$
where M_1 and M_2 are the group means and the pooled standard deviation of the 2 groups is
$$\sigma_{pooled} = \sqrt{\frac{\sigma_1^2 + \sigma_2^2}{2}}$$
If d is less than 0.20, then the difference is considered small. If d ranges from 0.20 to 0.50, then the difference is considered medium. If d is more than 0.50, then the difference is considered large. Effect sizes for proportions were calculated by:
$$h = \left| \phi_1 - \phi_2 \right|$$
where h is the effect size and ϕ is the arcsine transform of the proportion being evaluated. The arcsine transform is calculated by
$$\phi = 2 \arcsin \sqrt{p}$$
where p is the proportion. Effect sizes of 0.2 and smaller are defined as small, of 0.5 as medium, and of 0.8 or more as large.^{14}
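The two effect-size measures described above can be sketched in a few lines. Two parts are assumptions rather than statements from the text: the simple average of the two variances as the pooled standard deviation, and the factor of 2 in the arcsine transform. Both follow the usual convention of the cited power primer. The function names are hypothetical.

```python
import math


def cohens_d(mean1, mean2, sd1, sd2):
    """Difference between means divided by a pooled standard deviation."""
    # Assumption: pooled SD is the root of the average of the two variances.
    pooled = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (mean1 - mean2) / pooled


def arcsine_transform(p):
    """Arcsine transform of a proportion p in [0, 1]."""
    return 2 * math.asin(math.sqrt(p))


def cohens_h(p1, p2):
    """Effect size for the difference between two proportions."""
    return abs(arcsine_transform(p1) - arcsine_transform(p2))
```

For example, `cohens_h(0.064, 0.060)` compares two complication rates that differ by 0.4 percentage points; the proportions in that call are invented for illustration.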
Logistic regression modeling was used to assess the effect of COE status and annual hospital procedure volume on bariatric procedure morbidity and mortality. In-hospital death was the dependent variable, and COE status, annual procedure volume, age, and male sex were also entered into the regression equations. Age and male sex were included because these are well-recognized risk factors for poor outcomes from bariatric procedures. The expected death rate was also entered into the regression equation to serve as a risk adjuster. Morbidity was modeled by entry of the presence or absence of complications as the dependent variable. Risk adjustment was achieved by entering the expected probability for complications. The probability for complications was estimated by multiplying the grand mean complication rate times the APS-DRG expected charges for an individual patient divided by the median charge value for the entire cohort. All continuous variables were normalized to a mean of 0 and SD of 1 to facilitate comparison of odds ratios (ORs) between variables.
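Normalizing a continuous variable to a mean of 0 and an SD of 1 means each odds ratio is expressed per 1-SD change, which is what makes ORs comparable across variables measured in different units. A minimal sketch follows; the function name and the ages are hypothetical, and the use of the sample SD is an assumption.

```python
from statistics import mean, stdev


def standardize(values):
    """Rescale values to mean 0 and sample standard deviation 1."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


# Invented ages, for illustration only.
z_ages = standardize([30, 40, 50])
```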
Further exploration of the relationships between deaths and complications and the variables found by logistic regression to be significantly related to them was performed with generalized additive model regression (GAM) techniques (proc GAM; SAS statistical software, version 9.1). I used GAM with spline fitting as an exploratory data analysis technique to establish the complexity of the independent-dependent variable relationship. Logistic regression using SAS proc GENMOD was then applied to the data, incorporating the refined parameter information into the new regression equation.
Expected complication rates were derived from the APS-DRG normalized charges for each patient. The expected probability of a complication for a patient was calculated by multiplying the overall observed complication rate for the entire cohort (6.4%) by the expected charges for that patient divided by the median expected charge for the entire cohort ($23 917). The expected number of complications for a given hospital was calculated by summing the expected complication probabilities for each patient undergoing a bariatric procedure at that facility. The observed number of complications for a hospital was the sum of the observed complications for each patient, with the patient designated as either having a complication or not (not the total number of complications observed for each patient). Ratios of observed-to-expected (O/E) morbidity were then computed for each hospital. A 90% confidence interval (CI) was computed for each O/E ratio by using an exact method to compute the CI for a binomial proportion.^{15-17} Hospitals whose O/E ratio CIs did not encompass 1.0 were considered outliers.
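The O/E calculation described above can be sketched as follows. The constants come from the text. The exact interval is computed here as a Clopper-Pearson interval by bisection, and the step that rescales the binomial limits into limits for the O/E ratio is an assumption, because the text does not spell it out. All function names are hypothetical, and the brute-force binomial sum is only suitable for modest hospital volumes.

```python
import math

OVERALL_RATE = 0.064      # cohort complication rate, from the text
MEDIAN_CHARGE = 23917.0   # cohort median expected charge in US dollars, from the text


def expected_probability(expected_charge):
    """Expected complication probability for one patient."""
    return OVERALL_RATE * expected_charge / MEDIAN_CHARGE


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), by direct summation."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve(f, target):
    """Bisection on [0, 1] for a function f that decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k, n, level=0.90):
    """Clopper-Pearson interval for a binomial proportion k/n."""
    alpha = 1 - level
    lower = 0.0 if k == 0 else _solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else _solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


def oe_ratio(observed_flags, expected_charges, level=0.90):
    """O/E ratio for one hospital with an interval (rescaling is an assumption)."""
    n = len(observed_flags)
    observed = sum(observed_flags)  # each patient counts once, complication or not
    expected = sum(expected_probability(c) for c in expected_charges)
    lo, hi = exact_ci(observed, n, level)
    return observed / expected, lo * n / expected, hi * n / expected


def is_outlier(ci_low, ci_high):
    """A hospital is an outlier when its interval does not include 1.0."""
    return ci_high < 1.0 or ci_low > 1.0
```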
The effect of the use of continuous vs categorical procedure volume variables on regression results was assessed by using SAS proc LOGISTIC with the VALUES option. This enables the regression to treat the volume variable as a continuous variable (unit = 1) or as categories of defined size (ie, categories with units of every 25, 50, or 125 cases per year per hospital). These regressions were performed using nonnormalized data and, for the sake of comparison with previously reported volume-outcome studies, were run without the DESCEND option, such that the risk for complications per decreasing unit of volume was a number greater than 1. Two models were run: a minimal model with only the hospital volume as an independent variable and a fully loaded model with volume, the quadratic volume term, patient age, male sex, COE status, teaching hospital status, Medicaid status, and poverty status entered into the regression equation as risk adjusters.
Twenty-four of 253 named hospitals (9.5%) in the NIS 2005 database were designated as bariatric surgery COEs. Of the 19 363 bariatric operations performed in these hospitals, 5420 (28%) were performed in COEs. As of October 4, 2007, the date the analysis was performed, there were 317 COEs listed in the CMS database. The sample used in this analysis included 7.6% of all COEs.
The overall in-house mortality was 0.1%, and the complication rate was 6.4%. The mean and variance for these variables were equal, indicating that they followed a Poisson distribution. Patient and program characteristics are presented in Table 1. The mean number of cases performed per facility was substantially greater for COEs than for the hospitals that were not COEs. Although there were statistically significant differences between the types of hospitals in patient age, the proportion of elderly patients, male sex, patient Medicaid status, patient poverty status, and teaching hospital status, the absolute differences and effect sizes were small. Hospital costs at the COE facilities were significantly higher, but, as indicated by the small effect size, this difference is not clinically important.
Logistic regression for mortality and complication rates is presented in Table 2. With risk adjustment by inclusion of the APS-DRG expected mortality rate in the regression equation, none of the factors tested were predictive of mortality. The APS-DRG variable expected charge was used as a risk-adjustment variable. It was significantly correlated with morbidity, with an OR (95% CI) of 1.78 (1.72-1.84) (P <.001). Procedure volume was inversely related to complication rates, whereas patient age and teaching hospital status were positively associated with postoperative complications. Further exploration of the volume-complication rate was carried out with GAM regression, using spline smoothing functions; GAM revealed that there was a significantly nonlinear relationship between the APS-DRG expected mortality rate and complications (Figure 1). Figure 1 also reveals that, although there is a procedure volume–complication rate relationship, it is weak. When a logistic regression equation was solved using quadratic terms for the expected mortality rate, the c-index improved to 0.856, and age lost statistical significance in the model. The OR (95% CI) for procedure volume in relation to complication rates was 0.84 (0.78-0.90) (P <.001).
Figure 2 shows the volume-outcome effect for complications and the expected complication rate and 95% CIs for any given hospital volume based on the Poisson distribution. As hospital volumes decrease, the expected complication rate increases with profound expansion of the CIs as the volumes become very low. There are approximately as many hospitals above the 95% CIs as there are below them, suggesting that hospitals may exhibit higher or lower than expected complication rates irrespective of their procedure volume status.
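The widening of the limits at low volume follows directly from the Poisson distribution. The sketch below is an illustration under assumptions, not the author's method: it takes the cohort complication rate of 6.4% from the text, treats a hospital's complication count as Poisson with mean equal to rate times volume, and reads off equal-tailed count quantiles. The function names are hypothetical.

```python
import math


def poisson_quantile(q, lam):
    """Smallest count k such that P(X <= k) >= q for X ~ Poisson(lam)."""
    k = 0
    term = math.exp(-lam)   # P(X = 0)
    cumulative = term
    while cumulative < q:
        k += 1
        term *= lam / k     # P(X = k) from P(X = k - 1)
        cumulative += term
    return k


def rate_limits(volume, rate=0.064, level=0.95):
    """Equal-tailed limits on the observed complication rate at a given volume."""
    lam = rate * volume
    alpha = 1 - level
    low_count = poisson_quantile(alpha / 2, lam)
    high_count = poisson_quantile(1 - alpha / 2, lam)
    return low_count / volume, high_count / volume


# Compare the width of the limits across annual volumes.
for volume in (25, 125, 500, 1000):
    low, high = rate_limits(volume)
    print(volume, round(low, 3), round(high, 3))
```

Because counts are discrete, the limits move in steps of 1/volume, which is why very low-volume hospitals can sit anywhere in a wide band without being distinguishable from the cohort rate.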
Given that there was a statistically significant hospital volume effect for complications, I examined the effect of arbitrarily assigning hospital volumes to predefined categories on the resultant ORs (Table 3). The minimal model shows that, as one considers the procedure volume effect in single-unit increments, the OR barely reaches significance. With aggregation into larger volume units, the OR appears larger. Bariatric procedures conducted at hospitals performing fewer than 125 cases per year would appear to have an 18% greater likelihood of being associated with complications. This is reduced to 10% when the model is improved by addition of the appropriate quadratic terms and risk adjusters in the regression equation.
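Part of the reason the OR looks larger with bigger units is arithmetic: in a logistic model with a linear volume term, the OR for a change of k units is the per-unit OR raised to the power k. The sketch below illustrates only that scaling. It takes the 18% figure (OR 1.18 per 125 cases) from the text and assumes a single linear coefficient, which need not match the separately fitted models in Table 3. The function name is hypothetical.

```python
import math


def rescale_or(odds_ratio, from_units, to_units):
    """Re-express an OR reported per `from_units` as an OR per `to_units`."""
    beta_per_unit = math.log(odds_ratio) / from_units
    return math.exp(beta_per_unit * to_units)


# OR of 1.18 per 125-case decrease, re-expressed for smaller unit sizes.
per_case = rescale_or(1.18, 125, 1)
for units in (1, 25, 50, 125):
    print(units, round(rescale_or(1.18, 125, units), 4))
```

Under this assumption the per-case OR is only slightly above 1, even though it describes the same fitted relationship as the 125-case figure.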
If there is a volume effect, what volume should be used as a threshold for certifying bariatric surgery centers? This question was addressed by O/E ratio calculation. I used the APS-DRG expected charges as a proxy for the prediction that complications might develop. Regression modeling demonstrated that inclusion of this variable in the complication regression model resulted in a very high c-index, suggesting that it serves as a good predictor for untoward events. The O/E ratios based on the APS-DRG expected charges were calculated, as were the 95% CIs (Figure 3). Statistical significance was observed only for those facilities performing more than 500 procedures per year.
Substantially more bariatric operations were performed at COEs than hospitals not having that designation. Despite their higher volumes and COE designation, outcomes were equivalent to those at facilities not designated as COEs. Patient care costs were statistically significantly higher at COEs, but effect size measurements suggested that these differences were not clinically significant. It has been shown that the minimal annual procedure volume required to be designated as a COE (125 cases per year) does not necessarily result in better outcomes and that the minimum volume requirement is not evidence based.^{5} Most important, this volume criterion significantly restricts access for bariatric surgery care.^{5} The number of bariatric operations performed each year was the most striking difference between bariatric surgery COEs and hospitals that were not COEs. Patient and facility characteristics were similar as were complication and death rates. These findings demonstrate that COE designation is not associated with better outcomes, despite the higher annual procedure rate associated with this designation.
Designation as a bariatric surgery COE requires a significant amount of personnel and infrastructure support. All COEs must have a bariatric surgery coordinator, personnel dedicated to data entry into proprietary databases, personnel devoted to following up patients long term, and subscription to one of the database services used to track bariatric surgery outcomes. No evidence exists that these program structural elements translate to better outcomes. Criteria such as entry of outcomes data into proprietary databases result in substantial program costs yet do not have a clear relationship to surgical outcomes. Neither the American Society for Metabolic and Bariatric Surgery bariatric outcomes longitudinal database nor the American College of Surgeons National Surgical Quality Improvement Program databases have been shown to improve bariatric surgery outcomes. Assumptions have been made that use of these databases will mimic the successes of the US Department of Veterans Affairs National Surgical Quality Improvement Program experience. This is not likely, because the Department of Veterans Affairs oversees a self-contained health care system that has a central authority with the ability to intervene in underperforming surgical programs, a process not possible in the private sector. In an era where most hospitals have operational deficits and physician reimbursement is falling, requiring additional costs in the name of improved quality should only be imposed if those expenses can be irrefutably justified by their benefit in terms of improved outcomes. The present study suggests that expenses related to the structural elements required to achieve bariatric surgery COE status may not be justified.
Because of the large numbers of patients studied, most of the differences observed for between-groups variables were statistically significant. This is to be expected with a large database analysis in which, because of large numbers of patients, small absolute differences tend to be statistically significant.^{18-20} In its most general usage, the term statistical significance simply means that a difference exists between 2 populations given some a priori definition of the degree of certainty that such a difference exists. Most tests used to quantify differences (eg, the t test) were developed to extrapolate results from groups with small sample sizes to the entire population and establish whether statistically significant differences are likely. When applied to groups that are big enough to approximate the entire population, small absolute differences are frequently interpreted as being statistically significant. Effect sizes are calculated to provide a sense for how big a difference exists between groups when a statistically significant result is found.^{13,14,21} For the present study, effect sizes were small for all measured variables except for annual procedure volume, suggesting that, although most of the differences found between COEs and non-COEs were statistically significant, they are not of much clinical significance.
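The point about sample size can be made concrete with a small calculation. This sketch holds a standardized difference fixed at a small value and shows how the P value changes as the groups grow. It assumes 2 equal-sized groups and a large-sample normal approximation; the effect size of 0.05 and the group sizes are invented for illustration.

```python
import math
from statistics import NormalDist


def two_sided_p(d, n_per_group):
    """Approximate two-sided P value for standardized difference d, equal groups."""
    z = d * math.sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same small effect size, increasingly large groups.
for n in (100, 1000, 10000):
    print(n, two_sided_p(0.05, n))
```

The effect size never changes in this loop; only the P value does, which is why effect sizes are reported alongside significance tests in large database analyses.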
The largest difference between the COEs and non-COEs was procedure volume. Procedure volume was also significantly related to complication rates. The question was asked: If procedure volume is related to complication rates, what should the annual procedure volume be for COE designation? One method used to answer this question was GAM. Most regression techniques treat the variable coefficients as constants to be determined by the regression process. In contrast, GAM uses a function instead of a constant, giving the regression process added flexibility and allowing further exploration of the relationships between variables. Using this technique, I had anticipated an inflection in the curve describing the complication–procedure volume relationship at the point where the annual procedure volume would affect outcomes. Figure 1 shows that inflections occurred in the complication–expected charges curve but were not very distinct for the complication–procedure volume curve. This latter curve implies that the complication–procedure volume relationship is weak. The O/E ratios were calculated from each hospital's observed and expected complication rates. None of these reached statistical significance for low-volume facilities, and they were reliably less than 1.0 (implying better than expected outcomes) only for facilities exceeding 500 bariatric surgery cases per year. Finally, each hospital's complication rate was plotted against procedure volume along with the expected rates and their 95% CIs based on the Poisson distribution. Count data such as those used for assessing complication rates follow the Poisson distribution, meaning that at low procedure volumes substantial uncertainty exists regarding the true complication rate for a facility, given the uncertainty related to sampling phenomena.^{5} As seen in Figure 2, the very wide CIs observed for low-volume hospitals preclude definite conclusions regarding the volume-outcome relationship.
The complication–procedure volume relationship was also inconsistent, with as many hospitals being high outliers as low outliers. Had low volumes been consistently associated with more complications, one would have expected more low-volume facilities to have complication rates above the Poisson 95% CI.
There are important limitations to this study. I only examined inpatient complication rates, yet the very short lengths of stay associated with bariatric surgery suggest that when complications occur they may manifest as readmissions. This effect should be equally distributed among COEs and non-COEs. I did find a complication rate of 6.4%, a figure consistent with several recent studies of bariatric procedure complication rates. Because bariatric surgery COE criteria emphasize facility requirements directed at patient safety during the operation, one would expect to find fewer complications in COEs than in non-COEs, which should have been apparent in the index hospitalization. Another limitation was the sampling of only 7.6% of the total number of COEs. Although a small sample, the centers included in the study were responsible for 30.0% of all the bariatric procedures in the cohort studied. I found that COEs perform substantially more bariatric procedures than non-COEs, yet both had commensurate outcomes. These findings suggest that the much larger number of hospitals that perform low volumes of bariatric procedures have outcomes similar to the high-volume COEs.
Correspondence: Edward H. Livingston, MD, Division of Gastrointestinal and Endocrine Surgery, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Rm E7-126, Dallas, TX 75390-9156 (edward.livingston@utsouthwestern.edu).
Accepted for Publication: March 3, 2008.
Financial Disclosure: None reported.
Funding/Support: This study was funded by the Hudson-Penn endowment, grants P20RR020691 and PL1DK081183 from the National Institutes of Health, and grant VA HSRD IIR-05-201 from the Department of Veterans Affairs.
Additional Contributions: Tracy Schifftner-Smith, MS, of the Department of Veterans Affairs National Surgical Quality Improvement Program supplied the O/E ratio algorithm.
References
2. Livingston EH. Can't we all get along? the need to unify our efforts at bariatric surgery center accreditation. Surg Obes Relat Dis. 2006;2(5):565-566.
5. Livingston EH, Elliott AC, Hynan LS, Engel E. When policy meets statistics: the very real effect that questionable statistical analysis has on limiting health care access for bariatric surgery. Arch Surg. 2007;142(10):979-987.
8. World Health Organization. International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM). Geneva, Switzerland: World Health Organization; 2007.
9. Livingston EH. Procedure, incidence and in-hospital complication rates of bariatric surgery in the United States. Am J Surg. 2004;188(2):105-110.
13. Cohen J. A power primer. Psychol Bull. 1992;112:155-159.
14. Crosby RA, Rothenberg R. In STI interventions, size matters. Sex Transm Infect. 2004;80(2):82-85.
15. Breslow NE, Day NE. Statistical Methods in Cancer Research: the Design and Analysis of Cohort Studies. Vol II. Lyon, France: International Agency for Research on Cancer; 1987.
16. Khuri SF, Daley J, Henderson W, et al. Risk adjustment of the postoperative mortality rate for the comparative assessment of the quality of surgical care: results of the National Veterans Affairs Surgical Risk Study. J Am Coll Surg. 1997;185(4):315-327.
17. Khuri SF, Daley J, Henderson W, et al. The National Veterans Administration Surgical Risk Study: risk adjustment for the comparative assessment of the quality of surgical care. J Am Coll Surg. 1995;180(5):519-531.
18. Livingston EH, Cassidy L. Statistical power and estimation of the number of required subjects for a study based on the t test: a surgeon's primer. J Surg Res. 2005;126(2):149-159.
19. Livingston EH. The mean and standard deviation: what does it all mean? J Surg Res. 2004;119(2):117-123.
20. Livingston EH. Who was Student and why do we care so much about his t test? J Surg Res. 2004;118(1):58-65.
21. Nickerson RS. Null hypothesis significance testing: a review of an old and continuing controversy. Psychol Methods. 2000;5(2):241-301.