Maas MB, Jaff MR, Rordorf GA. Risk Adjustment for Case Mix and the Effect of Surgeon Volume on Morbidity. JAMA Surg. 2013;148(6):532-536. doi:10.1001/jamasurg.2013.1509
Author Affiliations: Division of Critical Care Neurology, Department of Neurology, Northwestern University, Chicago, Illinois (Dr Maas), Division of Vascular and Critical Care Neurology, Department of Neurology (Drs Maas and Rordorf), and Section of Vascular Medicine, Division of Cardiology, Department of Medicine (Dr Jaff), Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts.
Importance: Retrospective studies of large administrative databases have shown higher mortality for procedures performed by low-volume surgeons, but the adequacy of risk adjustment in those studies is in doubt.
Objective: To determine whether the relationship between surgeon volume and outcomes is an artifact of case mix using a prospective sample of carotid endarterectomy cases.
Design: Observational cohort study from January 1, 2008, through December 31, 2010, with preoperative, immediate postoperative, and 30-day postoperative assessments acquired by independent monitors.
Setting: Urban, tertiary academic medical center.
Patients: All 841 patients who underwent carotid endarterectomy performed by a vascular surgeon or cerebrovascular neurosurgeon at the institution.
Intervention: Carotid endarterectomy without another concurrent surgery.
Main Outcome Measures: Stroke, death, and other surgical complications occurring within 30 days of surgery along with other case data. A low-volume surgeon performed 40 or fewer cases per year. Variables used in a comparison administrative database study, as well as variables identified by our univariate analysis, were used for adjusted analyses to assess for an association between low-volume surgeons and the rate of stroke and death as well as other complications.
Results: The rate of stroke and death was 6.9% for low-volume surgeons and 2.0% for high-volume surgeons (P = .001). Complications were similarly higher (13.4% vs 7.2%, P = .008). Low-volume surgeons performed more nonelective cases. Low-volume surgeons were significantly associated with stroke and death in the unadjusted analysis as well as after adjustment with variables used in the administrative database study (odds ratio, 3.61; 95% CI, 1.70-7.67, and odds ratio, 3.68; 95% CI, 1.72-7.89, respectively). However, adjusting for the significant disparity of American Society of Anesthesiologists Physical Status classification in case mix eliminated the effect of surgeon volume on the rate of stroke and death (odds ratio, 1.65; 95% CI, 0.59-4.64) and other complications.
Conclusions and Relevance: Variables selected for risk adjustment in studies using administrative databases appear to be inadequate to control for case mix bias between low-volume and high-volume surgeons. Risk adjustment should empirically analyze for case mix imbalances between surgeons to identify meaningful risk modifiers in clinical practice such as the American Society of Anesthesiologists Physical Status classification. A true relationship between surgeon volume and outcomes remains uncertain, and caution is advised in developing policies based on these findings.
Retrospective studies of large administrative databases have shown an association between low hospital and surgeon volume and mortality for a variety of procedures including carotid endarterectomy (CEA).1-4 Policy makers and business interests, such as the Leapfrog Group, cite such research as a rationale in advocating for selective referral to high-volume hospitals and surgeons.5,6 The specter of restrictive referral controls and financial consequences has fueled a debate about the validity of those retrospective analyses. The American College of Surgeons and other experts have opposed implementing policies to favor volume-based referral, strongly critiquing the risk adjustments done in the administrative database studies, and have called for new studies that would address their methodologic limitations.7,8
With approximately 100 000 procedures performed annually in the United States, CEA represents an ideal target for health care outcomes optimization. Given that revascularization is only favorable when perioperative risks are acceptably low, factors that impact the rate of postoperative complications, especially stroke and death (S/D), are of particular interest.9,10 The relationship between surgeon volume and outcomes is particularly important for CEA, where a recent widely cited study reported that mortality is driven by the effect of surgeon volume with no discernible risk associated with hospital volume.2 Those authors argued that clinical studies, as opposed to administrative data, lack sufficient statistical power to detect clinically meaningful differences in outcome and that there is little evidence for volume-related differences in case mix that would require risk adjustment.2 Authors of another study stated that no severity of illness adjustment was attempted because no classification system has been found capable of distinguishing iatrogenic complications from chronic comorbidities for this patient population.4
Assuring quality outcomes from individual surgeons is ultimately a local concern. It remains unknown whether hospital quality assurance programs could provide sufficient data to base referrals on actual observed outcomes, and suspicion of inadequately controlled bias limits support from the surgical community for making referral policy decisions by inference from analyses of large national data sets. We sought to determine whether an effect of surgeon volume on patient outcomes could be detected by a rigorous quality assurance program at a single institution for CEA. Furthermore, we wanted to determine whether risk adjustment using prospectively acquired and empirically selected clinical variables was superior to the method of risk adjustment performed on administrative data sets.
A comprehensive, independent system for monitoring CEA outcomes was initiated at Massachusetts General Hospital, a major metropolitan tertiary care academic medical center. Systematic prospective data collection began in October 2007. Data gathered on CEA cases performed from January 1, 2008, to December 31, 2010, were included in this analysis. We included only cases that were performed by board-certified vascular surgeons or cerebrovascular fellowship-trained neurosurgeons. Patients who underwent another invasive cardiovascular procedure concurrent with CEA (typically coronary artery bypass grafting or valve repair) were followed up using the same method but were excluded from this analysis.
All patients who underwent CEA were examined preoperatively as well as at 24 hours and at 30 days following the procedure. The examinations entailed sufficient review of the patient's medical history and functional status to allow determination of a modified Rankin Scale score and physical assessments incorporating, at a minimum, all elements of the National Institutes of Health Stroke Scale (NIHSS). All examiners held certification in NIHSS administration through the American Heart Association–American Stroke Association and were supervised by a neurologist trained in stroke and critical care management who served as co-chair of the Vascular Center Quality Assurance committee (G.A.R.). To eliminate the possibility of bias, the monitors were not part of the treating team.
An electronic quality assurance data registry developed by the Massachusetts General Hospital Vascular Center was prospectively populated for each CEA patient to code and document all relevant medical history, risk factors, anatomic study results, procedure details, and outcome variables with precise detail. Data that could not be obtained prospectively through direct assessments by staff were obtained by contemporaneous abstraction from the medical record. Outcomes were adjudicated by a vascular neurologist (G.A.R.), a neurointerventional radiologist, a vascular medicine specialist (M.R.J.), a vascular surgeon, and an interventional cardiologist, all of whom had experience managing patients with extracranial carotid artery disease.
Elective CEA was defined as a planned, scheduled CEA, in contrast to nonelective CEA, which was performed urgently after a stroke or transient ischemic attack attributed to the index carotid artery. Symptomatic status was defined as per the North American Symptomatic Carotid Endarterectomy Trial.11 A low-volume surgeon was defined as a surgeon who completed 40 or fewer CEAs per year averaged over the study period. This threshold was chosen for consistency with a comparison study that used a large administrative database to empirically determine a meaningful surgeon volume threshold.2 The degree of stenosis was determined by carotid duplex ultrasonography for methodologic consistency, given that all patients were evaluated with that modality. Complications under surveillance included death, stroke, transient ischemic attack, hyperperfusion syndrome, seizure, clinically evident cranial nerve injury, myocardial infarction, and failure of the revascularization procedure, including arterial dissection, pseudoaneurysm formation, patch tear, artery occlusion or surgical site infection, or bleeding requiring extended hospitalization for observation, treatment, or operative exploration. In addition, any condition that occurred within 30 days of the CEA necessitating treatment that extended the duration of hospitalization or led to nonelective hospital readmission was included as a complication.
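The volume definition above reduces to a single comparison of mean annual case count against the threshold. The following is a minimal sketch, assuming a 3-year study window (January 1, 2008, through December 31, 2010); the surgeon labels and case totals are hypothetical and are not study data.

```python
STUDY_YEARS = 3            # January 1, 2008, through December 31, 2010
LOW_VOLUME_THRESHOLD = 40  # CEAs per year; "40 or fewer" counts as low volume


def is_low_volume(total_cases, years=STUDY_YEARS):
    """Return True when the mean annual CEA count is at or below the threshold."""
    return total_cases / years <= LOW_VOLUME_THRESHOLD


# Hypothetical case totals over the study period (illustrative only).
example_totals = {"surgeon_a": 120, "surgeon_b": 121, "surgeon_c": 45}
labels = {name: is_low_volume(n) for name, n in example_totals.items()}
print(labels)
```

Because the definition is inclusive, a surgeon averaging exactly 40 cases per year falls in the low-volume group.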
The baseline characteristics and outcomes were compared between cases performed by low-volume and high-volume surgeons. The Fisher exact test was used for categorical variables, Mann-Whitney U test for nonnormally distributed variables, and t test for normally distributed continuous variables. We undertook 2 methods of determining adjusted risk for the outcome variables of interest: S/D and other complications. First, we prospectively used the variables that had been used for risk adjustment in a comparison large administrative database retrospective study, which included age group in 5-year intervals, sex, race/ethnicity (black or nonblack), and year of procedure.2 That study by Birkmeyer et al2 also adjusted for mean Social Security income from the subjects' ZIP code and other hospital characteristics. Those geographic and hospital adjustments were not included in our model because all surgeons drew from the same referral area and operated at the same hospital. That study was selected because it is the most heavily cited publication addressing this question, and the authors explicitly argued for the theoretical adequacy of their risk adjustment technique in the face of prior criticisms.2 Their prospectively selected variables were included along with surgeon volume in a binary logistic regression model for S/D and other complications. For the second method of adjusting risk, we performed a univariate statistical analysis using appropriate tests as previously detailed, and predictors of S/D and other complications with a significance of P < .10 were included in binary logistic regression models for each of those 2 outcomes. The regression models were determined to have adequate goodness of fit if the result of the Hosmer-Lemeshow test was not significant.
These data had initially been obtained for a hospital-initiated quality assurance program. We obtained a waiver from the institutional review board for use of the data in this study.
There were 841 cases in the study cohort, and 29% of procedures were performed by a low-volume surgeon. The characteristics of the patients are summarized in Table 1. There was a significantly higher proportion of symptomatic and nonelective cases operated on by low-volume surgeons. The rate of complications was higher in the low-volume surgeon group (13.4% vs 7.2%, P = .008), as was the combined rate of S/D (6.9% vs 2.0%, P = .001).
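The categorical comparisons above rely on the Fisher exact test. The following is a minimal standard-library sketch of a two-sided version. The text does not report raw event counts, so the counts used here (17 S/D events among 246 low-volume cases and 12 among 595 high-volume cases) are an assumption, back-calculated to be consistent with the reported 29%, 6.9%, and 2.0%.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    x_min, x_max = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(x_min, x_max + 1))
               if p <= p_observed * (1 + 1e-9))


# ASSUMED counts (not reported in the text): events / non-events by group.
p_value = fisher_exact_two_sided(17, 229, 12, 583)
print(f"two-sided Fisher exact P = {p_value:.4f}")
```

The printed value depends entirely on the assumed counts and should be read as an illustration of the test, not as a reproduction of the published P value.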
The unadjusted odds ratio (OR) for S/D for low-volume surgeon cases was 3.61 (95% CI, 1.70-7.67). After adjustment for the prospectively chosen variables used in the administrative database study, the adjusted OR for S/D for low-volume surgeon cases was 3.68 (95% CI, 1.72-7.89). None of the other variables showed a statistically significant association.
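An unadjusted OR and its Wald 95% CI can be computed directly from a 2x2 table. The sketch below assumes counts that the text does not report: 17 S/D events among 246 low-volume cases and 12 among 595 high-volume cases, chosen because they are consistent with the reported percentages. Whether the published interval was computed this way is also an assumption.

```python
from math import exp, log, sqrt


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI for the 2x2 table [[a, b], [c, d]].

    a, b: events and non-events in the exposed group (low-volume cases)
    c, d: events and non-events in the reference group (high-volume cases)
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# ASSUMED counts, back-calculated from the reported percentages:
# 17 S/D events in 246 low-volume cases, 12 in 595 high-volume cases.
or_, lo, hi = odds_ratio_wald_ci(17, 229, 12, 583)
print(f"OR {or_:.2f} (95% CI, {lo:.2f}-{hi:.2f})")
```

With these assumed counts the sketch gives an OR of about 3.61 with an interval of about 1.70 to 7.67, in line with the reported unadjusted estimate. That agreement is suggestive but does not confirm the underlying counts.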
Univariate assessment identified the American Society of Anesthesiologists (ASA) Physical Status classification and low-volume surgeon as potentially associated with S/D. In addition to ASA class and low-volume surgeon, a history of stroke in the index carotid artery territory, preoperative modified Rankin Scale score, and NIHSS score were identified as potentially associated with other complications. Although the unadjusted association with poor outcomes was stronger for low-volume surgeons than any other variables, after adjustment, cases performed by a low-volume surgeon showed no statistically significant increase in odds for S/D or other complications (OR, 1.65; 95% CI, 0.59-4.64, for S/D and OR, 1.42; 95% CI, 0.72-2.82, for complications), whereas a risk association was seen with higher ASA class (OR, 2.78; 95% CI, 0.96-8.06; P = .06 for S/D and OR, 2.08; 95% CI, 1.08-4.02; P = .03 for complications). The results of that analysis are summarized in Table 2.
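One way to compare the precision of these estimates is to recover the approximate standard error of log(OR) from each reported interval. The sketch below uses only the S/D values quoted in the text and assumes the intervals are symmetric on the log scale with a multiplier of 1.96.

```python
from math import log


def se_from_ci(lower, upper, z=1.96):
    """Approximate SE of log(OR) recovered from a reported 95% CI."""
    return (log(upper) - log(lower)) / (2 * z)


def ci_includes_null(lower, upper):
    """True when the interval contains OR = 1 (no association)."""
    return lower <= 1.0 <= upper


# Reported S/D estimates for low-volume surgeons (values from the text).
estimates = {
    "unadjusted": (3.61, 1.70, 7.67),
    "administrative adjustment": (3.68, 1.72, 7.89),
    "empirical adjustment": (1.65, 0.59, 4.64),
}
for name, (or_, lower, upper) in estimates.items():
    se = se_from_ci(lower, upper)
    print(f"{name}: OR {or_}, SE(log OR) ~ {se:.2f}, "
          f"CI includes 1: {ci_includes_null(lower, upper)}")
```

The empirically adjusted interval is wider than the unadjusted one and spans 1, which is why the adjusted result is compatible both with no volume effect and with a moderate one.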
Our institutional quality assurance program found significantly higher rates of S/D and other complications in CEA cases performed by low-volume surgeons. Adjusting for variables used in a widely cited administrative database study that observed a similar effect, the risk associated with low-volume surgeons persisted at a nearly identical OR. In contrast, our empirical analysis identified ASA class scores as another important predictor; after adjustment with a logistic regression model, no effect of surgeon volume on outcomes could be identified. We conclude that case mix differs between low-volume and high-volume surgeons, and that the risk adjustment technique used in prior important studies is insensitive to important covariates, greatly overestimating the impact of surgeon volumes on meaningful outcomes.
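The mechanism described here, a crude association that disappears once case mix is accounted for, can be illustrated with a toy stratified analysis. All counts below are hypothetical and are not study data. The sketch also uses Mantel-Haenszel stratification rather than the logistic regression used in this study, purely because it can be shown in a few lines.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for one 2x2 table: a/b exposed, c/d reference."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio across 2x2 strata."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


# HYPOTHETICAL counts, not study data. Each tuple is
# (low-volume events, low-volume non-events,
#  high-volume events, high-volume non-events).
strata = {
    "lower ASA class": (2, 98, 10, 490),     # 2% event rate in both groups
    "higher ASA class": (15, 135, 10, 90),   # 10% event rate in both groups
}

# Collapse the strata to get the crude (unadjusted) table.
crude = [sum(cells) for cells in zip(*strata.values())]
crude_or = odds_ratio(*crude)
adjusted_or = mantel_haenszel_or(strata.values())
print(f"crude OR {crude_or:.2f}, stratified OR {adjusted_or:.2f}")
```

In this toy example the event rate within each ASA stratum is identical for both surgeon groups, yet the crude OR is about 2.1 because the hypothetical low-volume surgeons operate on a larger share of higher ASA class patients.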
Many confounders impact the relationship between surgeon volume and outcomes, such as the fact that a disproportionately higher number of emergent cases are handled by low-volume surgeons, dysfunctional systems of care may exist even in high-volume practices, and specialized surgeons appear to have lower complication rates.3,12,13 The ability to risk adjust using administrative databases is limited, and given the multiple assumptions required when analyzing those cohorts, generalization of findings to a local practice level is difficult.14,15
Some researchers who have reported an association between surgeon volume and CEA outcomes performed no severity of illness adjustment, stating that no administrative data abstraction system has been found to be capable of distinguishing true premorbid severity.4 In contrast, Birkmeyer et al,2 whose risk adjustment technique we used for comparison on our data, argued that no clinical studies had presented evidence for volume-related differences in case mix that would bias their findings, so limitations related to risk adjustment should not impact their observations. Furthermore, they proposed that administrators actively manage the distribution of certain operations within their hospitals in favor of high-volume surgeons to optimize outcomes.2 Our data support the critique of the American College of Surgeons and various experts that the case mix of low-volume vs high-volume surgeons is not uniform, and that differences in case mix, captured in this study as ASA class, may account for the unadjusted higher rate of complications after CEAs performed by low-volume surgeons in ways to which the risk adjustment techniques used in administrative database studies are insensitive. It appears that restricting CEA to high-volume surgeons at our institution could lead to limited access to care for the greater proportion of urgent cases managed by low-volume surgeons, without leading to improved outcomes.
We believe our data contribute important information owing to several methodologic characteristics. First, all procedures took place in the same hospital using the same system of care including nursing staff, residents, and institutional protocols. All health care providers drew from the same community, and all had undergone specialized training to perform CEA. Prior studies evaluating the effect of surgeon volume did not correct for the bias of surgeon specialty, although specialized training is associated with better outcomes.2 This approach to using single institution data inherently holds many other known variables associated with patient outcomes constant. Second, a prospective approach to data collection was used, which avoids all of the assumptions involved with reconstructing patient characteristics, surgeon characteristics, and outcomes from billing codes. This method also allowed us to evaluate objective and validated clinical scales such as NIHSS score and ASA class. Third, we used data from a real hospital quality assurance program, and, in so doing, proved the feasibility of using a routine hospital operations system to evaluate meaningful outcomes by individual surgeon variables. Finally and crucially, we were able to obtain and analyze prospective data from low-volume surgeons. Nearly all clinical studies on CEA have come from trials that actively recruited with strong bias toward high-volume surgeons, often requiring documentation of a certain volume with favorable outcome.16,17 Our data are unique in that they are prospective and comprehensive with methods similar to a clinical trial, yet included a typical community sample, excluded no patients for unfavorable characteristics, and involved a sizeable number of low-volume surgeon cases.
The cautious scrutiny of the Leapfrog Group's initiative and the data supporting it appears to have merit. Despite the methodologic strengths that allowed us to best address the aims of this project, it would be inappropriate to likewise extend the findings of this study to conclusions that are beyond the scope of our methods to answer. While we have shown that a clinical study can indeed demonstrate an unadjusted effect of surgeon volume on outcome and demonstrated the magnitude of the importance of risk adjustment for volume-related differences in case mix, this study does not have the power to conclude that no effect of surgeon volume on outcomes exists. It is reasonable to state that if an effect does exist, the magnitude is likely much less than earlier studies with suboptimal risk adjustment report. Furthermore, an association between surgeon volume and outcomes could be more prominent for surgeons without subspecialty training, a group we did not study. Given the association between low-volume providers and case urgency and the adjustment effect of ASA class, low-volume surgeons were clearly treating riskier patients. We presume this reflects the effect of pooling adequately trained surgeons in a coverage system for which a portion of the cases may be low volume for particular health care providers. We cannot determine whether this association may also reflect different judgment in selecting cases owing to less clinical experience. Finally, the use of a single institutional cohort was optimal to meet the aims of our study but may not generalize to hospitals and surgeon groups with substantially different characteristics.
Fears that convenient but imprecise metrics could be used to disenfranchise highly skilled surgeons have been exacerbated by initiatives to develop and enforce selective referral policies based on extrapolations from methodologically limited analyses.18 We have shown that identifying concerning trends by surgeon characteristics is feasible with a strong hospital quality assurance program, and that risk adjustment using prospectively acquired clinical assessments is superior to that used with administrative databases. Contrary to the assumptions underlying other studies' methods, we found that case mix can vary meaningfully by surgeon volume, and that the difference in case mix accounts for much of the apparent difference in outcomes. While risk adjustment with severity indices is imperfect, research specifically on the role of risk adjustment for the interpretation of outcomes research has clearly demonstrated that optimizing risk adjustment is imperative to avoid penalizing health care providers who treat high-risk patients and to establish credible data for quality improvement.19 It would be detrimental to the health care system to exclude adequately trained practitioners with different elective practices from providing urgent surgical care to sicker patients. In our cohort, the risk assessment of the anesthesiologist, coded as the ASA class, appeared to be most predictive of outcomes. Risk adjustment should empirically analyze for case mix imbalances between surgeons to identify meaningful risk modifiers in clinical practice. Prespecifying risk adjustment variables without studying the case distribution between surgeon groups is less evidence based and appears inadequate. This significant public health concern is inextricable from the local institution and ultimately depends on individual surgeons and anesthesiologists in a way that seems to escape detection in large, limited data sets.
Perhaps the informal grassroots system of physician referrals that operates in communities across the country has some validity after all.18
Correspondence: Matthew B. Maas, MD, Department of Neurology, Northwestern University, 710 N Lake Shore Dr, 11th Fl, Chicago, IL 60611 (firstname.lastname@example.org).
Accepted for Publication: November 7, 2012.
Published Online: February 20, 2013. doi:10.1001/jamasurg.2013.1509
Author Contributions: Study concept and design: Maas and Rordorf. Acquisition of data: Maas and Rordorf. Analysis and interpretation of data: Maas, Jaff, and Rordorf. Drafting of the manuscript: Maas. Critical revision of the manuscript for important intellectual content: Maas, Jaff, and Rordorf. Statistical analysis: Maas. Obtained funding: Rordorf. Administrative, technical, and material support: Maas and Rordorf. Study supervision: Jaff and Rordorf.
Conflict of Interest Disclosures: Dr Jaff is an uncompensated advisor for Abbott Vascular, Cordis, Covidien/eV3, and Medtronic Vascular.
Funding/Support: This study was supported by the Massachusetts General Hospital Quality Assurance Program.