Kaplan-Meier (unadjusted) survival estimates for a cohort of 12 216 Medicare patients with stage I or II breast cancer, by average annual surgeon volume of Medicare breast cancer cases.
Nattinger AB, Laud PW, Sparapani RA, Zhang X, Neuner JM, Gilligan MA. Exploring the Surgeon Volume–Outcome Relationship Among Women With Breast Cancer. Arch Intern Med. 2007;167(18):1958-1963. doi:10.1001/archinte.167.18.1958
Copyright 2007 American Medical Association. All Rights Reserved. Applicable FARS/DFARS Restrictions Apply to Government Use.
A relationship between higher surgeon volume and lower mortality has been described for breast cancer, but selection bias has not been rigorously evaluated. We studied potential bias in the surgeon volume–outcome relationship by comparing the relationship of surgeon volume to breast cancer mortality and to mortality from other causes of death.
We conducted an observational cohort study from tumor registry and Medicare claims data on 12 216 women, 66 years or older, with stage I or II breast cancer, who were operated on by 1856 surgeons. Breast cancer mortality and other-cause mortality were determined from death certificate sources and surgeon volume from Medicare claims.
Treatment by a high-volume surgeon was associated with younger patient age, white race, less comorbidity, and residence in a more affluent zip code. Patients treated by low-, medium-, and high-volume surgeons had small differences in breast cancer mortality (17.4, 15.7, and 13.0 deaths per 1000 person-years, respectively; P = .03) but larger differences in non–breast cancer mortality (46.0, 36.8, and 31.7 deaths per 1000 person-years, respectively; P < .001). After adjustment for multiple patient and disease factors, women treated by high-volume surgeons, compared with those treated by low-volume surgeons, were not less likely to die of breast cancer (relative risk, 0.94; 95% confidence interval, 0.76-1.16) but were significantly less likely to die of other causes (relative risk, 0.86; 95% confidence interval, 0.75-0.98).
The surgeon volume–outcome relationship for these patients with breast cancer was attributable not to mortality from breast cancer but to other causes of death. The lack of specificity of this relationship raises the possibility of selection bias as an explanatory factor.
A positive relationship between higher volumes of cases and better health outcomes has been described for a number of different surgical procedures1-3 and has led to the implementation of policy strategies to regionalize certain procedures.4-6 Policies to promote regionalization are grounded in the belief that the explanation for positive volume-outcome relationships is greater proficiency of the high-volume physician or hospital, with respect to either physician technical skill or better organized care processes.7,8 This is an attractive explanation with intuitive appeal. However, alternative explanations are possible.9 One such alternative is selection bias (ie, the possibility that high-volume physicians or hospitals attract patients at lower risk of an adverse outcome).8,10 The studies of volume-outcome relationships have been observational, and selection bias is always a threat to observational studies.
There are several different types of selection bias that require consideration. These include selection based on disease severity, selection based on other medical conditions (comorbidity), and selection based on socioeconomic status, a factor that is known to influence mortality and other outcomes. With respect to cancer, the available studies have typically controlled for disease severity, and some have controlled for comorbidity or socioeconomic status using administrative data.8,10 The volume relationships are typically reported to persist after controlling for these factors. Unfortunately, it is not possible to control fully for comorbidity using administrative data.11,12 The ability to control for socioeconomic factors using administrative data is also limited, with such factors generally available at the community level, if at all.
In this study, we used breast cancer as an opportunity for studying potential bias in the surgeon volume–outcome relationship. Breast cancer is a useful condition for study for several reasons. Population-based data sources are available, reliable information is available regarding extent of disease, and data are available permitting the analysis of overall and disease-specific mortality. Three previous studies13-15 of the surgeon volume–outcome relationship for breast cancer found that patients of high-volume (≥ 15-30 patients annually) surgeons had about a 15% reduction in overall mortality, but these studies did not examine disease-specific mortality. We believed that if the breast cancer surgeon volume–outcome relationship were mediated by better quality of cancer care, the relationship of volume to breast cancer–specific mortality should be stronger than the relationship of volume to overall mortality. Analogously, one would expect little or no relationship of surgeon volume to non–breast cancer mortality. If, however, selection bias accounted for some or all of the relationship of volume to overall mortality, then the relationship of volume to breast cancer mortality would be less specific.
A database, provided by the National Cancer Institute, of linked information from the Surveillance, Epidemiology, and End Results (SEER) Tumor Registry and Medicare administrative data16,17 was used to identify patients. The population-based SEER registries collect information on demographic characteristics, extent of disease, and initial treatment for incident cancer patients. A linkage between SEER and Medicare data was successfully completed for 95% of patients with breast cancer 65 years or older in SEER records.
The Medicare program funds most inpatient and outpatient services for more than 97% of US residents 65 years or older. We used the Medicare Provider Analysis and Review File, which contains inpatient hospital claims; the Outpatient Standard Analytic File, which contains claims for outpatient facility services; the 100% Carrier Standard Analytic File, which contains claims for physician services; and the denominator file, which contains information on beneficiary entitlement and zip code of residence.
The US census data were linked with the SEER-Medicare database to determine the population density of the county in which breast cancer operations were performed and to determine socioeconomic characteristics of the zip codes in which patients resided.
The institutional review board of the Medical College of Wisconsin approved the study design.
Initial selection criteria included women 66 years or older, who were diagnosed as having invasive stage I or II breast cancer from January 1, 1994, through December 31, 1996, were eligible for Medicare parts A and B and not in a health maintenance organization for at least 1 year before diagnosis and after diagnosis for at least 4 months or until death, and had a Medicare surgeon claim within 4 months after diagnosis. These selection criteria yielded 13 030 women who were followed up through August 31, 2000. Women were excluded if their breast cancer operation was performed outside the SEER catchment areas (n = 611) or if a valid surgeon unique provider identification number was not present on the Medicare claim (n = 203). These criteria resulted in a final study cohort of 12 216 women.
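The cohort derivation described above can be checked with a short sketch. The counts are those reported in the text; the variable names are illustrative assumptions and do not come from the study's own analysis files.

```python
# Counts reported in the text; variable names are illustrative assumptions.
initially_eligible = 13030     # women meeting the initial selection criteria
outside_seer_area = 611        # operation performed outside the SEER catchment areas
no_valid_surgeon_id = 203      # no valid surgeon identifier on the Medicare claim

# Apply the two exclusions to obtain the final study cohort.
final_cohort = initially_eligible - outside_seer_area - no_valid_surgeon_id

# Share of initially eligible women removed by the two exclusions.
excluded_share = (outside_seer_area + no_valid_surgeon_id) / initially_eligible

print(f"Final cohort: {final_cohort}")
print(f"Excluded: {excluded_share:.1%} of initially eligible women")
```

The result agrees with the final cohort of 12 216 women stated in the text.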
The patient's age at diagnosis was determined from the Medicare files, and race was determined from the SEER files. The patient's zip code of residence was determined from the Medicare files and was linked to US census data to determine the per capita income of the zip code in which the patient resided, to estimate neighborhood socioeconomic status.18,19 Comorbidity was determined from the inpatient and ambulatory Medicare claims for the year before the breast cancer diagnosis using the adaptation of Klabunde et al17 of the Charlson comorbidity index.20
Tumor characteristics were determined from the SEER registry data, including the size and grade of the primary tumor, the status of the axillary lymph nodes, and the hormone receptor status.
For the determination of volume, a broader cohort of patients was used than for the study cohort described in the preceding paragraphs. For determining volume, women were required to be 65 years or older at diagnosis of a breast cancer of any stage during the study years and were required to have an operation identified on a Medicare claim within 1 month before and 4 months after the SEER month of diagnosis.21 For each surgeon and hospital operating on a study cohort patient, the number of operations in this broader cohort was counted.
Date of death was determined from Medicare files. The cause of death was determined from SEER files, derived from a review of death certificates. The cause of death was unknown for 99 subjects, who were counted as deaths (uncensored) in the overall mortality analyses but censored in the cause-specific analyses.
The mean annual Medicare volume was estimated for each surgeon. The surgeons were categorized into high-, medium-, and low-volume categories that had similar numbers of patients but still required that the high-volume surgeons have at least twice the volume of the low-volume surgeons. This resulted in categories of fewer than 5 annual Medicare cases, 5 to fewer than 10 annual cases, or 10 or more annual cases. To control for selection bias in the choice of high-, medium-, or low-volume surgeon, we performed a propensity score analysis. This is a method of dealing with possible bias in observational studies.22,23 We developed a model to predict the patient's propensity for treatment by a low-, medium-, or high-volume surgeon. We used a trichotomous logistic regression for the propensity model and then formed the propensity groups based on planar tertiles of bivariate predicted probabilities.24 Candidate factors for the propensity model included available demographic and extent-of-disease factors. Factors that were significant predictors of propensity for undergoing treatment by a high-volume surgeon (all at the P ≤ .01 level) were white race, being married, residing in a rural area or a metropolitan statistical area of fewer than 250 000 persons, residing in a zip code with higher per capita income, having less comorbidity, having a known hormone receptor status, and having an unknown axillary lymph node status. The propensity groups markedly reduced differences between the surgeon volume groups with respect to all the factors in Table 1. In most but not all cases, the differences were rendered statistically insignificant within propensity groups.
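As a minimal sketch of the volume categorization described above, the thresholds reported in the text (fewer than 5, 5 to fewer than 10, and 10 or more annual Medicare cases) can be written as a small function. The function name and the example inputs are hypothetical; the study's actual code is not shown in the article.

```python
def volume_category(mean_annual_cases: float) -> str:
    """Classify a surgeon by mean annual Medicare breast cancer cases.

    Thresholds follow the text: <5 is low, 5 to <10 is medium, >=10 is high.
    The function itself is an illustrative assumption, not the authors' code.
    """
    if mean_annual_cases < 0:
        raise ValueError("mean annual case volume cannot be negative")
    if mean_annual_cases < 5:
        return "low"
    if mean_annual_cases < 10:
        return "medium"
    return "high"


# Hypothetical surgeons, chosen only to exercise the category boundaries.
for cases in (1.3, 5.0, 9.7, 10.0):
    print(cases, volume_category(cases))
```

With these cut points, the lowest volume in the high category (10) is at least twice any volume in the low category (below 5), which matches the stated design requirement.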
To evaluate the relationship of surgeon volume and survival, Cox proportional hazards survival analysis was used.25 The patient was the unit of analysis. Clustering of outcomes by surgeon was addressed by including in the model a γ distributed random frailty for each surgeon.26 Covariates in the Cox proportional hazards models included patient age (linear and quadratic components by decade), race (white, black, Asian, Hispanic, or other), comorbidity index score (0, 1, or ≥ 2), SEER site, hospital volume (< 20, 20 to < 40, or ≥ 40 Medicare operations annually),27 zip code level per capita income (national tertiles), and the disease characteristics of tumor size (< 20 mm, 20-50 mm, > 50 mm, or unknown), lymph node status (negative, positive, or unknown), hormone receptor status (estrogen receptor and progesterone receptor negative, estrogen receptor and/or progesterone receptor positive, or unknown), and tumor grade (well or moderately differentiated, poorly differentiated or undifferentiated, or unknown). The propensity score category was also included. Tests of proportionality of hazards for the covariates showed lack of proportionality for several factors. Surgeon volume nonproportionality was resolved by modeling a time-varying effect at 10 months of follow-up or less. Hormone receptor status was resolved by modeling a time-varying effect at 56 months of follow-up or less. Per capita income was resolved by stratification. Lymph node status was resolved by stratification for the unknown category only. The results reported reflect surgeon volume for longer than 10 months of follow-up; for earlier follow-up times, the volume-outcome relationship was in the same direction but larger in magnitude.
The 12 216 women who had undergone surgical treatment of stage I or II breast cancer were operated on by 1856 different surgeons, of whom 1325 were low-volume, 384 were medium-volume, and 147 were high-volume surgeons (Table 1). Patients treated by high-volume surgeons were slightly younger, were more likely to be white, and resided in more affluent areas than those treated by low-volume surgeons. The patients of high-volume surgeons had lower levels of comorbid illness, had tumors that were on average 1 to 2 mm smaller, and were more likely to be treated in high-volume hospitals. The relationship of surgeon and hospital volume was moderate, with a Spearman rank correlation coefficient of 0.37. The point was raised previously that if the extent-of-disease information is missing differentially across volume groups, then inclusion of extent-of-disease factors in the model could bias the results.28 However, differences by volume groups in missing extent-of-disease factors were small and inconsistent in direction (Table 1).
During the 50-month median follow-up, 2753 women (22.5%, or 54.4 per 1000 person-years) died; of these women, 760 (6.2%, or 15.6 per 1000 person-years) died of breast cancer and 1894 (15.5%, or 38.8 per 1000 person-years) died of other causes. For 99 women, no cause of death was available. In unadjusted analyses (Table 2), the difference in mortality rates between patients of high- and low-volume surgeons was 18.7 deaths per 1000 person-years overall, 4.4 deaths per 1000 person-years for breast cancer mortality, and 14.3 deaths per 1000 person-years for other-cause mortality. The survival curve for the patients of medium-volume surgeons was within the boundaries of the high- and low-volume curves throughout the period of observation (Figure).
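The unadjusted figures above can be reproduced from numbers given in the article. The sketch below uses the death counts from this paragraph and the cause-specific rates for low- and high-volume surgeons from the abstract; the variable names are illustrative assumptions, and the values in Table 2 are not reproduced here.

```python
cohort_size = 12216

# Death counts reported in the text.
deaths = {"breast cancer": 760, "other causes": 1894, "unknown cause": 99}
total_deaths = sum(deaths.values())

# Percentage of the cohort dying overall and of each cause.
pct_all = round(100 * total_deaths / cohort_size, 1)
pct_by_cause = {cause: round(100 * n / cohort_size, 1) for cause, n in deaths.items()}

# Cause-specific rates per 1000 person-years from the abstract: (low, high).
rates = {"breast cancer": (17.4, 13.0), "other causes": (46.0, 31.7)}

# Low-volume minus high-volume difference for each cause.
diff = {cause: round(low - high, 1) for cause, (low, high) in rates.items()}

# The reported overall difference equals the sum of the two cause-specific ones.
diff_sum = round(sum(diff.values()), 1)

print(total_deaths, pct_all, pct_by_cause)
print(diff, diff_sum)
```

Most of the overall difference between low- and high-volume surgeons (14.3 of 18.7 deaths per 1000 person-years) comes from deaths due to causes other than breast cancer.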
In the Cox proportional hazards models, adjusting for patient demographic factors, extent of disease, comorbidity, and hospital volume, the relationship of surgeon volume to overall mortality was significant, with a hazard ratio of 0.86 (95% confidence interval [CI], 0.77-0.97) for patients of high-volume surgeons and a hazard ratio of 0.94 (95% CI, 0.85-1.03) for patients of medium-volume surgeons. However, surgeon volume was not significantly related to breast cancer mortality (Table 3). In contrast, surgeon volume was significantly related to non–breast cancer mortality, with a 14% lower hazard of death among patients of high-volume surgeons (Table 3).
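To make the reading of these hazard ratios concrete, the sketch below converts a hazard ratio to a percentage reduction in hazard and checks whether its 95% CI excludes 1. The helper names are hypothetical. The standard-error calculation assumes a Wald-type interval that is symmetric on the log scale, which the article does not state explicitly.

```python
import math


def percent_lower_hazard(hazard_ratio: float) -> float:
    """Percentage reduction in hazard implied by a hazard ratio below 1."""
    return round((1 - hazard_ratio) * 100, 1)


def excludes_null(ci_low: float, ci_high: float) -> bool:
    """True when the confidence interval does not contain a hazard ratio of 1."""
    return not (ci_low <= 1.0 <= ci_high)


def approx_log_se(ci_low: float, ci_high: float, z: float = 1.96) -> float:
    """Approximate SE of log(HR), assuming a symmetric Wald interval on the log scale."""
    return (math.log(ci_high) - math.log(ci_low)) / (2 * z)


# Overall mortality estimates reported in the text: (HR, CI lower, CI upper).
estimates = {
    "high volume": (0.86, 0.77, 0.97),
    "medium volume": (0.94, 0.85, 1.03),
}

for group, (hr, low, high) in estimates.items():
    print(group, percent_lower_hazard(hr), excludes_null(low, high),
          round(approx_log_se(low, high), 3))
```

A hazard ratio of 0.86 corresponds to the 14% lower hazard quoted in the text, and only the high-volume interval excludes 1.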
The extent-of-disease factors were much more strongly related to breast cancer–specific mortality than to non–breast cancer mortality (Table 3). The level of comorbid illness had a stronger association with non–breast cancer mortality than with breast cancer mortality.
The most common cause of non–breast cancer mortality was cardiovascular disease, which accounted for the deaths of 961 women (7.9%). The relationship of surgeon volume of breast cancer cases to death from cardiovascular disease was statistically significant, with a hazard ratio of 0.79 (95% CI, 0.65-0.96) for patients of high-volume surgeons and 0.89 (95% CI, 0.76-1.05) for patients of medium-volume surgeons.
The purpose of the propensity score analysis was to control as much as possible for selection bias inherent in observational data. As a sensitivity analysis, the models were run separately within each of the propensity groupings.29 The conclusions remained unchanged. The relationship of surgeon volume to breast cancer mortality was small and statistically insignificant within each stratum. The relationship of surgeon volume to other-cause mortality was consistent within each grouping, achieving statistical significance for the medium- and high-volume propensity strata.
In this study of 12 216 women with early-stage breast cancer, those operated on by high-volume surgeons were substantially less likely to die than those treated by low-volume surgeons. This finding is similar to that reported by other investigators.13-15 The usual interpretation of such a finding is that volume is a marker for quality of care. However, our disease-specific mortality results call that interpretation into question. The patients of high-volume surgeons were not less likely to die of breast cancer. Rather, they were significantly less likely to die of other causes.
This finding is consistent with the possibility of a nonspecific effect, perhaps selection bias, in the types of patients cared for by high-volume surgeons as an explanation for at least some of the observed surgeon volume–mortality relationship. We observed clear differences among the patients cared for by high-, medium-, and low-volume surgeons, particularly in terms of sociodemographic characteristics and level of comorbidity. These factors, and other unmeasured factors, could be confounding the surgeon volume–outcome relationship.
With respect to possible selection by socioeconomic status, the socioeconomic factors were available for this study only at the zip code level. Even with individual-level information, it is difficult, perhaps impossible, to control fully for socioeconomic status.30 While we used current statistical techniques to control for selection bias, it is likely that these methods could not completely control for such bias. In addition, 1 recent study31 found low socioeconomic status to be associated with impaired functional capacity and worse all-cause mortality, which could explain the relationship we observed between breast cancer surgeon volume and cardiovascular mortality.
Other explanations might also exist for these findings. High-volume high-quality surgeons may refer to other high-quality physicians, accounting for the reduction in non–breast cancer mortality. Even if so, however, one would expect the surgeon volume to breast cancer mortality relationship to be at least as strong as the relationship of surgeon volume to other mortality. Another possible explanation is that the breast cancer mortality model has several specific extent-of-disease predictors, while the non–breast cancer mortality model has only a general comorbidity measure. However, although better specification of overall mortality predictors (ie, general comorbidity) might reduce or eliminate the relationship of surgeon volume to other mortality, such specification would be unlikely to strengthen the surgeon volume–breast cancer mortality relationship given that the available comorbidity information shows that the low-volume surgeons care for sicker patients (Table 1). Therefore, if selection bias based on comorbid illness accounts for our findings, better controlling for the factor would likely reduce the association of surgeon volume to overall mortality.
Other studies may indirectly support the possibility that selection bias could be an operative factor explaining at least some of the volume-outcome relationships described in the literature.32,33 One study32 among patients with colon cancer found that patients operated on in high-volume hospitals had better overall survival, but the improved survival could not be attributed to lower cancer recurrence rates. This study raised a question of whether other differences exist in the patients served by high- vs low-volume hospitals. Another study33 found mortality to be related not only to the volume of the same procedure but also to the hospital volume of many other procedures. While this finding could be because of shared process characteristics of large-volume hospitals, systematic differences in the types of patients admitted to high-volume hospitals could also account for it.
With respect to breast cancer, 3 previous studies13-15 have found a relationship between higher surgeon volume and better overall survival. The survival of patients with breast cancer may be affected by factors other than surgeon quality, such as medical oncology care. In the 2 UK studies13,14 that examined oncologic processes of care (such as receipt of adjuvant therapy), these process measures did not explain the differential in overall survival by surgeon volume. This raises the question of what does account for the better survival of patients of high-volume surgeons. Unmeasured improvements in the care provided by high-volume surgeons might mediate the relationship, but patient selection bias also could account for this finding.
A limitation of this study is that the cause-of-death information, derived from death certificates, may be inaccurate. Problems with the accuracy of cause of death, as determined by death certificate, have been described, including poor sensitivity for some cancers.34 It is conceivable that death certificate accuracy varies by surgeon volume. However, the cancer extent-of-disease factors were much stronger predictors of breast cancer mortality than of other causes of death (Table 3). Similarly, the comorbidity score was much more strongly predictive of overall and non–breast cancer mortality than of breast cancer mortality. These findings provide support for the validity of the coding of cause of death in this study. Also, a lack of sensitivity for breast cancer deaths would be unlikely to explain the association we observed between surgeon volume and non–breast cancer mortality.
Another limitation is that the study included only patients 65 years or older, and the surgeon volume was determined by measuring the number of operations performed on women in this age group. However, about half of all incident breast cancer cases occur in women 65 years or older, and there is no reason to believe case volume would be systematically biased by measuring it in this age group. A number of other investigators3,8 have also used this approach to determining volume. The study of Medicare patients permitted analysis of a large population-based cohort of patients and surgeons, an important strength.
In summary, this study found a significant relationship between surgeon volume and non–breast cancer mortality, but not between surgeon volume and breast cancer mortality. This finding is evidence of possible selection bias, which could be based on socioeconomic status or general comorbidity. It is possible that the beneficial effect of high surgeon volume on breast cancer mortality has been overstated. Future studies should further evaluate this possibility and rigorously consider whether selection bias could account for volume-outcome relationships for other health conditions.
Correspondence: Ann Butler Nattinger, MD, MPH, Department of Medicine, Medical College of Wisconsin, 9200 W Wisconsin Ave, FEOB Ste 4200, Milwaukee, WI 53226 (email@example.com).
Accepted for Publication: April 24, 2007.
Author Contributions: Study concept and design: Nattinger and Laud. Acquisition of data: Nattinger. Analysis and interpretation of data: Nattinger, Laud, Sparapani, Zhang, Neuner, and Gilligan. Drafting of the manuscript: Nattinger and Laud. Critical revision of the manuscript for important intellectual content: Nattinger, Laud, Sparapani, Zhang, Neuner, and Gilligan. Statistical analysis: Laud, Sparapani, and Zhang. Obtained funding: Nattinger. Study supervision: Nattinger and Laud.
Financial Disclosure: None reported.
Funding/Support: This study was supported by grant R01-CA081379 from the National Cancer Institute, Public Health Services (Dr Nattinger), and grant K08-AG021631 from the National Institute on Aging, Public Health Services (Dr Neuner).
Role of the Sponsor: The funding bodies had no role in the design and conduct of the study; in the collection, analysis, and interpretation of the data; or in the preparation, review, or approval of the manuscript.
Additional Contributions: James S. Goodwin, MD, provided helpful comments on an earlier draft of the manuscript; the Applied Research Program, National Cancer Institute, the Office of Research, Development, and Information, Centers for Medicare & Medicaid Services, Information Management Services, Inc, and the SEER Program Tumor registries assisted in the creation of the SEER-Medicare database.