Inclusion and exclusion criteria for practice pattern analysis (A) and patient outcome analysis (B). AMA indicates American Medical Association; CCS, Clinical Classifications Software; ICD-9-CM, International Classification of Diseases, Ninth Revision, Clinical Modification; MD, doctor of medicine.
eAppendix 1. Diagnosis and Procedure Codes Used in Analysis
eAppendix 2. Complete Matching Algorithm
eAppendix 3. Sensitivity Analysis
eTable 1. Complete Results of Match
eFigure 1. Distribution of Procedures per Surgeon in Practice Pattern Analysis
eFigure 2. Distribution of Procedures per Surgeon in Outcomes Analysis
Sellers MM, Keele LJ, Sharoky CE, Wirtalla C, Bailey EA, Kelz RR. Association of Surgical Practice Patterns and Clinical Outcomes With Surgeon Training in University- or Nonuniversity-Based Residency Program. JAMA Surg. 2018;153(5):418–425. doi:10.1001/jamasurg.2017.5449
Do the practice patterns and clinical outcomes of general surgeons differ by type of residency training?
In this cohort study of 3638 general surgeons trained at nonuniversity- and university-based residency programs, significant differences were noted in types and proportion of procedures performed between the inpatient and outpatient setting, but no statistically significant difference was observed in clinical outcomes among surgeons with similar practice patterns operating within the same hospital.
Surgeons trained in nonuniversity- and university-based residency programs have distinct practice patterns. When compared within the same clinical setting, surgeons from both training backgrounds achieve similar clinical outcomes.
Importance
Important metrics of residency program success include the clinical outcomes achieved by trainees after transitioning to practice. Previous studies have shown significant differences in reported training experiences of general surgery residents at nonuniversity-based residency (NUBR) and university-based residency (UBR) programs.
Objective
To examine the differences in practice patterns and clinical outcomes between surgeons trained in NUBR and those trained in UBR programs.
Design, Setting, and Participants
This observational cohort study linked the claims data of patients who underwent general surgery procedures in New York, Florida, and Pennsylvania between January 1, 2012, and December 31, 2013, to demographic and training information of surgeons in the American Medical Association Physician Masterfile. Patients who underwent a qualifying procedure were grouped by surgeon. Practice pattern analysis was performed on 3638 surgeons and 1 237 621 patients, representing 214 residency programs. Clinical outcomes analysis was performed on 2301 surgeons and 312 584 patients. Data analysis was conducted from February 1, 2017, to July 31, 2017.
Exposures
NUBR or UBR training status.
Main Outcomes and Measures
Inpatient mortality, complications, and prolonged length of stay.
Results
No significant differences were observed between the NUBR-trained surgeons and UBR-trained surgeons in age (mean, 53.3 years vs 53.7 years), sex (female, 18.2% vs 16.9%), or years of clinical experience (mean, 16.5 years vs 16.5 years). Overall, NUBR-trained surgeons compared with UBR-trained surgeons performed more procedures (median [interquartile range (IQR)], 328 [93-661] vs 164 [49-444]; P < .001) and performed a greater proportion of procedures in the outpatient setting (risk difference, 6.5; 95% CI, 6.4 to 6.7; P < .001). Before matching, the mean proportion of patients with documented inpatient mortality was lower for NUBR-trained surgeons than for UBR-trained surgeons (risk difference, −1.01%; 95% CI, −1.41 to −0.61; P < .001). The mean proportion of patients with complications (risk difference, −3.17%; 95% CI, −4.21 to −2.13; P < .001) and prolonged length of stay (risk difference, −1.89%; 95% CI, −2.79 to −0.98; P < .001) was also lower for NUBR-trained surgeons. After matching, no significant differences in patient mortality, complications, and prolonged length of stay were found between NUBR- and UBR-trained surgeons.
Conclusions and Relevance
Surgeons trained in NUBR and UBR programs have distinct practice patterns. After controlling for patient, procedure, and hospital factors, no differences were observed in the inpatient outcomes between the 2 groups.
Over the past decade, the increased national focus on safety and quality in medicine has had a substantial influence on graduate medical education. The introduction of the Accreditation Council for Graduate Medical Education Next Accreditation System and the Institute of Medicine’s 2014 call for implementing pay-for-performance methods in graduate medical education funding highlight the need for new methods to define and measure the success of medical training programs. In recent years, researchers have started closely examining the association of provider training with patient outcomes.1-3
Previous studies have found differences in the experiences of trainees between university-based residency (UBR) and nonuniversity-based residency (NUBR) programs (also called hospital-based or independent training programs). A cross-sectional national survey of categorical general surgery residents found that trainees in NUBR programs were more satisfied with their operative experience during training compared with residents in UBR programs. Trainees in NUBR programs also indicated a more supportive clinical learning environment.4 Additional surveys found that graduating chief residents of NUBR programs were more likely to pursue general surgery rather than fellowship training.5,6
Despite these documented differences in training experience, little is known about the association of training setting with future practice patterns or patient outcomes. Our goal in this study was to examine the association of training in NUBR vs UBR programs with practice patterns and outcomes achieved by program graduates. We matched NUBR- to UBR-trained surgeons within the same hospitals to remove hospital influences on outcomes.
We conducted a retrospective observational study using a data set linking all-payer claims from New York,7 Florida,8 and Pennsylvania9 between January 1, 2012, and December 31, 2013, to physician demographic and training information from the American Medical Association (AMA) Physician Masterfile.10 These states were selected as they include identifiers that allow patient claims to be linked to hospital and physician information. This study was deemed exempt from continuing review by the institutional review board of the University of Pennsylvania. Data analysis was conducted from February 1, 2017, to July 31, 2017.
General surgery residency programs were classified as either UBR or NUBR. Assignment was based on review of the sponsoring institution listed on the Accreditation Council for Graduate Medical Education website11 and was supplemented with previously collected data on program affiliations, program size, and geographic region.12 Community-based programs with affiliations to universities were classified as NUBR. Surgeons were classified as NUBR- or UBR-trained on the basis of their residency program listed in the AMA Masterfile. Surgeon age, sex, and year of training completion were also abstracted. Surgeon experience was defined as year of training completion subtracted from year of operation.
Surgeons were excluded if they could not be identified in the AMA Masterfile, did not meet the criteria for general surgery training, did not attend an allopathic program identified as an NUBR or a UBR, did not train within the United States, or did not perform at least 5 operations within the 2-year study time frame. Overall practice patterns of surgeons trained at UBR and NUBR programs were compared using inpatient and outpatient procedures. Analysis of patient outcomes was then performed using a set of inpatient operations, as complications in the outpatient setting are difficult to analyze in administrative data. The Figure details the exclusion criteria for the overall and inpatient cohorts. See eAppendix 1 in the Supplement for a list of all codes used to define the cohort and define complications.
To examine practice patterns, we selected and grouped patients by surgeon if they underwent a general surgery procedure. Procedures representing the scope of general surgery13 were defined using Agency for Healthcare Research and Quality Clinical Classifications Software (CCS) categories, which map both the International Classification of Diseases, Ninth Revision, Clinical Modification procedure codes and the Current Procedural Terminology (CPT) codes.14
We compared the characteristics of NUBR- and UBR-trained surgeons. The contribution of each CCS category to the overall practice of each type of surgeon was calculated, and the proportion of procedures performed in the outpatient setting and inpatient setting overall as well as within each CCS category for both surgeon types was compared over the study time frame.
To examine clinical outcomes, we selected patients using the International Classification of Diseases, Ninth Revision, Clinical Modification codes that mapped to a narrower set of operations typically performed by general surgeons in the inpatient setting. Patients were classified as having undergone 1 of 44 general surgical operations during a nontrauma inpatient admission. Operations were categorized as complex or essential on the basis of the Surgical Council on Resident Education Curriculum Outline.15
Patient sociodemographic and clinical characteristics were abstracted from the claims data. Comorbidities were defined using Elixhauser indices.16 Patients were classified by admission type (emergent, urgent, or elective) and grouped by surgeon. Surgeon volume was calculated as total operations, complex operations, and essential operations.
The primary outcomes were inpatient mortality, postoperative complications, and prolonged length of stay (PLOS). Complications were identified using the International Classification of Diseases, Ninth Revision, Clinical Modification diagnosis codes and collapsed into a binary variable indicating the development of 1 or more complications. Prolonged length of stay was defined as hospital- and operation-specific length of stay greater than the 75th percentile. The PLOS is a well-defined measure used to capture inefficiencies in care and complications that result in prolonged hospitalization.17,18
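The PLOS definition above can be illustrated with a minimal Python sketch. It is not the authors' code: the record layout (`hospital`, `operation`, `los`), the function name `flag_plos`, and the choice of percentile interpolation (the "inclusive" method of Python's `statistics.quantiles`) are all assumptions made for illustration, since the article does not specify how the 75th percentile was computed.

```python
from collections import defaultdict
from statistics import quantiles

def flag_plos(stays):
    """Flag stays longer than the 75th percentile for their hospital and operation.

    stays: list of dicts with hypothetical keys "hospital", "operation", "los" (days).
    Returns one boolean per stay, in input order.
    """
    # Collect lengths of stay within each hospital-operation cell.
    groups = defaultdict(list)
    for s in stays:
        groups[(s["hospital"], s["operation"])].append(s["los"])

    # 75th percentile per cell; a cell with a single stay uses that stay as its cutoff.
    cutoffs = {}
    for key, values in groups.items():
        if len(values) > 1:
            cutoffs[key] = quantiles(values, n=4, method="inclusive")[2]
        else:
            cutoffs[key] = values[0]

    # A stay is prolonged only if it is strictly greater than its cell's cutoff.
    return [s["los"] > cutoffs[(s["hospital"], s["operation"])] for s in stays]

# Hypothetical example: five stays for one operation in one hospital.
example = [{"hospital": "H1", "operation": "colectomy", "los": d} for d in (1, 2, 3, 4, 10)]
flags = flag_plos(example)
```

Because the cutoff is specific to each hospital and operation, the same length of stay can be prolonged in one cell and not in another.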
Means were calculated for all patient-level measures, including outcomes, for each surgeon to provide surgeon-level covariates. Aggregating the patient data to the surgeon level ensured that the statistical precision of our estimates was not underestimated.19
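As a rough sketch of this aggregation step, the Python below averages patient-level fields within each surgeon so that the surgeon becomes the unit of analysis. The field names and the function `surgeon_level_means` are hypothetical; binary outcomes are assumed to be coded 0/1 so that their mean is the proportion of a surgeon's patients with that outcome.

```python
from collections import defaultdict

def surgeon_level_means(patients, fields):
    """Average patient-level fields within each surgeon.

    patients: list of dicts, each with a "surgeon" key plus numeric fields.
    fields: names of the fields to average (0/1 outcomes yield proportions).
    """
    totals = defaultdict(lambda: dict.fromkeys(fields, 0.0))
    counts = defaultdict(int)
    for p in patients:
        counts[p["surgeon"]] += 1
        for f in fields:
            totals[p["surgeon"]][f] += p[f]
    # One row per surgeon: the mean of each requested field.
    return {s: {f: totals[s][f] / counts[s] for f in fields} for s in counts}

# Hypothetical patient records for two surgeons.
patients = [
    {"surgeon": "S1", "died": 0, "age": 60},
    {"surgeon": "S1", "died": 1, "age": 70},
    {"surgeon": "S2", "died": 0, "age": 50},
]
by_surgeon = surgeon_level_means(patients, ["died", "age"])
```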
We used matching to compare clinical outcomes of NUBR- and UBR-trained surgeons. Matching was used instead of hierarchical modeling as it allowed us to directly establish that the covariate balance between the exposed group (NUBR) and the control group (UBR) was similar and, at the same time, compare outcomes of NUBR- and UBR-trained surgeons within the same hospital. Exact matching of NUBR- to UBR-trained surgeons within hospitals ensured high levels of comparability between the 2 groups by restricting the analytic sample to surgeons with different residency types but practicing in the same clinical environment. Moreover, an exact match within hospitals controlled for measured and unmeasured hospital factors that may contribute to patient-level outcomes. Surgeons who operated in multiple hospitals were assigned to the hospital in which they performed the most operations. Finally, matching separated the design stage from the outcome analysis stage, decreasing the risk of multiple testing issues.
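The rule for surgeons who operated in multiple hospitals can be sketched as follows. This is an illustrative Python example, not the study's implementation; the input layout (one `(surgeon, hospital)` tuple per operation) and the function name are assumptions. The article does not say how ties were resolved, so the tie-breaking here (the hospital seen first in the data wins) is also an assumption.

```python
from collections import Counter, defaultdict

def primary_hospital(operations):
    """Assign each surgeon to the hospital where they performed the most operations.

    operations: iterable of (surgeon_id, hospital_id) tuples, one per operation.
    Ties go to the hospital encountered first (an assumption of this sketch).
    """
    counts = defaultdict(Counter)
    for surgeon, hospital in operations:
        counts[surgeon][hospital] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}

# Hypothetical operation log.
ops = [("S1", "H1"), ("S1", "H2"), ("S1", "H2"), ("S2", "H1")]
assignment = primary_hospital(ops)
```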
We used optimal cardinality matching to pair surgeons who were most similar on the basis of more than 100 covariates. Cardinality matching returns the largest matched sample that satisfies set specifications for covariate balance.20 Surgeon-level covariates included sex, age, years of experience, and operative volume (essential, complex, and total). Patient-level covariates (demographics, comorbidities, and procedure types) were calculated at the surgeon level as the mean value for patients cared for by that surgeon or as the percentage of patients cared for by that surgeon with given characteristics. We used fine balance to nearly exactly balance years of experience, primary surgical specialty, and secondary surgical specialty.21 See eAppendix 2 in the Supplement for the complete matching algorithm.
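To convey the idea behind cardinality matching, the toy Python sketch below searches for the largest set of one-to-one, within-hospital pairs whose covariate means differ by no more than a tolerance. It is a brute-force illustration under strong simplifying assumptions (a single covariate, an absolute tolerance, and tiny inputs); the study itself used integer-programming routines in the R designmatch package with more than 100 covariates. All names and data are hypothetical.

```python
from itertools import combinations
from statistics import mean

def cardinality_match(treated, controls, tol):
    """Largest set of within-hospital pairs with balanced covariate means.

    treated, controls: dicts mapping unit id -> (hospital, covariate value).
    tol: maximum allowed absolute difference in covariate means after matching.
    Brute force over pair sets, largest first; suitable for toy sizes only.
    """
    # Exact matching on hospital: only same-hospital pairs are candidates.
    candidates = [(t, c)
                  for t, (ht, _) in treated.items()
                  for c, (hc, _) in controls.items() if ht == hc]
    for size in range(min(len(treated), len(controls)), 0, -1):
        for pairs in combinations(candidates, size):
            ts = [t for t, _ in pairs]
            cs = [c for _, c in pairs]
            # One-to-one: each surgeon may appear in at most one pair.
            if len(set(ts)) < size or len(set(cs)) < size:
                continue
            gap = abs(mean(treated[t][1] for t in ts)
                      - mean(controls[c][1] for c in cs))
            if gap <= tol:
                return list(pairs)  # first feasible set of this size is maximal
    return []

# Hypothetical surgeons: (hospital, age).
nubr = {"t1": ("A", 50), "t2": ("A", 60), "t3": ("B", 40)}
ubr = {"c1": ("A", 52), "c2": ("B", 41), "c3": ("B", 70)}
pairs = cardinality_match(nubr, ubr, tol=2)
```

The design point this illustrates is that the balance requirement is fixed in advance and the sample size is what gets optimized, rather than the reverse.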
A planned sensitivity analysis was performed to assess the magnitude of bias from unmeasured confounders that would need to be present to alter the conclusions.22,23 In the context of the sensitivity analysis, we used secondary board-certification status as a proxy for fellowship training. Secondary board certification was defined as surgeons with both general surgery and any other board certification listed in the AMA Masterfile.
To assess the quality of the matches, we computed the standardized difference for each covariate. The standardized difference is calculated for a given covariate as the mean difference between matched patients divided by the pooled SD before matching.24-26 We allowed for a maximum standardized difference of 0.10, or one-tenth of an SD, in the matched cohorts.24-27
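A minimal Python sketch of this balance check follows. The article does not give the exact formula, so the pooled SD used here, the square root of the average of the two groups' prematch variances, is an assumption based on a common convention; the function name and the example values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def standardized_difference(matched_nubr, matched_ubr, prematch_nubr, prematch_ubr):
    """Difference in matched means divided by the pooled SD before matching.

    The pooled SD is taken as sqrt((var_nubr + var_ubr) / 2), computed on the
    prematch samples (an assumed convention, not confirmed by the article).
    """
    pooled_sd = sqrt((variance(prematch_nubr) + variance(prematch_ubr)) / 2)
    return (mean(matched_nubr) - mean(matched_ubr)) / pooled_sd

def is_balanced(d, threshold=0.10):
    """Apply the 0.10 SD threshold used in the study to one covariate."""
    return abs(d) <= threshold

# Hypothetical covariate values before and after matching.
d = standardized_difference([2, 3], [2, 4], [1, 2, 3], [2, 4, 6])
balanced = is_balanced(d)
```

Using the prematch SD in the denominator keeps the scale fixed, so a smaller standardized difference after matching reflects closer means rather than a change in spread.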
Descriptive comparisons were made using unpaired, 2-tailed t test; 2-sample proportion test; and Wilcoxon rank test. We tested for significant differences in outcomes using the Wilcoxon rank test. All hypothesis tests were 2-sided. We used a significance threshold of a 2-sided P < .05. Statistical analyses were conducted using Stata/IC, version 13.1 (StataCorp LLC) and R software (R Foundation for Statistical Computing). All matching analyses were conducted using the Designmatch package28 in R, version 3.3.2.
We identified 3638 surgeons who operated on 1 237 621 patients with a qualifying inpatient or outpatient procedure between January 1, 2012, and December 31, 2013, in New York, Florida, and Pennsylvania. Of these surgeons, 885 (24.3%) trained in NUBR programs. There was no significant difference between the NUBR-trained surgeons and UBR-trained surgeons in age (mean, 53.3 years vs 53.7 years; mean difference, −0.37 years; 95% CI, −1.12 to 0.38), sex (female, 18.2% vs 16.9%; risk difference, 0.01%; 95% CI, −0.02 to 0.04), or years of clinical experience (mean, 16.5 years vs 16.5 years; mean difference, 0.05 years; 95% CI, −0.75 to 0.85). A total of 214 residency programs were represented in our sample, of which 98 (45.7%) were NUBR. Table 1 displays the geographic and size distributions of training programs.
Overall practice patterns by surgeon training type and breakdown of inpatient compared with outpatient practice by procedure category are shown in Table 2. Surgeons trained in NUBR programs performed a greater proportion of procedures in the outpatient setting (risk difference, 6.5; 95% CI, 6.4-6.7; P < .001). The contribution of each CCS category to the overall practice pattern of NUBR-trained surgeons compared with UBR-trained surgeons was significantly different for 21 of 23 CCS categories. The largest difference was seen in colonoscopy or proctoscopy with biopsy (risk difference, 11.6; 95% CI, 11.4-11.8). On average, NUBR-trained surgeons compared with UBR-trained surgeons performed more procedures per surgeon overall (median [interquartile range (IQR)], 328 [93-661] vs 164 [49-444]; P < .001), in the outpatient setting (median [IQR], 195.5 [41-482] vs 80 [16-296]; P < .001), and in the inpatient setting (median [IQR], 106.5 [38-192.5] vs 75 [22-169.5]; P < .001).
A total of 2301 surgeons and 312 584 patients were identified for analysis of inpatient outcomes. Of these surgeons, 662 (28.7%) were trained at NUBR programs, representing 95 134 patients (30.4%). A total of 209 residency programs were represented, 96 (45.9%) of which were NUBR.
Representative characteristics of NUBR- and UBR-trained surgeons and their associated hospitals and patients are displayed in Table 3. Before matching, several differences in surgeon and patient characteristics were noted, as indicated by a standardized difference greater than 0.10. These differences included number of essential operations performed per surgeon, number of patient comorbidities, proportion of elective admissions, and proportion of commercial insurance. Hospital setting and size characteristics varied between the 2 groups. Differences in distribution of operation type were also seen prior to matching (eTable 1 in the Supplement).
Before matching, all outcomes differed significantly by residency training type (Table 4). The mean proportion of patients who died was lower for NUBR-trained surgeons (risk difference, −1.01%; 95% CI, −1.41 to −0.61; P < .001). The mean proportion of patients who experienced a complication was lower for NUBR-trained surgeons (risk difference, −3.17%; 95% CI, −4.21 to −2.13; P < .001), as was the mean proportion of patients who experienced PLOS (risk difference, −1.89%; 95% CI, −2.79 to −0.98; P < .001).
We then used an optimal 1:1 match to identify 494 NUBR-trained to UBR-trained surgeon pairs in 216 hospitals. Each matched pair operated within the same hospital. The match resulted in a population of 70 404 patients operated on by NUBR-trained surgeons and 72 010 patients operated on by UBR-trained surgeons. After matching, all surgeon characteristics were similar (Table 3). Marginal distributions of primary and secondary surgical specialty were nearly identical. Distributions of patient comorbidities and operation types were also highly comparable (eTable 1 in the Supplement).
After matching, clinically equivalent outcomes were seen between surgeons trained in NUBR and UBR programs. No significant difference was observed in the mean proportion of patients who died (risk difference, 0.13%; 95% CI, −0.28 to 0.53; P = .53), had a complication (risk difference, 0.20%; 95% CI, −1.04 to 1.43; P = .76), or experienced PLOS (risk difference, 0.40%; 95% CI, −1.76 to 1.57; P = .50).
Next, we performed a sensitivity analysis to explore whether an unobserved confounder could have masked a stronger association than what we observed in the matched sample.29 We tested whether the estimated complication and PLOS outcomes were statistically significantly different from a difference in rates of 5%. For mortality, we tested whether the proportion was significantly different from a difference in proportions of 2.5%. The sensitivity analysis indicated that the test of equivalence in mortality would remain statistically significant in the presence of a confounder that increased the odds of having surgery performed by an NUBR-trained surgeon by 601%. This finding indicated that our observed null result was not likely a product of unobserved confounding. For the 2 other outcomes, the level of confounding required to mask an association was approximately one-third that of the mortality outcome. (See eAppendix 3 in the Supplement for a detailed explanation of the sensitivity analysis.)
As a proxy for fellowship status, we examined the association between the training type and additional board certification, allowing us to use sensitivity analysis to determine if confounding due to differences in additional training beyond residency would have changed our findings. This test was not significant: the proportion of NUBR-trained surgeons with additional certification was 16.9%, compared with 18.2% of UBR-trained surgeons (risk difference, 1.3%; 95% CI, −2.1 to 4.7). The difference in additional board certification would correspond to a 10% change in odds of being operated on by an NUBR-trained surgeon, a percentage that is much less than the threshold of the sensitivity analysis.
As the training system continues to adapt to the changing surgical needs of the population, additional tools are necessary to evaluate the quality of residency programs and performance of the overall graduate medical education system. Evaluating surgical training programs using patient outcomes after trainees transition to independent practice is 1 proposed method. Our study examined the association of training in UBR or NUBR programs with overall practice patterns and inpatient outcomes using matching techniques to control for patient and hospital factors.
Previous studies have shown the differences in the experiences reported by trainees in NUBR and UBR programs. Our results extend this work by examining the performance of graduates from each type of training program. We show the differences in the distribution of procedures performed by surgeons trained in either program, with NUBR-trained surgeons performing a higher volume of cases and having a greater proportion of their case mix devoted to outpatient procedures. In addition, a greater proportion of NUBR-trained surgeons’ case mix is devoted to endoscopy.
General surgery residency programs in the United States are currently facing the challenge of producing enough qualified graduates to meet a predicted national shortage of general surgeons,30 a shortage compounded by increasing subspecialization throughout the field.31,32 Due to the increasing complexity of procedures and patients as well as the aging population, the health care system needs both general surgeons and those with specialized areas of focus.33 Our results support the existing literature on the important role that surgeons trained in both NUBR and UBR programs play in our health care system. Both NUBR and UBR programs train surgeons to perform essential and complex operations, supporting the proposal for NUBR programs to serve as a repository of clinical educators skilled in the breadth of general surgery and as a source of future general surgeons.34
Prior to matching, we noted the differences in the types of inpatient procedures performed as well as the practice setting and types of patients treated by NUBR- and UBR-trained surgeons. For example, a greater proportion of NUBR-trained surgeons practiced in small hospitals, not-for-profit rural hospitals, and investor-owned hospitals. Patients treated by NUBR-trained surgeons had fewer comorbidities, and a greater proportion of these patients were white.
After controlling for the differences in practice patterns among NUBR- and UBR-trained surgeons, we found that, when operating in the same clinical environment, both types of surgeons have clinically equivalent patient outcomes across a wide range of procedures. To our knowledge, no other studies have examined the clinical outcomes achieved by NUBR- and UBR-trained surgeons after transitioning to independent practice. The outcome rates observed in this study for both groups are consistent with those reported in other studies examining clinical outcomes after transition to practice.2
By adjusting for observed confounders using matching rather than regression, we were able to better isolate the consequences of training setting. These results confirm the effectiveness of a surgical training system built within different practice settings. Based on the study findings, both pathways result in surgeons who can perform essential and complex operations on a variety of patient types while achieving similar clinical results.
Our study has some limitations, including its retrospective nature and the inherent limitations associated with an observational study. The practice pattern analysis of NUBR- and UBR-trained surgeons reflects the group of individual surgeons and does not account for possible clustering. It is possible we failed to find a statistically significant difference in outcomes due to the study sample size. However, given the small magnitude of the differences observed in the matched outcomes across all measures, the clinical significance of the study findings remains relevant. As in other observational studies using matching, the generalizability of our findings may not extend to other settings. By limiting our analysis of inpatient outcomes to paired surgeons practicing within the same hospital, we excluded surgeons practicing in hospitals with only 1 surgeon or surgeons from only 1 type of training background. This study was not intended to assess outcomes at the individual resident or program level but rather to evaluate overall success of the training system in producing surgeons who deliver high-quality care. We were only able to examine the clinical results of surgeons who completed training and transitioned to independent practice. Surgeons were excluded if their training type could not be determined. Fellowship data were not available, but our analysis of additional board certification indicated that the different rates observed between the 2 groups were not likely to mask a true difference in outcomes. Finally, our statistical methods only controlled for observed covariates. However, the overall sensitivity analysis indicated that our results were unlikely to be caused by an unobserved confounder.
Surgeons trained in NUBR and UBR programs play distinct roles in the delivery of surgical care and when operating in the same hospitals achieve similar clinical outcomes. This finding is encouraging in the face of ongoing discussions regarding the best training system and setting to produce highly competent, independent general surgeons to meet the evolving surgical needs of the population.
Accepted for Publication: September 27, 2017.
Corresponding Author: Rachel R. Kelz, MD, MSCE, Center for Surgery and Healthcare Economics, Department of Surgery, Hospital of the University of Pennsylvania, 3400 Spruce St, 4 Maloney, Philadelphia, PA 19104 (firstname.lastname@example.org).
Published Online: January 10, 2018. doi:10.1001/jamasurg.2017.5449
Author Contributions: Mr Wirtalla and Dr Keele had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Sellers, Keele, Sharoky, Bailey, Kelz.
Acquisition, analysis, or interpretation of data: Sellers, Keele, Sharoky, Wirtalla, Kelz.
Drafting of the manuscript: Sellers, Keele, Sharoky, Kelz.
Critical revision of the manuscript for important intellectual content: Keele, Sharoky, Wirtalla, Bailey, Kelz.
Statistical analysis: Keele.
Administrative, technical, or material support: Wirtalla.
Study supervision: Kelz.
Conflict of Interest Disclosures: Dr Kelz reported receiving funding from the National Institute on Aging.
Funding/Support: The data set used for this study was purchased with a grant from the Society of American Gastrointestinal and Endoscopic Surgeons. Data for Pennsylvania were provided by the Pennsylvania Health Care Cost Containment Council, an independent state agency responsible for addressing the problems of escalating health costs, ensuring the quality of health care, and increasing access to health care for all citizens. Florida data were derived from a limited data set provided by the Florida Center for Health Information and Transparency.
Role of the Funder/Sponsor: The funding source had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication. However, the Pennsylvania Health Care Cost Containment Council was given the opportunity to review the manuscript and approve the decision to submit it for publication.
Disclaimer: The content of this manuscript is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The American Medical Association (AMA) Physician Masterfile was the source of the raw physician data used in this study, but the tables and tabulations were prepared by the authors and do not reflect the work of the AMA.