Jagsi R, Weinstein DF, Shapiro J, Kitch BT, Dorer D, Weissman JS. The Accreditation Council for Graduate Medical Education's Limits on Residents' Work Hours and Patient Safety: A Study of Resident Experiences and Perceptions Before and After Hours Reductions. Arch Intern Med. 2008;168(5):493-500. doi:10.1001/archinternmed.2007.129
Copyright 2008 American Medical Association. All Rights Reserved. Applicable FARS/DFARS Restrictions Apply to Government Use.
Limiting resident work hours may improve patient safety, but unintended adverse effects are also possible. We sought to assess the impact of Accreditation Council for Graduate Medical Education resident work hour limits implemented on July 1, 2003, on resident experiences and perceptions regarding patient safety.
All trainees in 76 accredited programs at 2 teaching hospitals were surveyed in 2003 (preimplementation) and 2004 (postimplementation) regarding their work hours and patient load; perceived relation of work hours, patient load, and fatigue to patient safety; and experiences with adverse events and medical errors. Based on reported weekly duty hours, 13 programs experiencing substantial hours reductions were classified into a “reduced-hours” group. Change scores in outcome measures before and after policy implementation in the reduced-hours programs were compared with those in “other programs” to control for temporal trends, using 2-way analysis of variance with interaction.
A total of 1770 responses were obtained (response rate, 60.0%). Analysis was restricted to 1498 responses from respondents in clinical years of training. Residents in the reduced-hours group reported significant reductions in mean weekly duty hours (from 76.6 to 68.0 hours, P < .001), and the percentage working more than 80 hours per week decreased from 44.0% to 16.6% (P < .001). No significant increases in patient load while on call (patients admitted, covered, or cross covered) were observed. Between 2003 and 2004, there was a decrease in the proportion of residents in the reduced-hours programs indicating that working too many hours (63.2% vs 44.0%; P < .001) or cross covering too many patients (65.9% vs 46.9%; P = .001) contributed to mistakes in patient care. There were no significant reductions in these 2 measures in the other group, and the differences in differences were significant (P = .03 and P = .02, respectively). The proportion of residents in reduced-hours programs who reported committing at least 1 medical error within the past week remained high in both study years (32.9% in 2003 and 26.3% in 2004; P = .27).
It is possible to reduce residents' hours without increasing patient load. Doing so may reduce the extent to which fatigue affects patient safety as perceived by these frontline providers.
For most of the 20th century, physicians in training worked long hours in intensive residency and fellowship programs designed to provide hands-on education through direct patient care.1-3 In recent years, however, the public, policy makers, and researchers alike have grown increasingly concerned that this system might endanger patients by exposing them to exhausted physicians prone to committing errors.
Partly in response to these concerns, the Accreditation Council for Graduate Medical Education (ACGME) mandated limits on resident work hours applying to all specialties, effective July 1, 2003, including an 80-hour average limit on weekly duty hours and a 30-hour limit on continuous shifts.4 The effects of the implementation of this policy have yet to be fully evaluated, and concerns have been raised that reduction in continuity of care and/or increases in patient load covered by individual residents might offset any benefit from reducing resident fatigue.5 Indeed, while studies restricted to isolated settings, such as intensive care units,6 have suggested that reducing resident hours may result in improved patient safety, there continues to be inadequate information about the more generalized effects of policy implementation on patient safety in teaching hospitals.
Previous studies7,8 have suggested that residents commonly report experience with medical errors. We conducted a survey of residents before and after implementation of the new work hour limits in order to examine changes in experience with adverse events, medical errors, and causation that might be attributable to policy implementation. By using a before and after design, we were able to assess whether there was any substantial change in residents' self-reported experiences and perceptions between 2003 and 2004 in the programs that implemented a substantial policy-related reduction in residents' hours. As an added check, we performed an internal comparison to a concurrent control group—programs that did not reduce hours—to minimize the possibility of detecting changes unrelated to policy implementation.
The study population included all residents and fellows (hereafter referred to as “residents”) in the 76 ACGME-accredited training programs sponsored in both 2003 and 2004 by Massachusetts General Hospital and Brigham and Women's Hospital, Boston. There were 1440 potential respondents in 2003 and 1510 in 2004.
Details of survey design and pretesting have been reported previously.9 The questionnaire queried residents' work hours and patient load, the perceived effect on patient safety, and direct reports of adverse events and mistakes. Questions on work hours and patient load were modeled on previous instruments.10-12 Residents were asked to estimate their total duty hours in the hospital, longest continuous shift, and hours of sleep in the past week. Residents who had been on call in the hospital in the past 4 weeks were asked to describe their patient load on their most recent on-call night.
Two series of questions addressed residents' perceptions of the impact of residents' fatigue, work hours, and patient load on patient safety. Residents were asked, “On the clinical services that are part of your program, to what extent do the following contribute to mistakes in patient care? Residents working too many hours? Poor handoffs by residents? Carrying or admitting too many patients? Inadequate supervision? Cross covering too many patients?” The questionnaire also solicited residents' perceptions regarding how often fatigue had a negative impact on the safety of patients they cared for and on the quality of care they provided.
To assess experiences relevant to patient safety, we asked residents to report on their own direct experiences with adverse events and medical errors (mistakes leading to either near misses or adverse events).13,14 To minimize recall bias, we asked residents to report on the number of incidents during the past week of clinical practice, corresponding to the same period of “exposure” as the work hours questions. Reports of medical errors (mistakes leading to an adverse event and near-miss mistakes) in the past week were limited to mistakes for which the reporting resident “felt at least partly responsible.” Further details, including the definitions accompanying this battery of questions, have been reported elsewhere.15
Demographic information, such as sex and race, specialty, postgraduate year, and primary activity during the training year (clinical training vs research), was also obtained. The remainder of the instrument focused on educational end points, which have been reported elsewhere.9
The survey was administered in May and June of 2003 and in May and June of 2004 to assess the effects of the intervening policy change without the potential confounding effects of seasonal variation.
A confidential, voluntary, paper survey was distributed to potential subjects at scheduled gatherings, such as teaching conferences. The survey was later circulated by e-mail to potential participants. We provided incentives for completing the survey, including drawings for cash and other prizes, as have been used in previous resident surveys to good effect.16,17
Data from both surveys were entered by a professional data coding service. Statistical analysis was performed using R computer software.18
We hypothesized that any impact of policy implementation would be evident in the subset of programs that implemented substantial hours reductions in the 2003-2004 academic year but not in the remaining programs. Therefore, we compared the change in responses of residents in the programs that experienced a significant reduction in duty hours between the 2 years (the “reduced-hours” group) with those in programs that did not reduce hours substantially, because of either prior compliance (or near compliance) or failure to implement the standards (the “other” group). This design was intended to minimize the likelihood of detecting effects from non–policy-related temporal trends that would have affected all programs.
Programs were classified as reduced-hours if mean weekly duty hours in 2003 exceeded 65 hours and decreased by at least 5 hours between 2003 and 2004. This definition was selected because prior analyses demonstrated that this definition results in program groupings that are sensitive to potential policy-related differences.9
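The classification rule above can be illustrated with a minimal sketch. The function name and the three program entries are hypothetical and serve only to show how the two conditions (a 2003 mean above 65 hours and a decrease of at least 5 hours) combine; they are not data from the study.

```python
def classify_program(mean_hours_2003: float, mean_hours_2004: float) -> str:
    """Apply the stated rule: 2003 mean above 65 h and a drop of at least 5 h."""
    exceeded_threshold = mean_hours_2003 > 65
    reduced_enough = (mean_hours_2003 - mean_hours_2004) >= 5
    return "reduced-hours" if exceeded_threshold and reduced_enough else "other"


# Hypothetical program-level mean weekly duty hours (2003, 2004), illustration only.
programs = {
    "program_a": (76.6, 68.0),  # high baseline and a large drop
    "program_b": (70.0, 67.5),  # high baseline, but the drop is under 5 h
    "program_c": (58.0, 52.0),  # large drop, but baseline is not above 65 h
}

groups = {name: classify_program(*means) for name, means in programs.items()}
print(groups)
```

Both conditions must hold, so a program that was already near compliance (program_c) and a program that barely changed (program_b) both fall into the other group.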
Analyses were performed using 2-way analysis of variance with interaction. The unit of analysis was the individual survey response. Separate analyses of variance were constructed for each dependent variable, corresponding to individual questions. The impact of the policy change was indicated by the significance of the interaction between year and program group. The interaction term represents a difference in differences (ie, the change from 2003 to 2004 in the reduced-hours programs vs the change from 2003 to 2004 in the other programs). Confidence intervals for differences were determined by the method presented by Fleiss.19
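As a worked illustration of the difference-in-differences quantity, the sketch below computes the point estimate from four group percentages. It uses the percentages reported in the results for respondents saying that fatigue frequently or always affected the quality of their care. The function name is an assumption, and the sketch reproduces only the arithmetic of the contrast, not the analysis of variance or its P values.

```python
def difference_in_differences(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the comparison group."""
    treated_change = treated_post - treated_pre
    control_change = control_post - control_pre
    return treated_change - control_change


# Percentages reporting that fatigue frequently or always affected quality of care.
did_quality = difference_in_differences(
    treated_pre=14.6, treated_post=9.2,   # reduced-hours programs, 2003 and 2004
    control_pre=6.5, control_post=6.1,    # other programs, 2003 and 2004
)
print(f"{did_quality:.1f} percentage points")  # -5.0 percentage points
```

A negative value means the proportion fell further in the reduced-hours programs than in the other programs over the same period.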
A total of 1770 responses were obtained (821 in 2003 and 949 in 2004; response rate, 60.0% of 2950). For trainees within the medical specialties, the response rate was 55%; within surgical specialties, 67%; and within hospital-based specialties, 63%. The response rate was 58% for trainees in programs in the reduced-hours group and 61% in the other group. Of the responses, 72% were obtained at in-person survey administrations. Analysis was restricted to 689 respondents in 2003 and 809 respondents in 2004 who reported being in a primarily clinical year of training.
Demographic characteristics of the study population are reported in Table 1. Of the 76 programs, 13 (representing 390 clinical responses: 195 in 2003 and 195 in 2004) were classified into the reduced-hours group for analysis. This group included primary care and predominantly consultative specialties as well as procedure-based and nonprocedural specialties. Within the reduced-hours group, 256 (65.6%) of the clinical responses were from medical specialties, 131 (33.6%) were from surgical specialties, and 3 (0.8%) were from hospital-based specialties. The other group was constituted by the 488 clinical respondents in 2003 and the 607 in 2004 who reported training in a program other than the 13 programs classified as having made substantial reductions in hours (this excludes 6 clinical respondents from 2003 and 7 clinical respondents from 2004 who provided inadequate information regarding their specialty to be placed in either the reduced-hours or other group). Within the other group, 476 (43.5%) were from medical specialties, 181 (16.5%) were from surgical specialties, and 437 (39.9%) were from hospital-based specialties (percentages do not total 100 because of rounding).
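The response accounting reported above can be checked with a short sketch. All counts are taken from the text; the variable names are illustrative.

```python
# Counts as reported in the text, keyed by survey year.
eligible = {2003: 1440, 2004: 1510}       # potential respondents
responses = {2003: 821, 2004: 949}        # all responses received
clinical = {2003: 689, 2004: 809}         # responses from clinical years
reduced_hours = {2003: 195, 2004: 195}    # clinical responses, reduced-hours group
unclassified = {2003: 6, 2004: 7}         # specialty information inadequate

response_rate = sum(responses.values()) / sum(eligible.values())

# The other group is what remains after removing reduced-hours and unclassified responses.
other = {year: clinical[year] - reduced_hours[year] - unclassified[year] for year in clinical}

print(f"Response rate: {response_rate:.1%}")
print(f"Other group by year: {other}")
```

The derived other-group counts (488 and 607) match the figures stated in the paragraph.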
As shown in Table 2, mean duty hours and percentage of residents working in excess of the 80-hour limit decreased in the reduced-hours group from 2003 to 2004, and these reductions were significantly greater than the much smaller reductions observed in the other group (P < .01), as expected based on the definitions used to divide respondents into these groups. Reductions in the percentage working 30-hour shifts or longer in the previous week and in the duration of the longest continuous shift were also greater in the group that reduced hours than in the other group, with these differences in differences approaching significance (P = .06 and P = .07, respectively).
Time reported spent in direct patient care decreased in the reduced-hours group, and this reduction was significant when compared with the relative stability of hours spent in direct patient care reported by the other group. No significant increases in patient load (patients admitted, covered, or cross covered) were observed in the reduced-hours programs. The mean number of patients cross covered on the last call night reported by residents in the reduced-hours group increased only slightly from 2003 to 2004, but this difference was nearly significant when compared with the decrease in patients cross covered in the other group.
Table 3 presents the proportions of residents who believed that various potential factors contributed to mistakes in patient care on the clinical services constituting their programs. From 2003 to 2004, a significantly lower proportion of residents in the reduced-hours programs believed that various work situations contributed to mistakes in patient care “to some extent” or “to a great extent,” including “working too many hours” (P < .001), “carrying or admitting too many patients” (P = .001), “inadequate supervision” (P = .03), and “cross covering too many patients” (P = .001). Although some reductions occurred in the other programs, none of the differences were significant. Furthermore, the 2003-2004 difference was greater for the residents in the reduced-hours programs compared with the difference for residents in the other programs for 2 factors: working too many hours and cross covering too many patients.
Many respondents in the reduced-hours programs and other programs believed that poor handoffs contributed to mistakes in each year, but there were no significant differences between 2003 and 2004.
On a separate battery of questions asking respondents to assess the impact of fatigue, from 2003 to 2004, a lower percentage of respondents from reduced-hours programs reported that fatigue frequently or always affected the quality of care they provided (14.6% vs 9.2%), and this difference was significant (P = .004) when compared with the relative stability in the proportion of respondents in other programs (6.5% vs 6.1%) reporting this. Similarly, from 2003 to 2004, a lower percentage of respondents from reduced-hours programs reported that fatigue frequently or always impacted the safety of patients they cared for (7.0% vs 2.9%), again with a significant difference in differences (P = .03) when compared with the stable proportion of respondents in other programs reporting this (3.7% vs 3.7%).
As shown in Table 4, the percentage of respondents in reduced-hours programs who reported having experienced at least 1 adverse event in the past week was similar in 2003 and 2004 (P = .92). This change did not differ significantly (P = .75) from the similarly minimal change from 2003 to 2004 in the other programs (P = .65).
The percentage of respondents in reduced-hours programs reporting having committed at least 1 medical error was 32.9% in 2003 and 26.3% in 2004 (P = .27). The proportion reporting an adverse event in the past week caused by a mistake for which they felt at least partly responsible was 8.7% in 2003 and 5.0% in 2004 (P = .24), and the proportion reporting experiencing a near-miss event in the past week caused by a mistake for which they felt at least partly responsible was 26.7% in 2003 and 20.1% in 2004 (P = .20). In the other programs, the proportions reporting medical errors were nearly identical between the 2 years, as detailed in Table 4. None of the differences in differences achieved significance.
This study was inadequately powered to detect small reductions in the proportions reporting errors at the P < .05 significance level. With the sample sizes in this study, the power to detect a true decrease from 32.9% to 26.3% at that level was only 19%. For this study to have had 80% power to detect a true decrease from 32.9% to 26.3%, the reduced-hours group would have needed 800 individuals in each year.
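To give a sense of where a per-group requirement of this size comes from, the sketch below applies a standard normal-approximation formula for comparing 2 independent proportions, with an optional continuity correction. The choice of formula is an assumption: the text does not state how the 800 figure was derived, so the result should be read as the same order of magnitude rather than a reproduction.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80, continuity=True):
    """Approximate sample size per group for a two-sided test of 2 proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    delta = abs(p1 - p2)
    # Uncorrected normal-approximation sample size.
    n = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2 / delta ** 2
    if continuity:
        # Continuity correction inflates n; assumed here, not confirmed by the source.
        n = n / 4 * (1 + sqrt(1 + 4 / (n * delta))) ** 2
    return ceil(n)


# Proportions reporting at least 1 medical error: 32.9% (2003) and 26.3% (2004).
n_needed = n_per_group(0.329, 0.263)
n_uncorrected = n_per_group(0.329, 0.263, continuity=False)
print(n_needed, n_uncorrected)
```

Under these assumptions both variants call for several hundred respondents per year, far more than the 195 clinical responses per year available in the reduced-hours group.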
Improving patient safety was one primary motivation for the recent ACGME policy limiting resident work hours in all medical specialties. Regulation of resident work hours continues to provoke debate, and analysis of the impact of the ACGME policy is important to guide future policy.
Numerous studies have attempted to assess the impact of resident work-hour reductions on patient outcomes,20-24 but the net effect remains unclear.25 One landmark prospective study6 found a significantly higher rate of serious medical errors when interns worked traditional shifts of 24 hours or longer than when they worked shorter shifts. Still, the resource-intensive nature of this sort of study, which used direct observation to assess outcomes, necessitated a relatively narrow focus on 2 intensive care units. Critics have noted that to accommodate the reduced-hour shift schedules that were found superior, more residents would have to be assigned to cover an intensive care unit than before the intervention, potentially leaving other services more thinly covered.26 This raises the possibility that interventions reducing hours may improve patient safety within isolated settings but may also result in negative effects on patient safety elsewhere in the same hospital.
Retrospective studies of patient outcomes before and after the implementation of the ACGME policy also have been illuminating. One recent study27 documented improvement in several outcomes among patients discharged from resident-staffed internal medicine services after July 2003, relative to changes observed on nonteaching services at the same institution. Another study28 of inpatient mortality at hospitals nationwide found lower rates after July 2003 for medical patients in teaching hospitals, relative to nonteaching hospitals. Together, these studies are suggestive of a positive impact of the work-hours reductions, but further study is necessary, especially to illuminate whether the changes observed were truly mediated by a reduction in resident fatigue and whether they persist across the full range of specialties.
Physician self-report is a useful tool for evaluating the incidence and causes of adverse events and medical errors.7,15,29-31 Survey studies may provide complementary information to medical record review and observational studies, particularly regarding causation.15 Other survey studies32-39 performed since the implementation of the ACGME policy have tended to focus on single specialties and retrospective assessment of perceptions of change, rather than the prospective assessment accomplished herein. This study examined multiple specialties and is further strengthened through the identification of a concurrent control group of training programs, located in the same institutions but not affected by the policy, because of either prior compliance or failure to implement the standards. This design increases the likelihood that the differences detected are truly policy related, rather than because of secular temporal trends that affected all programs, such as increasing financial constraints, reduced lengths of hospital stays, and other broad changes affecting the quality of care in teaching hospitals. Nevertheless, it remains possible that the changes in outcomes observed from 2003 to 2004 were actually caused by other safety-related changes occurring in those programs that implemented a substantial reduction in hours and were not directly related to the work-hours reductions themselves.
In this study, we found that a substantial reduction in the working hours of residents and in the percentage of residents working excessive hours was accomplished between 2003 and 2004 within the reduced-hours group of programs. Moreover, we found that this reduction was accomplished with little difference in the reported patient load or volume of call-night activities. The institutions studied herein were cognizant of concerns that reducing resident work hours by intensifying their workloads might have unintended adverse effects on patient safety. Therefore, they, like several other institutions nationwide,40 not only redesigned schedules but also provided substantial funding to support the hiring of additional residents, attending physicians, and physician extenders. The findings of this study suggest that this strategy can be successful in accomplishing a reduction in residents' hours without a concomitant increase in their patient load.
A noteworthy finding of this study is the significant and substantial reduction in the proportion of respondents in reduced-hours programs who believed that working too many hours contributed to mistakes in patient care on the clinical services of their program. This reduction was not documented simply by asking residents after the policy was implemented whether they believed the policy had made an impact (which would be particularly sensitive to biased self-report). Residents were blinded as to the study hypothesis and to the grouping of programs. Thus, the reduction was documented by comparing the proportion feeling that working too many hours contributed to mistakes in patient care before the implementation of hours reductions to the proportion feeling this way after hours were reduced. The fact that the difference was largely isolated to the programs in which substantial reductions were made (rather than a generalized perception among residents from all programs) suggests that this was a true policy-related change.
In addition, we found significant reductions in the proportion of respondents in reduced-hours programs who believed that other potential problems—inadequate supervision, having too many patients, or cross coverage of too many patients—contributed to mistakes in the year after hours were reduced. The difference-in-differences analysis suggests that the effects on the items regarding work hours and cross coverage were not generalized temporal trends in the hospital but rather changes that were more pronounced in the programs that reduced hours. As previously noted, based on previous research and conjecture, one might have expected the programs that reduced hours to experience greater problems in patient safety after the reductions were implemented, because of increased patient loads and cross coverage.41,42 Thus, the residents in these programs may have expected to experience an increase in patient load (and particularly cross coverage) that was avoided by hiring additional personnel. When patient loads actually remained stable after policy implementation, the residents may have become more sanguine in their assessment of the degree to which cross coverage was a factor contributing to mistakes in patient care (an expectation effect). Alternatively, the finding of generalized improvements on several items in the battery of questions assessing factors contributing to mistakes in patient care might suggest that those programs that reduced hours were also engaged in other reassessments of the safety of patients under their care precipitated by the work-hour reforms or occurring for other reasons, in a way that led to improved patient safety culture and practices in those programs.
Also noteworthy is the fact that many residents in both reduced-hours and other programs reported that poor handoffs contributed to mistakes in patient care in each year. The finding that most respondents in all programs believed that poor handoffs contribute to mistakes in patient care supports further investigation of mechanisms by which to improve handoffs in teaching hospitals. This is particularly important because many strategies for reducing residents' hours increase the number of such handoffs occurring between physicians.43
Finally, the study demonstrates that many residents, more than a quarter, report committing at least 1 error in patient care within a week. This is consistent with other studies,7 including a recent survey study8 that found that 14.7% of residents surveyed reported making a major medical error in the previous 3 months. The higher rate of errors observed in our study is likely related to the fact that our survey asked about all errors (rather than simply major errors) and provided detailed descriptions within the questionnaire. Given the extensive mortality, morbidity, and expense that result from medical errors, the finding that errors are commonly made by residents supports further investigation of systems-level interventions targeted at reducing resident errors. Given the data gathered herein and in the other studies previously discussed, policies to reduce residents' hours may be useful toward this end. Unfortunately, this study lacked adequate power to detect a policy-related decrease in errors, despite its interesting findings about perceptions of error causation, and this is its primary limitation.
Other limitations include the fact that multiple tests were performed, such that P = .05 may not be the appropriate level for determining significance. Still, most of the interesting differences observed were significant even at a more conservative cutoff. Also, while others have validated residents' self-report of work hours10 and residents' self-identification of adverse events, responses to the specific questions used in this survey were not validated with direct observation or medical record review. The limitations of such validation techniques have been discussed elsewhere, however,15 and the questions used in this survey were accompanied by definitions that were validated through intensive cognitive pretesting. Even so, self-report clearly requires some degree of insight on the part of the physician, and so while self-report may detect certain errors (especially near-miss errors) that medical record review might miss, self-report may also fail to detect other errors that would be apparent to an impartial observer.
In addition, we should address the potential concern that individuals who responded to the survey in both years might have been able (consciously) to manipulate the results by skewing their responses. This seems extremely unlikely. First, only 355 individuals responded in both years. Second, given the many items and response categories on the questionnaire, it seems unlikely that these individuals would have been able to remember their responses from the previous year in sufficient detail to allow conscious skewing of their subsequent responses, even if they were so inclined. Nevertheless, because some residents participated in both years, the samples from the 2 years are not strictly independent. If an individual is more likely to have consistent responses on successive surveys, the most likely effect of our having used statistical tests that treat the 2 samples as independent would be to bias the study against finding significant differences between the years.
Finally, the generalizability of this study may be limited by its setting in 2 large tertiary care university-affiliated hospitals. These hospitals invested considerable resources ($3-$4 million in annual support) toward implementing the duty hours requirements. Much research on regulations of resident work hours has originated from the institutions in this study, and program directors were likely keenly aware of potential unintended adverse effects on continuity of care and resident supervision, so they may have been particularly well situated to minimize these potential negative effects. Therefore, the findings of this study may not be generalizable to teaching hospitals with more limited resources. Future research should assess the impact of work-hour regulations in smaller community-based hospitals, to explore the ways in which policy implementation has affected patient safety in those settings.
In conclusion, we believe that this study provides important information for assessing the impact of limiting resident work hours on patient safety across the full range of medical specialties at 2 large teaching hospitals. The study uses self-report from frontline providers, and the before-after design serves to minimize the impact of potential biases associated with data obtained at only 1 (postpolicy) time point. The results indicate that reductions in resident work hours may be accomplished with beneficial effects on patient safety as perceived by these frontline providers and may even be associated with more generalized improvements that result from not only reductions in resident fatigue but also greater attention to safety within the patient care delivery model.
Correspondence: Reshma Jagsi, MD, DPhil, Department of Radiation Oncology, University of Michigan, UHB2C490, SPC 5010, 1500 E Medical Center Dr, Ann Arbor, MI 48109-5010 (email@example.com).
Accepted for Publication: September 24, 2007.
Author Contributions: Drs Jagsi and Weissman had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Study concept and design: Jagsi, Weinstein, Shapiro, Kitch, and Weissman. Acquisition of data: Jagsi, Weinstein, Shapiro, and Weissman. Analysis and interpretation of data: Jagsi, Weinstein, Kitch, Dorer, and Weissman. Drafting of the manuscript: Jagsi, Shapiro, Dorer, and Weissman. Critical revision of the manuscript for important intellectual content: Jagsi, Weinstein, Kitch, Dorer, and Weissman. Statistical analysis: Dorer. Obtained funding: Weinstein and Weissman. Administrative, technical, and material support: Weinstein, Shapiro, and Weissman.
Financial Disclosure: None reported.
Funding/Support: This study was supported by an anonymous donor (Partners HealthCare System), and by the Leape Foundation.
Role of the Sponsor: The funding bodies had no role in the design and conduct of the study; in the collection, analysis, and interpretation of the data; or in the preparation, review, or approval of the manuscript.
Additional Information: The survey questionnaire used in this study is available on request from Dr Jagsi.
Additional Contributions: Laura Schroeder and Georgi Bland assisted in data collection; Sage, Inc, assisted in data entry and statistical programming; and Sowmya R. Rao assisted in statistical analysis.