Figure 1. Minimally Invasive Surgical Trainer Virtual Reality (Mentice AB, Gothenburg, Sweden) screen shots for each of the 6 tasks performed by study 1 (novice) subjects. Study 2 (expert) subjects received training for task 6 (manipulate and diathermy [Manip diathermy]), which is the most complex and difficult of these, and were similarly assessed using this task.
Figure 2. The mean (SD) performance scores for the novice laparoscopists (study 1) who had consumed alcohol or who had not (controls) for the time taken (A), error scores (B), and economy of diathermy (mean burn time divided by optimal burn time) (C) at baseline assessment and the day after when they were assessed at 9 AM, 1 PM, and 4 PM.
Figure 3. The mean (SD) performance scores for time taken (A), error scores (B), and economy of diathermy (mean burn time divided by optimal burn time) (C) by the expert laparoscopic surgeons (study 2) at baseline assessment and at 9 AM, 1 PM, and 4 PM the day after they had consumed alcohol to excess.
Gallagher AG, Boyle E, Toner P, et al. Persistent Next-Day Effects of Excessive Alcohol Consumption on Laparoscopic Surgical Performance. Arch Surg. 2011;146(4):419–426. doi:10.1001/archsurg.2011.67
Objective
To examine the effect of previous-day excessive alcohol consumption on laparoscopic surgical performance.
Design
Study 1 was a randomized controlled trial. Study 2 was a cohort study.
Setting
Surgical skills laboratory.
Participants
Sixteen science students (laparoscopic novices) participated in study 1. Eight laparoscopic experts participated in study 2.
Interventions
All participants were trained on the Minimally Invasive Surgical Trainer Virtual Reality (MIST-VR). The participants in study 1 were randomized to either abstain from alcohol or consume alcohol until intoxicated. All study 2 subjects freely consumed alcohol until intoxicated. Subjects were assessed the following day at 9 AM, 1 PM, and 4 PM on MIST-VR tasks.
Main Outcome Measures
Assessment measures included time, economy of diathermy use, and error scores.
Results
In study 1, both groups performed similarly at baseline, but the alcohol group showed deterioration on all performance measures after alcohol consumption. Overall, although the time score differences between the 2 groups were not statistically significant (P = .29), there was a significant difference between the 2 groups' diathermy (P < .03) and error (P < .003) scores. There was also a significant effect for time of testing (P < .003), diathermy (P < .001), and errors (P < .001). In study 2, experts demonstrated a similar postalcohol performance decrement for time (P < .02), diathermy (P < .001), and error scores (P < .001).
Conclusion
Excessive consumption of alcohol appeared to degrade surgical performance the following day even at 4 PM, suggesting the need to define recommendations regarding alcohol consumption the night before assuming clinical surgical responsibilities.
In high-skills disciplines, such as aviation, the consumption of alcohol is regulated routinely to achieve a desired level of safety at the time the skill is exercised. While there is zero tolerance for obvious alcohol-induced impairment in the workplace, it is known that individuals can underestimate the amount of alcohol they have consumed and the impairment in performance that alcohol may cause.1,2 This tendency to underestimate may be especially true of residual impairment after presumed recovery from the acute effects of alcohol intake.
While surgical performance is certain to be impaired acutely with excessive alcohol consumption, there is little information that defines the persistence of this effect. This may be even more relevant to real-world safety concerns than the acutely impaired surgeon, since societal norms forbid drinking in the workplace but are permissive of alcohol consumption during evenings prior to a workday. Dorafshar et al3 reported that surgical performance was impaired immediately after moderate alcohol consumption but that this impairment was not observed the morning after. Another study examined the effect of a “night out on the town” on surgical performance and found acute performance impairment but also partial persistence of impairment the following morning.4 However, the effect was attenuated and there was no further testing carried out later in the day.
The potential for both early and late alcohol-related performance problems to emerge during laparoscopic surgery is of particular concern given the intense demands it makes on cognitive, perceptual, and visuospatial abilities and the known vulnerability of these human factors to the effects of alcohol. The aim of the 2 studies reported herein was to quantify in a naturalistic, true-life setting the magnitude of alcohol-related impairment of laparoscopic surgical performance over the course of a day following excessive alcohol consumption the previous evening.
Sixteen male final-year students at Queen's University, Belfast, Northern Ireland, with no previous experience with either clinical surgery or the Minimally Invasive Surgical Trainer Virtual Reality (MIST-VR) simulator (Mentice AB, Gothenburg, Sweden) participated in study 1. Institutional ethical approval was obtained for the study, after which subjects were randomized into an alcohol group (n = 8) and a control group (n = 8).
Six experienced laparoscopic surgeons (>250 laparoscopic procedures performed) and 2 expert MIST-VR users at the Yale University School of Medicine, New Haven, Connecticut, participated in study 2.
The MIST-VR system has been previously described by Wilson et al.5 It offers 6 tasks of incrementally greater complexity (Figure 1) with 3 difficulty levels. Each one is based on a key surgical technique used in laparoscopic cholecystectomy. Performance metrics are generated for each task.
All subjects underwent preassessment training. In study 1, novices completed all 6 MIST-VR tasks on the “medium” difficulty setting, 3 times for each hand on 3 separate occasions. This schedule has previously been shown to be sufficient for novice learning curves to plateau,6 so it is reasonable to assume that proficiency had been attained after training. The expert subjects in study 2 completed 10 trials on the MIST-VR “manipulate and diathermy” task (task 6) set at a custom “difficult” configuration previously developed to establish proficiency criteria in a randomized controlled trial.7
All subjects in both studies completed a baseline trial on the MIST-VR before consuming any alcohol. Subjects in study 1 who were randomized to the alcohol consumption group and the experts in study 2 were then scheduled for a supervised alcohol exposure event in the evening. All subjects ate a restaurant dinner during the early phase of alcohol consumption. None of the subjects were required to drink a set amount, but all subjects were instructed to drink alcohol freely until they felt intoxicated. This was done to imitate as closely as possible real-life conditions. One or more of the investigators were present for all events. (One unit of alcohol was assigned the equivalency of a shot of whiskey, a half pint of beer, or a glass of wine.) Control subjects in study 1 had the dinner component of the “night out” but did not consume any alcohol.
All subjects in both studies were transported to their home before midnight. Surgeons who participated in the study were not on duty that night or the following day. All subjects reported an uninterrupted night's sleep.
Subjects were transported to the study laboratory at 8 AM the following morning. For study 2, a breathalyzer was used to measure blood alcohol levels immediately prior to assessment at 9 AM. Novice subjects were tested on 1 complete trial of the 6 MIST-VR tasks performed during the training phase and expert subjects, on 2 trials of the task performed during training. In both studies, performance results are reported as the mean of the trials performed. All subjects were tested at 9 AM, 1 PM, and 4 PM.
Primary outcome measures were the data recorded by the MIST-VR during all assessments and consisted of time to task completion (seconds), mean errors (number of errors committed per task segment, where an error is defined as a movement away from the target, scored for each hand separately), and efficiency of diathermy use (mean burn time divided by optimal burn time).
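The economy of diathermy ratio defined above can be illustrated with a minimal sketch. The function name and the burn-time values are hypothetical and chosen only for illustration; they are not study data, and the MIST-VR computes this metric internally.

```python
from statistics import mean


def economy_of_diathermy(burn_times_s, optimal_burn_time_s):
    """Return mean burn time divided by optimal burn time.

    A value of 1.0 means the mean burn time equals the optimal burn time;
    values above 1.0 mean more burn time was used than optimal.
    """
    if optimal_burn_time_s <= 0:
        raise ValueError("optimal burn time must be positive")
    return mean(burn_times_s) / optimal_burn_time_s


# Hypothetical burn times in seconds (illustration only, not study data).
ratio = economy_of_diathermy([2.0, 3.0, 4.0], optimal_burn_time_s=2.0)
print(ratio)  # 1.5
```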
Data from both studies were analyzed with SPSS version 15.0.8 In study 1, comparisons between the alcohol consumption and control groups' performance at baseline and different times of the test day were made with 2-factor analysis of variance (ANOVA) for repeated measures. In study 2, comparisons of simulation performance data between the baseline and postalcohol consumption test day assessments were conducted with 1-factor ANOVA for repeated measures. In both studies, within- and between-subject comparisons were made by Scheffé F tests and statistical significance was set at P < .05.
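As an illustration of the 1-factor ANOVA for repeated measures used in study 2, the following sketch computes the F statistic by partitioning the total sum of squares into condition, subject, and error components. It is not the SPSS procedure used in the study and omits the P value and the Scheffé post hoc tests; the function name and the scores are hypothetical.

```python
def repeated_measures_f(scores):
    """One-factor repeated-measures ANOVA F statistic.

    scores: one row per subject, one column per condition (testing time).
    Returns (F, df_condition, df_error).
    """
    n = len(scores)      # number of subjects
    k = len(scores[0])   # number of conditions
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with >= 2 subjects and >= 2 conditions")

    grand = sum(sum(row) for row in scores) / (n * k)
    cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subj_means = [sum(row) / k for row in scores]

    # Partition the total variability.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_error = ss_total - ss_cond - ss_subj  # condition-by-subject residual

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    f_stat = (ss_cond / df_cond) / (ss_error / df_error)
    return f_stat, df_cond, df_error


# Hypothetical scores: 3 subjects x 3 testing times (not study data).
example = [
    [1, 2, 3],
    [2, 4, 3],
    [3, 3, 6],
]
print(repeated_measures_f(example))  # (3.0, 2, 4)
```

Removing the between-subject sum of squares from the error term is what distinguishes the repeated-measures design from an independent-groups ANOVA: each subject serves as his or her own control across testing times.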
In study 1, there were no significant differences between the groups in mean age, weight, or other demographics (Table 1). Two subjects in each group wore glasses. All subjects consumed alcohol socially on a regular basis. The mean amount of alcohol ingested was 16.5 units (SD = 5.21 units; range, 10-26 units) per subject.
In study 2, all subjects consumed alcohol on an intermittent social basis but not every day. The quantity of alcohol ingested during the study was not recorded but subjects were observed by 1 or more of the investigators who confirmed signs of intoxication. The following morning, 7 experts had alcohol levels undetectable by the breathalyzer, but 1 subject had a level in excess of the legal limit for driving (1.0 mg/mL at the time of the study).
The control group showed no change in their performance from baseline assessment through the 3 test sessions (Table 2). There were no statistically significant differences between the alcohol consumption and control groups for baseline performance on all 3 measures (ie, time [Figure 2A], errors [Figure 2B], and economy of diathermy [Figure 2C]). However, there were large differences observed between the groups on the test day. The alcohol consumption group performed worse on all 3 measures and showed considerably more performance variability. Post hoc comparisons between the 2 groups' performance at the different testing times were made with Scheffé F tests (Table 3). Although large differences between the performances of the 2 groups were observed for time taken to perform the tasks, only the difference at 9 AM was found to be statistically significant. This can be accounted for by the large performance variability shown by the alcohol group.
The results of the 2-factor ANOVA are shown in Table 4. Overall, the factor 1 (group) differences were not statistically significant for time taken (P = .29) but were significant for economy of diathermy (P = .03) and errors (P = .003). The ANOVA results for factor 2 (time of testing) showed highly significant differences for time taken (P = .003), economy of diathermy (P < .001), and error scores (P = .001).
Differences between the experts' performances during the different assessment times were compared for significance with ANOVA for repeated measures and within-subject comparisons were conducted with Scheffé F tests (Table 5). At 9 AM on the day after consumption of excess alcohol, the experts completed the task faster than they did during the baseline period (Figure 3). However, by 1 PM, they were performing significantly worse than at 9 AM (F = 4.1; P < .01). Their performance had returned to baseline levels by 4 PM.
Figure 3C shows that the economy of diathermy scores of the experts deteriorated as the day progressed, with significant differences between the different testing times (F = 14.62; P < .001) (Table 6). In comparison with baseline, the experts were less efficient in their use of diathermy at 9 AM (F = 3.1; P < .05), 1 PM (F = 4.33; P < .05), and 4 PM (F = 14.52; P < .01). For error scores, there were significant differences between testing times (F = 11.2; P < .001). In comparison with baseline assessment, the experts made more errors at 9 AM, 1 PM, and 4 PM, but only the difference at 1 PM was statistically significant (9 AM: F = 0.44; P = .32; 1 PM: F = 9.82; P < .001; and 4 PM: F = 2.93; P = .06).
The acute effects of alcohol intoxication8 are well recognized and have been extensively researched.9,10 However, far less research has been conducted on delayed postintoxication effects.11 The hangover has substantial negative impact in terms of morbidity and societal costs, with more than $148 billion lost annually in the United States because of absenteeism and poor job performance.11 In the United Kingdom, alcohol use has accounted for more than £2 billion in lost wages every year, principally because of hangover-related absenteeism.12 There is no consensus definition of hangover and most studies identify various constellations of symptoms, including headache, diarrhea, anorexia, fatigue, and nausea.13 Although the overall experience is subjective, objective criteria include hemodynamic and hormonal alterations and impairment in cognitive, perceptual, and visuospatial performance.8
Laparoscopic or minimally invasive surgery poses considerable cognitive, perceptual, and psychomotor challenges for the operating surgeon.14 In this environment, even the most experienced surgeon is forced to work at the very limits of their abilities.15,16 It has been proposed that these difficulties contribute to the higher operative complication rate observed early in the surgeon's minimally invasive surgery experience.17,18
Surgeon behavioral factors may well compound vulnerability to poor performance and surgical errors, and alcohol use may be a prevalent factor in this regard. Historically, the medical profession has had a reputation for high rates of alcohol consumption. In 1 study, 42% of surveyed health care workers reported having had a hangover in the workplace.19 It has been estimated that 1 in 15 physicians in the United Kingdom has some form of substance dependence,20 with stress often identified as an inciting factor.21,22 Although some studies have identified higher rates of alcohol consumption in physicians when compared with age-matched controls,22-24 with surgery cited as a specialty with particularly high rates,25 other studies have found similar rates to what is found in the general population.26-28 However, it is accepted that the effects of alcohol consumption have potentially more serious implications in physicians given the responsibilities and nature of their position.26-28 Alcohol use and fatigue have been contrasted in investigations of surgically relevant performance. A blood alcohol concentration of 0.1% has been shown to cause cognitive psychomotor performance impairment similar to that seen after 24 hours of wakefulness.29 Postcall impairment in residents has been reported to be similar to that seen with a blood alcohol concentration of 0.04% to 0.05%.30
In previous work that has specifically examined the effects of alcohol on laparoscopic performance, standard doses of alcohol and food were administered in a controlled setting30 with subjects blinded as to whether they were assigned to alcohol or placebo groups.3 While tight control of these factors may prevent some confounding effects on data, we felt it was vital to conduct our study in as ecologically valid a setting as possible, with subjects consuming alcohol under social conditions that might normally be encountered, in a setting that mimicked real-life experience. We did not set out to correlate actual blood alcohol levels with degree of impairment because this was not the aim of the study, and for the same reason, we did not influence the amount of alcohol that was consumed or other factors, such as food intake. Similarly, we did not explicitly select subjects based on their usual levels of alcohol consumption, although all subjects consumed alcohol on a social basis. Under naturalistic conditions, it was our aim to demonstrate the effects of excessive alcohol consumption on laparoscopic surgical performance in a time frame after the consumption event that might be relevant to another real-world behavior: working the day after alcohol consumption.
Alcohol-associated performance degradation was found in study 1 and persisted until 4 PM the following day. This finding was observed across all scores at all points when compared with baseline; however, these differences did not reach statistical significance for the time measures, probably because of the very large variability in the alcohol group's performance. Performance variability is relevant to the surgeon because consistency in performance is vital. It is clear from the data reported herein that excess alcohol consumed the evening before testing had a very marked effect on both performance level and performance variability.
Because the psychomotor skills expected in the novice group may not have been sufficiently “hard wired” or automated,31 we deemed it necessary to examine experts' performance with similar alcohol exposure. Our expert group consisted both of experienced surgeons and expert MIST-VR users. Although by 4 PM their time to complete the task had returned to baseline, their error rate was higher (significantly so at 1 PM) and economy of diathermy use was significantly worse throughout the following day. These differences were observed despite the extensive surgical and MIST-VR experience of the subjects in the expert group.
Time taken to complete the task may not be a very useful metric when taken in isolation. However, it can be an indicator of surgical efficiency. It is unclear at this time why the expert subjects had a paradoxically faster mean task completion time at 9 AM relative to the baseline assessment. While it is possible that elements of either training effect or loss of inhibition may be contributory, when taken in combination with the significantly worsened error score and diathermy efficiency, it cannot be characterized as an indicator of improved performance.
These data raise inevitable questions pertaining to the clinical relevance of the simulator findings. While it is ethically impossible to conduct a study such as this in a clinical setting, advances in simulation have made it possible to assess some of the skills that are required to operate safely. The MIST-VR is currently the best-validated simulator in procedural-based medicine. While it is a low-fidelity system, for good performance it demands that the user demonstrate the skills required to perform a laparoscopic cholecystectomy. In prospective randomized trials, it has also been shown that skills acquired on simulators such as the MIST-VR transfer into the real clinical environment.7,32 Although we are confident that current simulator technology measures skills relevant to operative performance, we cannot comment with certainty on the significance of the performance impairment observed in this study, except to say that it warrants further investigation.
One subject in study 2 had a blood alcohol level that exceeded the legal limit for driving when measured by the breathalyzer the morning after. We did not measure the amount of alcohol consumed by the subjects because correlation of impairment with blood alcohol level was not the aim of the study. The other subjects would have been legally permitted to drive despite their significant performance deviation relative to baseline. This raises the troubling possibility that surgical performance could be adversely affected despite a low or zero residual blood alcohol level, or even despite an absence of symptoms one might associate with a “hangover.” This phenomenon is known as “postalcohol impairment”33 and has been studied in other fields such as psychiatry and the aviation industry. Although this is the basis for “bottle-to-throttle” time mandates introduced in 1971 by the Federal Aviation Administration, some studies have shown that pilot performance can be impaired for longer than the current 8-hour recommendation. In 1 study, pilots showed residual performance impairment when their blood alcohol concentration had returned to zero, 14 hours after alcohol ingestion that had produced a blood alcohol concentration of 0.1%.34 Currently, blood alcohol concentration limits are set by national bodies for pilots and other crew members.34,35 There are no international standards and some airlines have a zero-tolerance policy for alcohol, with mandatory preflight breath testing for every member of the crew.
The persistence of surgical performance impairment for such an extended period demands further consideration of this issue and its implications, at least for the specific skill set used during laparoscopic surgery. There are no rules or guidelines to govern consumption of alcohol the night before operative duties, and there is insufficient information to permit clear-cut recommendations for a “bottle-to-scalpel” interval to be made. However, it is likely that surgeons are unaware that next-day surgical performance may be compromised as a result of significant alcohol intake. Without taking up the larger societal problem of distinguishing between “acceptable” and “excessive” alcohol consumption, it is sensible to make surgeons and other medical interventionalists aware of the scope and duration of alcohol-related impairment following excessive alcohol consumption, with the aim of instilling a higher level of personal vigilance.
In the 2 studies reported herein, we showed persistent detrimental performance effects the day after excessive alcohol had been consumed. This effect was observed even for very experienced laparoscopic surgeons, the majority of whom showed minimal or zero blood alcohol levels when they were tested by traditional means, ie, a breathalyzer. Given the considerable cognitive, perceptual, visuospatial, and psychomotor challenges posed by modern image-guided surgical techniques, abstinence from alcohol the night before operating may be a sensible consideration for practicing surgeons.
Correspondence: Emily Boyle, MRCS, National Surgical Training Centre, Royal College of Surgeons in Ireland, RCSI House, 121 St Stephen's Green, Dublin 2, Ireland (email@example.com).
Accepted for Publication: March 11, 2010.
Author Contributions: Study concept and design: Gallagher and Toner. Acquisition of data: Gallagher, Toner, Andersen, Satava, and Seymour. Analysis and interpretation of data: Gallagher, Boyle, Toner, Neary, and Satava. Drafting of the manuscript: Gallagher, Boyle, Neary, and Seymour. Critical revision of the manuscript for important intellectual content: Gallagher, Boyle, Toner, Andersen, Satava, and Seymour. Statistical analysis: Gallagher and Toner. Obtained funding: Gallagher. Administrative, technical, and material support: Gallagher, Boyle, Toner, and Andersen. Study supervision: Gallagher.