Key Points
Question
How is burnout assessed among physicians and what is the prevalence of burnout among physicians?
Findings
In this systematic review, there was substantial variability in prevalence estimates of burnout among physicians, ranging from 0% to 80.5%, and marked variation in burnout definitions, assessment methods, and study quality. Associations between burnout and sex, age, geography, time, specialty, and depressive symptoms could not be reliably determined.
Meaning
These findings preclude definitive conclusions about the prevalence of burnout among physicians and highlight the importance of developing a consensus definition of burnout and of standardizing measurement tools to assess the effects of chronic occupational stress on physicians.
Importance
Burnout is a self-reported job-related syndrome increasingly recognized as a critical factor affecting physicians and their patients. An accurate estimate of burnout prevalence among physicians would have important health policy implications, but the overall prevalence is unknown.
Objective
To characterize the methods used to assess burnout and provide an estimate of the prevalence of physician burnout.
Data Sources and Study Selection
Systematic search of EMBASE, ERIC, MEDLINE/PubMed, psycARTICLES, and psycINFO for studies on the prevalence of burnout in practicing physicians (ie, excluding physicians in training) published before June 1, 2018.
Data Extraction and Synthesis
Burnout prevalence and study characteristics were extracted independently by 3 investigators. Although meta-analytic pooling was planned, variation in study designs and burnout ascertainment methods, as well as statistical heterogeneity, made quantitative pooling inappropriate. Therefore, studies were summarized descriptively and assessed qualitatively.
Main Outcomes and Measures
Point or period prevalence of burnout assessed by questionnaire.
Results
Burnout prevalence data were extracted from 182 studies involving 109 628 individuals in 45 countries published between 1991 and 2018. In all, 85.7% (156/182) of studies used a version of the Maslach Burnout Inventory (MBI) to assess burnout. Studies variably reported prevalence estimates of overall burnout or burnout subcomponents: 67.0% (122/182) on overall burnout, 72.0% (131/182) on emotional exhaustion, 68.1% (124/182) on depersonalization, and 63.2% (115/182) on low personal accomplishment. Studies used at least 142 unique definitions for meeting overall burnout or burnout subscale criteria, indicating substantial disagreement in the literature on what constituted burnout. Studies variably defined burnout based on predefined cutoff scores or sample quantiles and used markedly different cutoff definitions. Among studies using instruments based on the MBI, there were at least 47 distinct definitions of overall burnout prevalence and 29, 26, and 26 definitions of emotional exhaustion, depersonalization, and low personal accomplishment prevalence, respectively. Overall burnout prevalence ranged from 0% to 80.5%. Emotional exhaustion, depersonalization, and low personal accomplishment prevalence ranged from 0% to 86.2%, 0% to 89.9%, and 0% to 87.1%, respectively. Because of inconsistencies in definitions of and assessment methods for burnout across studies, associations between burnout and sex, age, geography, time, specialty, and depressive symptoms could not be reliably determined.
Conclusions and Relevance
In this systematic review, there was substantial variability in prevalence estimates of burnout among practicing physicians and marked variation in burnout definitions, assessment methods, and study quality. These findings preclude definitive conclusions about the prevalence of burnout and highlight the importance of developing a consensus definition of burnout and of standardizing measurement tools to assess the effects of chronic occupational stress on physicians.
The concept of burnout in health care emerged in the late 1960s as a way to colloquially describe the emotional and psychological stress experienced by clinic staff caring for structurally vulnerable patients in free clinics.1 Since then, the term burnout has been used to characterize job-related stress in any health practice environment, from hospitals in urban communities to global health settings.2,3 This expansion of the scope of burnout has made it useful for describing the shared experience and stress of medical practice, particularly in conjunction with research demonstrating elevated levels of depressive symptoms among physicians.4,5 Building on foundational work by Maslach et al6 in the 1980s, researchers have described burnout as a combination of emotional exhaustion, depersonalization, and low personal accomplishment caused by the chronic stress of medical practice. In the research literature, “overall” or “aggregate” burnout is typically measured by assessing some combination of these 3 subcomponents. Some studies have found that physician burnout is associated with increased medical errors, lower patient satisfaction, longer postdischarge recovery times, and decreased professional work effort.7-9 Consequently, there is interest among researchers, clinicians, and health policy leaders in ascertaining the prevalence and drivers of burnout in physicians.
The objective of this systematic review was to assess how burnout among practicing physicians has been defined in the literature and to identify the prevalence of burnout in this population.
Search Strategy and Study Eligibility
Three authors (L.S.R., M.T., and R.C.R.) independently identified cross-sectional and longitudinal studies published before June 1, 2018, that reported on the prevalence of burnout among practicing physicians (ie, excluding medical students and resident physicians) by systematically searching EMBASE, ERIC, MEDLINE/PubMed, psycARTICLES, and psycINFO. In addition, the authors screened the reference lists of articles identified and corresponded with study investigators using approaches consistent with the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) and Meta-analysis of Observational Studies in Epidemiology (MOOSE) reporting guidelines.10,11 For the database searches, terms related to physicians and study design were combined with those related to burnout without language restriction (full details of the search strategy are provided in eAppendix 1 in Supplement 1). Studies that reported data on practicing physicians, were published in peer-reviewed journals, and used a well-described method to assess for burnout were included. A fourth author (D.A.M.) resolved discrepancies by discussion and adjudication.
Data Extraction and Quality Assessment
Three authors (L.S.R., M.T., and R.C.R.) independently extracted the following data from each article using a standardized form: study design; geographic location; year(s) of survey; sample size; specialty; average age of participants; number and percentage of male participants; diagnostic or screening method used; outcome definition (ie, specific diagnostic criteria or screening instrument cutoff); and reported prevalence estimates of overall burnout, its subcomponents of emotional exhaustion, depersonalization, and a diminished sense of personal accomplishment, or both. Whether studies reported prevalence estimates of comorbid depression or depressive symptoms was also noted. When studies involved the same population of physicians, only the most comprehensive or recent publication was included, with the former taking precedence. The 3 authors independently assessed the risk of bias of these predominantly nonrandomized studies using a modified version of the Newcastle-Ottawa Scale, which assessed sample representativeness and size, comparability between respondents and nonrespondents, ascertainment of burnout, and thoroughness of descriptive statistics reporting (full details regarding scoring are provided in eAppendix 2 in Supplement 1).12 A fourth author (D.A.M.) resolved discrepancies by discussion and adjudication.
Data Synthesis and Analysis
As described in the prespecified study protocol (eAppendixes 3-4 in Supplement 1), the study was originally designed to perform a meta-analysis, including an assessment of heterogeneity in burnout ascertainment methods, definitions, and outcomes, as well as statistical heterogeneity and bias from small study effects. However, as described below in the Results section, the pooled quantitative summary estimates were judged to not be reliable. Therefore, the entire body of studies was summarized descriptively and a qualitative synthesis of a subset of larger studies was also performed. Studies were included in the qualitative synthesis if they had at least 300 participants, used a full-length instrument to assess burnout, and clearly indicated the criteria used to label individuals as experiencing burnout. Studies using short-form survey instruments (eg, single question) or ill-defined survey instruments (eg, instrument was not described or no cutoff score was reported or referenced) to assess burnout were excluded from the qualitative synthesis regardless of the number of participants on which they reported.
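The inclusion rule for the qualitative synthesis described above can be read as a simple filter over study records. The following is a minimal sketch under stated assumptions, not the authors' procedure: the `Study` record, its field names, and the example rows are hypothetical and were invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Study:
    # Hypothetical record; field names are illustrative, not from the protocol.
    label: str
    n_participants: int
    full_length_instrument: bool  # False for short-form or single-item tools
    criteria_stated: bool         # True if a cutoff was reported or referenced


def eligible_for_qualitative_synthesis(study: Study) -> bool:
    """Apply the three stated criteria: size, instrument, and clear criteria."""
    return (
        study.n_participants >= 300
        and study.full_length_instrument
        and study.criteria_stated
    )


# Invented examples, not studies from the review.
studies = [
    Study("A", 512, True, True),    # meets all three criteria
    Study("B", 4000, False, True),  # excluded: short-form tool, despite size
    Study("C", 299, True, True),    # excluded: fewer than 300 participants
    Study("D", 800, True, False),   # excluded: no cutoff reported
]

included = [s.label for s in studies if eligible_for_qualitative_synthesis(s)]
print(included)
```

Study B illustrates the final sentence of the rule: a short-form instrument leads to exclusion regardless of the number of participants.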
One hundred seventy-six cross-sectional studies and 6 longitudinal studies involving 109 628 individuals in 45 countries published between 1991 and 2018 reporting on burnout in practicing physicians were identified (Figure 1).13-194 The number of participants per study ranged from 4 to 7830 (median, 200; interquartile range, 93-512; mean, 602). The characteristics of the full set of individual studies, the geographic regions in which they were conducted, and their Newcastle-Ottawa risk-of-bias scores appear in eTables 1 through 4 in Supplement 1. In all, 18.1% (33/182) of the studies also reported on the prevalence of screening positive for depression as assessed by various self-report questionnaires (eTable 5 in Supplement 1). A subset of 45 larger studies involving 65 327 individuals in 20 countries published between 1991 and 2018 met the inclusion criteria for the qualitative synthesis (Table 1).13-57
Instruments Used to Assess Burnout
Among the full set of 182 studies, 67.0% (122/182) reported prevalence estimates of overall burnout, 72.0% (131/182) reported prevalence estimates of emotional exhaustion, 68.1% (124/182) reported prevalence estimates of depersonalization, and 63.2% (115/182) reported prevalence estimates of a diminished sense of personal accomplishment. In all, 85.7% (156/182) used a version of the proprietary Maslach Burnout Inventory (MBI)6 to generate these prevalence estimates, while 14.3% (26/182) used other methods. The burnout assessment instruments used by the 182 studies are summarized in Table 2.
Most studies (59.3% [108/182]) used a full-length implementation of the original version of the MBI, the 22-item MBI–Human Services Survey (MBI-HSS), designed to measure feelings of burnout among individuals working in human services jobs, such as physicians. Fewer studies (4.9% [9/182]) used a full-length implementation of the 16-item MBI–General Survey (MBI-GS), designed to measure feelings of burnout among individuals in non–human services occupations. The MBI-GS focuses on burnout related to the general performance of work rather than on relationships at work (eg, with patients). Both MBI versions ask survey takers to rate how often they experience specific feelings of burnout at work on a 7-point Likert scale, with 0 representing “never” and 6 “every day” (examples of included items are provided in eAppendix 5 in Supplement 1). The MBI-HSS produces scores on 3 subscales: emotional exhaustion (scores range from 0-54), depersonalization (scores range from 0-30), and low personal accomplishment (scores range from 0-48). Because the MBI-GS deemphasizes human relationships, it renames the subscales as exhaustion, cynicism, and professional efficacy, although the concepts measured by both versions of the inventory are similar. In contrast to the MBI-HSS, subscale scores for the MBI-GS are usually determined by calculating mean ratings across relevant questions, with mean scores ranging from 0 to 6 for all 3 subscales. Several (16.5% [30/182]) studies used assessment instruments based on one of these full-length MBI surveys but modified in some manner, as by altering the text of the presented statements related to burnout or shortening the number of items or subscales on the inventory. For example, 4.4% (8/182) of studies used single-item burnout assessment tools for emotional exhaustion or depersonalization that were adapted from the MBI-HSS and validated by West et al.195 Some studies (4.9% [9/182]) did not specify what version of the MBI they used.
For all versions of the MBI, higher scores on the emotional exhaustion and depersonalization subscales and lower scores on the personal accomplishment subscale (or their MBI-GS equivalents) correspond to higher levels of burnout.
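The difference between the summed subscale scores of the MBI-HSS and the mean subscale scores of the MBI-GS can be sketched as follows. This is an illustration only: the function names and the ratings are hypothetical, and the assignment of individual items to subscales is not described in this review, so the functions take ratings that are assumed to be already grouped by subscale. The item counts used in the final check (9, 5, and 8) are inferred from the stated score ranges divided by the maximum rating of 6.

```python
from statistics import mean


def validate(ratings):
    """Ratings use the 0 ('never') to 6 ('every day') scale."""
    if not ratings or any(r not in range(7) for r in ratings):
        raise ValueError("each rating must be an integer from 0 to 6")


def summed_score(ratings):
    """MBI-HSS-style subscale score: the sum of the item ratings."""
    validate(ratings)
    return sum(ratings)


def mean_score(ratings):
    """MBI-GS-style subscale score: the mean of the item ratings."""
    validate(ratings)
    return mean(ratings)


# Hypothetical ratings for one respondent on a 9-item subscale.
exhaustion_items = [3, 4, 2, 5, 3, 3, 4, 2, 1]

print(summed_score(exhaustion_items))  # sum, on a 0-54 scale for 9 items
print(mean_score(exhaustion_items))    # mean, on a 0-6 scale

# Maximum sums for 9, 5, and 8 items reproduce the stated score ranges.
print([summed_score([6] * n) for n in (9, 5, 8)])
```

Because the two scoring conventions produce values on different scales, a cutoff defined for one convention cannot be applied directly to the other.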
Several public domain methods were used by the 14.3% (26/182) of studies that did not use the MBI to assess burnout. These instruments included the 16-item Astudillo and Mendinueta Burnout Questionnaire,196 the 54-item Modified Compassion Satisfaction and Fatigue Test,182 the 19-item Copenhagen Burnout Inventory,197 the 40-item Hamburg Burnout Inventory,198 the Pines and Aronson Burnout Measure,199 the 20-item Spanish-language Questionnaire for the Evaluation of Work-Related Burnout Syndrome (CESQT),200 the 10-item Zero Burnout Program Survey,201 and various single-item measures of self-perceived burnout, including the measure of Rohland et al.152 Some studies used abbreviated or modified surveys based on these instruments, with some conceptualizing burnout differently than the traditional definition in the MBI. For example, as described by Kristensen et al,197 the Copenhagen Burnout Inventory was developed in response to perceived limitations of the MBI and conceptualizes burnout as consisting of domains referred to as personal, work-related, and client-related burnout, considering the core of burnout as symptoms of fatigue and exhaustion.
Prevalence of Overall Burnout Among Physicians
The prevalence estimates of overall burnout reported by the 67.0% (122/182) of studies that provided data on overall burnout ranged from 0% to 80.5%. Meta-analytic pooling of the prevalence estimates is shown in eTable 6 in Supplement 1 but is not considered reliable because of heterogeneity in burnout ascertainment methods, definitions, and outcomes, as well as statistical heterogeneity. This heterogeneity persisted after stratifying the analyses by screening instrument and cutoff score, in part because of the considerable variability in how studies defined overall burnout (eTable 7 in Supplement 1). Considering all combinations of subscale cutoff scores used, there were at least 58 unique ways of labeling individuals as experiencing burnout (eTable 8 in Supplement 1). Even among the 80.3% (98/122) of studies using an inventory based on the MBI, there were at least 47 unique implementations of MBI versions, cutoff combinations, or both. For example, the most frequent definition of overall burnout, used by 17.2% (21/122) of studies, required individuals to meet all 3 criteria: a score of at least 27 on the MBI exhaustion subscale, at least 10 on the depersonalization subscale, and no more than 33 on the personal accomplishment subscale. The second most frequent definition, used by 9.0% (11/122) of studies, was more lenient in that it considered individuals to have burnout if they scored either at least 27 on the exhaustion or at least 10 on the depersonalization subscales or both. There were at least 11 different methods for measuring burnout represented among the 19.7% (24/122) of studies that did not use the MBI. Among this group, the most frequently used techniques (12.3% [15/122]) were various single-item screens of self-perceived burnout, most notably a Rohland score of at least 3, used by 4.9% (6/122) of studies.
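The effect of choosing between the 2 most frequent definitions can be illustrated with a short sketch. The cutoffs are those stated above; the function names and the 4 respondents are hypothetical and were invented for illustration, so the resulting percentages describe only this toy sample.

```python
def strict_definition(ee, dp, pa):
    """All three subscales positive: EE >= 27, DP >= 10, and PA <= 33."""
    return ee >= 27 and dp >= 10 and pa <= 33


def lenient_definition(ee, dp, pa):
    """Either EE >= 27 or DP >= 10 (or both); PA is ignored."""
    return ee >= 27 or dp >= 10


def prevalence(definition, sample):
    """Share of the sample labeled as having burnout, as a percentage."""
    return 100 * sum(definition(*r) for r in sample) / len(sample)


# Hypothetical (EE, DP, PA) subscale scores, invented for illustration.
respondents = [
    (30, 12, 30),  # positive on all three subscales
    (30, 4, 40),   # high exhaustion only
    (15, 11, 45),  # high depersonalization only
    (10, 3, 44),   # positive on none
]

print(prevalence(strict_definition, respondents))
print(prevalence(lenient_definition, respondents))
```

The same responses yield different prevalence figures under the 2 rules, which is one reason estimates from studies using different definitions are difficult to compare.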
This heterogeneity is illustrated by visual inspection of the prevalence estimates from the subset of larger studies included in the qualitative synthesis, 75.6% (34/45) of which reported on overall burnout using 18 unique screening instruments, cutoff combinations, or both (Figure 2).
Prevalence of Burnout Subcomponents Among Physicians
There was also important heterogeneity in assessment methods and definitions for burnout subcomponents, precluding reliable meta-analysis (eTables 9-14 in Supplement 1). The prevalence estimates of emotional exhaustion reported by the 72.0% (131/182) of studies that provided data ranged from 0% to 86.2%. For MBI-derived emotional exhaustion, 43.5% (57/131) of studies used a cutoff score of at least 27, 16.8% (22/131) used a cutoff of “high” without explicitly stating a cutoff score, 29.8% (39/131) used a different cutoff score, and 9.2% (12/131) used a nonstandard or shortened version of the MBI (eg, a single-question screening tool). A single study used a non–MBI-based assessment method, a tertile-based split of CESQT scores, to identify individuals with emotional exhaustion. This heterogeneity is illustrated by visual inspection of the prevalence estimates from the studies included in the qualitative synthesis, 73.3% (33/45) of which reported on emotional exhaustion (Figure 3).
The prevalence estimates of depersonalization reported by the 68.1% (124/182) of studies that provided data ranged from 0% to 89.9%. For MBI-derived depersonalization, 33.1% (41/124) of studies used a cutoff score of at least 10, 13.7% (17/124) used a cutoff score of at least 13, 16.9% (21/124) used a cutoff of “high” without explicitly stating a cutoff score, 26.6% (33/124) used a different cutoff score, and 8.9% (11/124) used a nonstandard or shortened version of the MBI. A single study used a tertile-based split of CESQT scores to identify individuals experiencing depersonalization. This heterogeneity is illustrated by visual inspection of the prevalence estimates from the studies included in the qualitative synthesis, 66.7% (30/45) of which reported on depersonalization (Figure 4).
The prevalence estimates of a diminished sense of personal accomplishment reported by the 63.2% (115/182) of studies that provided data ranged from 0% to 87.1%. For MBI-derived low personal accomplishment, 34.8% (40/115) of studies used a cutoff of no more than 33, 12.2% (14/115) used a cutoff of no more than 31, 17.4% (20/115) used a cutoff of “low” without explicitly stating a cutoff score, 28.7% (33/115) used a different cutoff score, and 6.1% (7/115) used a nonstandard or shortened version of the MBI. A single study used a tertile-based split of CESQT scores to identify individuals experiencing a diminished sense of personal accomplishment. This heterogeneity is illustrated by visual inspection of the prevalence estimates from the studies included in the qualitative synthesis, 62.2% (28/45) of which reported on personal accomplishment (Figure 5).
Prevalence of Burnout and Its Subcomponents Among Physicians by Study-Level Characteristics
The observed heterogeneity precluded reliable investigation of the associations of overall burnout or burnout subcomponent prevalence with the geographic region in which studies were conducted, the subspecialties of the study participants, the baseline survey year, the mean or median age of the study participants, the percentage of male study participants, or the presence or absence of comorbid depressive symptoms, the latter of which were also examined independently of burnout (eTables 15-26 in Supplement 1). To identify potential sources of heterogeneity independent of assessment method, heterogeneity was also examined within subgroups of studies using common instruments when at least 15 studies were available. However, heterogeneity within all subgroups remained too high for meaningful meta-analyses (eTable 27 in Supplement 1).
Based on the modified Newcastle-Ottawa risk-of-bias scores assigned to the studies, most had limitations in study quality (eTable 4 in Supplement 1). For example, only 32.4% (59/182) of studies fulfilled the criterion for sample representativeness by surveying physicians of multiple specialties at multiple institutions. Only 40.1% (73/182) met the size criterion by surveying at least 300 participants. Only 6.6% (12/182) established the comparability between respondents and nonrespondents and only 33.5% (61/182) reported descriptive statistics for participants who did respond. Although 87.9% (160/182) met the ascertainment criteria by using a well-described or validated tool to measure burnout, the value of this finding is unclear given that the validity of the burnout construct (particularly as measured by the MBI) is uncertain. Visual inspection of funnel plots for all outcomes yielded minimal evidence of small study effects, with statistically significant asymmetry only for overall burnout (eFigure in Supplement 1).
Table 1 details the subset of 45 larger studies selected for more in-depth qualitative consideration. Most of these studies used either the 22-item MBI-HSS (66.7% [30/45]) or the 16-item MBI-GS (13.3% [6/45]). The Dutch adaptation of the MBI-HSS, the 20-item Utrechtse Burnout Schaal, was used by 6.7% (3/45) of studies. A 19-item version of the MBI-HSS adapted to a Chinese context, a 15-item shortened version of the MBI-HSS, and versions of the MBI-HSS and MBI-GS focused on emotional exhaustion alone were also used by individual studies. The Zero Burnout Program Survey and the Hamburg Burnout Inventory were also used by individual studies. Among these 45 studies, 75.6% (34/45) generated prevalence estimates of overall burnout. The criteria used to label individuals as experiencing burnout varied widely, including the number of subscales on which participants needed to screen positive to constitute experiencing burnout (Table 1 and Figure 2).
Ten studies provided overall burnout prevalence estimates using relatively permissive MBI-HSS criteria, classifying individuals as having symptoms of burnout if they met a specific cutoff for either elevated emotional exhaustion or elevated depersonalization. Six studies defined burnout as either an emotional exhaustion score of at least 27 or a depersonalization score of at least 10.25,42,47,49-51 This definition of burnout led to prevalence estimates ranging from 25.0% to 60.1%. For example, Pedersen et al42 examined burnout among Danish general practitioners and found a 25.0% prevalence, and Busis et al25 examined burnout among US neurologists and found a 60.1% prevalence. Four studies by Shanafelt et al47,49-51 examined burnout among US physicians of all specialties using these cutoff score combinations. In a 2015 longitudinal study, Shanafelt et al50 found that the prevalence of physicians reporting burnout symptoms had increased from 45.5% to 54.4% between 2011 and 2014. Two studies of surgeons defined burnout as either an emotional exhaustion score of at least 28 or a depersonalization score of at least 11. In a 2008 study, Shanafelt et al48 surveyed surgeons of multiple subspecialties, identifying a burnout symptom prevalence of 39.6%. In a study limited to plastic surgeons, Qureshi et al44 found a prevalence of 30.0% using these criteria. Two studies used cutoffs of at least 27 or at least 13 for emotional exhaustion or depersonalization, respectively. Kamal et al33 reported a prevalence of 61.9% among US palliative care physicians and Li et al36 reported a prevalence of 69.6% among Chinese anesthesiologists using these criteria.
Six studies took a more stringent approach by requiring that at least 2 of 3 MBI subscales be positive to constitute burnout. In their study of urologists in Ireland and the United Kingdom, O’Kelly et al39 defined burnout as an MBI-HSS cutoff of at least 27 for emotional exhaustion combined with either a cutoff of at least 13 for depersonalization or no more than 31 for personal accomplishment, generating a burnout prevalence of 28.9%. Twellaar et al17 and Van der Wal et al19 took a similar approach using the Utrechtse Burnout Schaal inventory. They required that participants have an exhaustion score above the top quartile combined with either a depersonalization score above the top quartile or a personal accomplishment score below the bottom quartile. Using these criteria, they calculated prevalence estimates of 19.5% and 19.8% among Dutch general practitioners and anesthesiologists, respectively. Two studies took a similar approach using the MBI-GS. Saijo et al46 defined burnout as a mean exhaustion score greater than 4.2 combined with either a cynicism score greater than 2.4 or a professional efficacy score of no more than 2.5, finding a 22.1% prevalence among Japanese physicians of multiple specialties. Nishimura et al38 defined burnout as a mean exhaustion score greater than 4.0 combined with either a cynicism score greater than 2.6 or a professional efficacy score less than 4.17, finding a 21.6% prevalence among Japanese neurologists and neurosurgeons. In their study of surgeons in the United Kingdom, Upton et al18 defined burnout as both an exhaustion score and a cynicism score above the top tertile, regardless of the professional efficacy score, generating a prevalence of 19.8%.
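Several of the definitions above derive their cutoffs from sample quantiles rather than from fixed scores. The following sketch shows one way such a rule could be computed, assuming quartiles are estimated with Python's `statistics.quantiles` (the studies' own quantile methods are not described here). The function names and the 7 respondents are hypothetical.

```python
from statistics import quantiles


def quartile_cutoffs(scores):
    """Return (lower quartile, upper quartile) of a list of scores."""
    q1, _median, q3 = quantiles(scores, n=4)
    return q1, q3


def label_burnout(sample):
    """Exhaustion above its upper quartile, plus either depersonalization
    above its upper quartile or accomplishment below its lower quartile."""
    ee_scores, dp_scores, pa_scores = zip(*sample)
    _, ee_hi = quartile_cutoffs(ee_scores)
    _, dp_hi = quartile_cutoffs(dp_scores)
    pa_lo, _ = quartile_cutoffs(pa_scores)
    return [
        ee > ee_hi and (dp > dp_hi or pa < pa_lo)
        for ee, dp, pa in sample
    ]


# Hypothetical (exhaustion, depersonalization, accomplishment) scores.
sample = [
    (10, 2, 40), (14, 3, 38), (18, 5, 36), (22, 6, 35),
    (26, 8, 33), (30, 9, 30), (34, 12, 25),
]
labels = label_burnout(sample)
print(sum(labels), "of", len(labels), "labeled as having burnout")
```

Because the cutoffs move with the sample, adding a constant to every exhaustion score in this sketch leaves the labels unchanged. A quantile-based definition therefore describes relative standing within a sample rather than an absolute symptom level.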
Several studies used even stricter definitions of overall burnout, requiring all 3 MBI subscales to be positive to constitute a case. Nine studies22,24,31,32,35,37,40,41,43 each used the MBI-HSS to survey physicians in a variety of specialties, specifying that individuals have an emotional exhaustion score of at least 27, a depersonalization score of at least 10, and a personal accomplishment score of no greater than 33 to be considered as having symptoms of burnout. This approach to defining burnout generated lower prevalence estimates, ranging from 2.6% to 11.8% across studies. For example, in a longitudinal study of Danish general practitioners, Pedersen et al41 showed that burnout prevalence had increased from 2.6% to 3.7% between 2004 and 2012 and calculated a 7-year burnout incidence of 13.0%. A separate study of Danish general practitioners by Brøndt et al24 demonstrated the effect that strict diagnostic criteria may have on burnout prevalence. In their study, only 2.6% of physicians met the strict criteria mentioned above, but a separate analysis defining burnout as either an emotional exhaustion score of at least 27 or a depersonalization score of at least 10 resulted in a higher prevalence of 24.1%.
Five other studies also used strict definitions of overall burnout, each using slightly different criteria. For example, Al-Dubai et al21 required all 3 subscales of the MBI-HSS to be positive. Using an emotional exhaustion score of at least 27, a depersonalization score of at least 13, and a personal accomplishment score of no more than 31, they demonstrated a burnout symptom prevalence of 11.7% among Yemeni physicians across multiple specialties. Riquelme et al16 took a similar approach using the MBI-HSS but defined subscale positivity by quartile-based cutoffs, demonstrating a burnout prevalence of 7.3% among Spanish pain medicine physicians. In their study of Belgian physicians in multiple specialties, Vandenbroeck et al53 similarly required that all 3 MBI subscales be positive. Using the Utrechtse Burnout Schaal, they required a mean emotional exhaustion score of at least 2.5, a mean depersonalization score of at least 1.6 (for women) or at least 1.8 (for men), and a mean personal accomplishment score of no more than 3.7 to constitute burnout, demonstrating a prevalence of 5.1%. Rao et al45 and Wu et al55 both used the MBI-GS to assess burnout using relatively strict criteria. In their study of administrative burden among US physicians in multiple specialties, Rao et al45 used mean MBI-GS subscale cutoffs of at least 3.2, at least 2.6, and no more than 3.8, for exhaustion, cynicism, and professional efficacy, respectively, demonstrating a burnout prevalence of 9.8%. Wu et al55 surveyed Chinese physicians of various specialties, using cutoffs of at least 14, at least 10, and no more than 17, respectively, demonstrating a burnout prevalence of 12.1%.
Four studies defined burnout using either modified versions of the MBI or other inventories. Wang et al54 used a revised 19-item Chinese version of the MBI-HSS and assessed overall burnout via a weighted equation, with a score of at least 4.5 indicating severe burnout (0.4 × exhaustion + 0.3 × depersonalization + 0.3 × reduced personal accomplishment). Using this criterion, 5.9% of physicians across multiple specialties from Shanghai hospitals were considered to have symptoms of burnout. In their study of Portuguese physicians in multiple specialties, Marôco et al14 used a 15-item modified version of the MBI-HSS, considering a mean subscale score of at least 3 as the cutoff for burnout, generating a prevalence of 43.6%. Puffer et al15 demonstrated a burnout prevalence of 24.5% among US physicians using the Zero Burnout Program Survey with a cutoff score of at least 3. For their study of Austrian physicians, Wurm et al56 used the Hamburg Burnout Inventory, in part because of its validation in the German language. A score of at least 145 was considered the cutoff for at least mild burnout, resulting in an overall prevalence of 50.7%. They further classified 28.0% of participants as having mild, 13.1% as having moderate, and 9.6% as having severe burnout symptoms. Theirs was one of the few studies to also assess participants with a high-specificity screening tool for major depression, the 12-item World Health Organization Major Depression Inventory. Using these data, Wurm et al56 concluded that the Hamburg Burnout Inventory subscales for emotional exhaustion, detachment (ie, depersonalization), and personal accomplishment correlated more highly with the cardinal symptoms of depression (ie, sadness, lack of interest, and diminished energy) than with each other, suggesting overlap of the concepts of burnout and depression in physicians.
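The weighted equation attributed to Wang et al earlier in this paragraph can be written out as a short sketch. The weights and the cutoff of 4.5 are taken from the text; the function name and the input values are hypothetical, and the sketch assumes that the 3 subscale inputs share a common scale, which the text does not specify.

```python
def weighted_burnout_score(exhaustion, depersonalization, reduced_accomplishment):
    """Weighted composite as given in the text: weights 0.4, 0.3, and 0.3."""
    return (
        0.4 * exhaustion
        + 0.3 * depersonalization
        + 0.3 * reduced_accomplishment
    )


SEVERE_CUTOFF = 4.5  # threshold stated in the text for severe burnout

# Hypothetical subscale values, assumed here to share one scale.
score = weighted_burnout_score(5.0, 4.5, 4.0)
print(round(score, 2), score >= SEVERE_CUTOFF)
```

Because the weights sum to 1, the composite stays on the same scale as its inputs, with exhaustion contributing the largest share.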
Among the 45 studies, 73.3% (33/45) generated prevalence estimates of emotional exhaustion, depersonalization, or low personal accomplishment, including 11 studies13,20,23,26-30,34,52,57 that did not provide estimates of overall burnout. A wide range of cutoff scores was used (Table 1). The most common criterion for defining emotional exhaustion was an MBI-HSS cutoff of at least 27, corresponding to symptoms experienced a few times per month, used by 63.6% (21/33) of studies reporting on this outcome. The most common criterion for defining depersonalization was an MBI-HSS cutoff of at least 10, corresponding to symptoms experienced once per month or less, used by 53.3% (16/30) of studies. The most common criterion for defining low personal accomplishment was an MBI-HSS cutoff of no more than 33, corresponding to symptoms experienced approximately once per week, used by 46.4% (13/28) of studies. Overall, across the 33 studies that presented subscale prevalence data, 10, 10, and 10 unique instrument–cutoff score combinations were used to define emotional exhaustion, depersonalization, or low personal accomplishment (or their MBI-GS equivalents), respectively. With this diversity of cutoffs, emotional exhaustion prevalence ranged from 8.7% to 63.2%, depersonalization prevalence ranged from 3.9% to 52.0%, and low personal accomplishment prevalence ranged from 4.4% to 73.3% (Figure 3, Figure 4, and Figure 5).
This systematic review of 182 studies involving 109 628 physicians in 45 countries demonstrated remarkable variability in published prevalence estimates of burnout, with estimates of overall burnout ranging from 0% to 80.5%. This wide range reflected the marked heterogeneity in the criteria used to define and measure burnout in the literature, with at least 142 unique definitions for meeting overall burnout or burnout subscale criteria identified. This review identified a lack of consensus on how the burnout construct is used to measure physicians’ exposure and response to occupational stress. Although a prevalence of 50% for physician burnout has been cited in the popular press202 and academic literature,203 the heterogeneity between the assessed studies calls into question whether any prevalence estimate cited for burnout can be meaningfully interpreted.
Research on burnout among physicians has increased awareness of physician mental health and well-being as an important issue,204 and US national organizations have recently called for all health care systems to assess their physicians on measures of well-being, often with a focus on burnout.205 This review indicates that a more consistent definition of burnout and improved assessment tools may be necessary if these policy measures are to successfully improve the physician work environment.
The methodological heterogeneity among the studies included in this systematic review may have been driven in part by shifting definitions of burnout and by questions around the conceptual framework of the burnout construct. The majority of the studies used an inventory based on the MBI, which considers burnout to consist of 3 domains: emotional exhaustion, depersonalization, and low personal accomplishment.6 The older third edition of the MBI manual provided cutoff scores to define burnout according to tertile-based splits of convenience samples of healthy workers, although the manual cautioned against using such coding for diagnostic purposes.206 Separately, Maslach supported defining overall burnout as high emotional exhaustion along with high depersonalization or low personal accomplishment.207 Others have asserted that high emotional exhaustion or high depersonalization but not low personal accomplishment can differentiate individuals with burnout from those who are not experiencing burnout208; some have suggested that personal accomplishment may not be a part of the total concept of burnout.209
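To illustrate how the choice of composite definition alone can change a prevalence estimate, the following minimal Python sketch applies the two overall burnout rules described above to the same set of respondents. The respondent data, the class name, and the function names are hypothetical and are not drawn from any included study; the sketch assumes each respondent has already been classified as high or low on each subscale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubscaleFlags:
    """Dichotomized subscale results for one respondent (hypothetical data)."""
    high_ee: bool  # high emotional exhaustion
    high_dp: bool  # high depersonalization
    low_pa: bool   # low personal accomplishment


def burnout_ee_plus_one(f: SubscaleFlags) -> bool:
    """High emotional exhaustion along with high depersonalization
    or low personal accomplishment."""
    return f.high_ee and (f.high_dp or f.low_pa)


def burnout_ee_or_dp(f: SubscaleFlags) -> bool:
    """High emotional exhaustion or high depersonalization;
    personal accomplishment is ignored."""
    return f.high_ee or f.high_dp


def prevalence(sample, rule) -> float:
    """Proportion of respondents meeting a given burnout rule."""
    return sum(rule(f) for f in sample) / len(sample)


# Five hypothetical respondents, for illustration only.
sample = [
    SubscaleFlags(high_ee=True, high_dp=True, low_pa=False),
    SubscaleFlags(high_ee=True, high_dp=False, low_pa=False),
    SubscaleFlags(high_ee=False, high_dp=True, low_pa=True),
    SubscaleFlags(high_ee=False, high_dp=False, low_pa=True),
    SubscaleFlags(high_ee=True, high_dp=False, low_pa=True),
]

print(prevalence(sample, burnout_ee_plus_one))  # 2 of 5 respondents
print(prevalence(sample, burnout_ee_or_dp))     # 4 of 5 respondents
```

In this toy sample the same five respondents yield a prevalence of 40% under the first rule and 80% under the second, before any variation in subscale cutoff scores is considered.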
The clinical validity of these definitions is not certain. The most commonly used MBI cutoff score for high emotional exhaustion (≥27, used by 43.5% of studies) corresponds to symptoms experienced only a few times per month on average. The most commonly used cutoff score for high depersonalization (≥10, used by 33.1% of studies) corresponds to symptoms experienced once per month or less on average. And the most commonly used cutoff score for low personal accomplishment (≤33, used by 34.8% of studies) corresponds to symptoms experienced only once per week on average. Symptoms experienced this infrequently are unlikely to reflect clinically meaningful levels of burnout.210 The prevalence estimates summarized in this systematic review therefore primarily reflect symptoms of burnout rather than a clinical burnout syndrome. With these and other concerns,207 researchers have used alternate subscale and overall burnout cutoffs, adding to the proliferation of definitions. The current fourth edition of the MBI manual more strongly advocates that researchers treat burnout as continuous data for each domain and argues against dichotomizing or combining the subscales to label individuals as having burnout.6 However, dichotomous burnout definitions may be more practical to guide institutional policy and identify physicians with burnout.
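The correspondence between cutoff scores and symptom frequency can be checked with simple arithmetic, sketched below in Python. The sketch rests on assumptions not stated in this article: that the MBI-HSS subscales contain 9 (emotional exhaustion), 5 (depersonalization), and 8 (personal accomplishment) items, and that each item is rated on a 0 to 6 frequency scale with the anchor labels shown. The function names are hypothetical.

```python
# Assumed MBI-HSS structure (not given in the text above): items per subscale.
ITEMS = {
    "emotional_exhaustion": 9,
    "depersonalization": 5,
    "personal_accomplishment": 8,
}

# Assumed 0-6 frequency anchors for each item.
ANCHORS = {
    0: "never",
    1: "a few times a year or less",
    2: "once a month or less",
    3: "a few times a month",
    4: "once a week",
    5: "a few times a week",
    6: "every day",
}

# Most commonly used cutoff scores reported in this review.
CUTOFFS = {
    "emotional_exhaustion": 27,     # score >= 27
    "depersonalization": 10,        # score >= 10
    "personal_accomplishment": 33,  # score <= 33
}


def mean_item_score(subscale: str, cutoff: int) -> float:
    """Average per-item rating implied by a subscale total equal to the cutoff."""
    return cutoff / ITEMS[subscale]


def nearest_anchor(score: float) -> str:
    """Frequency label whose numeric anchor is closest to the given rating."""
    return ANCHORS[min(ANCHORS, key=lambda k: abs(k - score))]


for name, cutoff in CUTOFFS.items():
    avg = mean_item_score(name, cutoff)
    print(f"{name}: cutoff {cutoff} -> mean item score {avg:.2f} "
          f"({nearest_anchor(avg)})")
```

Under these assumptions, a score of 27 on emotional exhaustion averages 3.0 per item ("a few times a month"), a score of 10 on depersonalization averages 2.0 ("once a month or less"), and a score of 33 on personal accomplishment averages about 4.1 ("once a week"), consistent with the frequencies described above.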
In addition to the different definitions of burnout, the heterogeneity among the published studies may be due to fundamental problems with the conceptualization and measurement of burnout through the MBI. This inventory was originally developed not on the basis of clinical observation but rather by inductive factor analysis of what has been described as a “rather arbitrary” set of items,211 leading to questions about the validity of MBI-measured burnout.197 Although the MBI conceptualizes burnout as a job-related phenomenon, evidence suggests that it does not effectively distinguish between symptoms that arise from work stress, from nonwork stress, or from a combination of the two.212 The original and still most commonly used version of the MBI, the MBI-HSS, conceptualizes burnout specifically as a downstream consequence of human relations–induced stress.6 However, a possible increase in the prevalence of burnout among physicians has corresponded with an increasing volume of non–patient-focused work such as with the electronic medical record,213 whereas increased time with patients has instead been positively associated with physician mental well-being.214 In addition, the MBI combines the experience of burnout (emotional exhaustion) with coping strategies (depersonalization), creating a unitary measure that may not represent any singular clinical phenomenon.197 It has therefore been suggested that rigorous clinical observation may be needed to determine what constitutes a case of burnout.215
With these conceptual concerns, there is an argument for grounding burnout in a well-established illness category with known diagnostic criteria, such as major depressive disorder, and considering burnout a form of depression instead of a distinct entity.216 However, there may be advantages to considering burnout as a distinct entity.217 In contrast to depression, the concept of burnout avoids pathologizing workers’ emotional responses to their jobs. Understanding health practitioners as workers with burnout instead of as patients with depression may help underscore the environmental and cultural factors that can negatively affect their well-being and encourage implementation of structural reforms that can complement clinical care in the form of psychotherapy and medication.218
Given the lack of a clear consensus among the 182 studies included in this review, researchers studying burnout should consider limitations associated with the concept and its measurement. First, use of arbitrary and varying definitions of dichotomized burnout likely contributed to the heterogeneity. In the absence of agreed-on diagnostic criteria for a clinical burnout syndrome, future studies may consider analyzing burnout exclusively as a continuous measure. Second, researchers who nonetheless wish to generate dichotomous burnout outcomes should consider reporting multiple prevalence estimates using a range of cutoff scores. Third, given limitations in the MBI, the most common measurement tool for burnout, researchers should consider using other tools, such as the Copenhagen Burnout Inventory, that explicitly avoid these conceptual problems and are freely available in the public domain.197
Fourth, to better capture the broader adverse effects of physician stress, researchers should consider using validated instruments to longitudinally assess for concurrent depression, anxiety, substance abuse, and medical illness along with consistent measures of the subjective and workplace factors that shape the physician experience (eg, hours worked and compensation). Fifth, researchers should also more strictly adhere to the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines.
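The first 2 recommendations above can be combined in a single report, as in the following minimal Python sketch: a subscale is summarized as a continuous measure, and prevalence is then shown at several cutoff scores rather than at one. The scores and the alternative cutoffs of 20 and 36 are hypothetical values chosen for illustration; only the cutoff of 27 comes from this review.

```python
from statistics import mean, median


def prevalence_at_cutoffs(scores, cutoffs):
    """Proportion of respondents scoring at or above each cutoff."""
    return {c: sum(s >= c for s in scores) / len(scores) for c in cutoffs}


# Hypothetical emotional exhaustion totals for 10 respondents.
scores = [5, 12, 18, 22, 27, 27, 30, 35, 41, 48]

# A range of cutoffs; 27 is the most common in this review, 20 and 36 are
# arbitrary alternatives used only to show the sensitivity of the estimate.
cutoffs = [20, 27, 36]

# Continuous summary first.
print(f"mean = {mean(scores):.1f}, median = {median(scores):.1f}")

# Dichotomized prevalence at each cutoff.
for cutoff, prev in prevalence_at_cutoffs(scores, cutoffs).items():
    print(f"score >= {cutoff}: {prev:.0%}")
```

In this toy sample the prevalence of high emotional exhaustion is 70%, 60%, or 20% depending on the cutoff, whereas the mean (26.5) and median (27.0) do not depend on any threshold.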
This study has several limitations. First, because the aim of the review was to estimate burnout prevalence, it excluded studies of burnout that did not report prevalence estimates. Second, the data were derived from studies with assorted designs, assessment instruments, and physician demographics, and the analyses were inherently limited by the ongoing nosological debate in the literature over what constitutes a case of burnout. Third, the studies included in the analysis focused disproportionately on the measurement of burnout among physicians in the United States and Europe. Fourth, the analysis relied on aggregated published data from the peer-reviewed literature and did not consider non–peer-reviewed data sources, such as informal annual surveys by Medscape.219
In this systematic review, there was substantial variability in prevalence estimates of burnout among physicians and marked variation in burnout definitions, assessment methods, and study quality. These findings preclude definitive conclusions about the prevalence of burnout and highlight the importance of developing a consensus definition of burnout and of standardizing measurement tools to assess the effects of chronic occupational stress on physicians.
Corresponding Author: Douglas A. Mata, MD, MPH, Program in Molecular Pathological Epidemiology, Department of Pathology, Brigham and Women’s Hospital, Brigham Education Institute, Harvard Medical School, 75 Francis St, Boston, MA 02115-6106 (dmata@bwh.harvard.edu).
Accepted for Publication: August 9, 2018.
Author Contributions: Dr Mata had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: Rotenstein, Mata.
Acquisition, analysis, or interpretation of data: All authors.
Drafting of the manuscript: Rotenstein, Ramos, Mata.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Mata.
Obtained funding: Guille, Sen, Mata.
Administrative, technical, or material support: Guille, Sen, Mata.
Supervision: Guille, Sen, Mata.
Conflict of Interest Disclosures: All authors have completed and submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest and none were reported.
Funding/Support: This study received funding from the National Institutes of Health (grant R01MH101459 to Dr Sen).
Role of the Funder/Sponsor: The study funder had no role in the design and conduct of the study; collection, management, analysis, or interpretation of the data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication.
Disclaimer: The opinions, results, and conclusions reported in this article are those of the authors and are independent from the funding sources.
Data Sharing Statement: See Supplement 2.
6. Maslach C, Jackson SE, Leiter MP. Maslach Burnout Inventory Manual. 4th ed. Menlo Park, CA: Mind Garden Inc; 2016.
10. Moher D, Liberati A, Tetzlaff J, Altman DG; PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Open Med. 2009;3(3):e123-e130.
11. Stroup DF, Berlin JA, Morton SC, et al; Meta-analysis of Observational Studies in Epidemiology (MOOSE) Group. Meta-analysis of observational studies in epidemiology: a proposal for reporting. JAMA. 2000;283(15):2008-2012. doi:10.1001/jama.283.15.2008
13. Grassi L, Magnani K. Psychiatric morbidity and burnout in the medical profession: an Italian study of general practitioners and hospital physicians. Psychother Psychosom. 2000;69(6):329-334. doi:10.1159/000012416
14. Marôco J, Marôco AL, Leite E, Bastos C, Vazão MJ, Campos J. Burnout in Portuguese healthcare professionals: an analysis at the national level [in Portuguese]. Acta Med Port. 2016;29(1):24-30.
16. Riquelme I, Chacón J-I, Gándara A-V, et al; PAINBO Study Group. Prevalence of burnout among pain medicine physicians and its potential effect upon clinical outcomes in patients with oncologic pain or chronic pain of nononcologic origin. Pain Med. 2018. doi:10.1093/pm/pnx335
23. Asai M, Morita T, Akechi T, et al. Burnout and psychiatric morbidity among physicians engaged in end-of-life care for cancer patients: a cross-sectional nationwide survey in Japan. Psychooncology. 2007;16(5):421-428. doi:10.1002/pon.1066
28. Chivato Pérez T, Campos Andreu A, Negro Alvarez JM, Caballero Martínez F. Professional burnout and work satisfaction in Spanish allergists: analysis of working conditions in the specialty. J Investig Allergol Clin Immunol. 2011;21(1):13-21.
30. Escribà-Agüir V, Pérez-Hoyos S. Psychological well-being and psychosocial work environment characteristics among emergency medical and nursing staff. Stress Health. 2007;23(3):153-160. doi:10.1002/smi.1131
31. Goehring C, Bouvier Gallacchi M, Künzi B, Bovier P. Psychosocial and professional characteristics of burnout in Swiss primary care practitioners: a cross-sectional survey. Swiss Med Wkly. 2005;135(7-8):101-108.
35. Lesage F-X, Berjot S, Altintas E, Paty B. Burnout among occupational physicians: a threat to occupational health systems? a nationwide cross-sectional survey. Ann Occup Hyg. 2013;57(7):913-919. doi:10.1093/annhyg/met013
38. Nishimura K, Nakamura F, Takegami M, et al; J-ASPECT Study Group. Cross-sectional survey of workload and burnout among Japanese physicians working in stroke care: the nationwide survey of acute stroke care capacity for proper designation of comprehensive stroke center in Japan (J-ASPECT) study. Circ Cardiovasc Qual Outcomes. 2014;7(3):414-422. doi:10.1161/CIRCOUTCOMES.113.000159
39. O’Kelly F, Manecksha RP, Quinlan DM, et al. Rates of self-reported “burnout” and causative factors amongst urologists in Ireland and the UK: a comparative cross-sectional study. BJU Int. 2016;117(2):363-372. doi:10.1111/bju.13218
41. Pedersen AF, Andersen CM, Olesen F, Vedsted P. Risk of burnout in Danish GPs and exploration of factors associated with development of burnout: a two-wave panel study. Int J Family Med. 2013;2013:603713. doi:10.1155/2013/603713
65. Arayago R, Gonzalez A, Limongi M, Guevara H. Síndrome de burnout en residentes y especialistas de anestesiología. Salus. 2016;20(1):13-21.
70. Barros D de S, Tironi MOS, Nascimento Sobrinho CL, et al. Intensive care unit physicians: socio-demographic profile, working conditions and factors associated with burnout syndrome [in Portuguese]. Rev Bras Ter Intensiva. 2008;20(3):235-240.
79. Coleman M, Dexter D, Nankivil N. Factors affecting physician satisfaction and Wisconsin Medical Society strategies to drive change. WMJ. 2015;114(4):135-142.
83. Das S, Barman S, Datta S, et al. Degree of burnout among emergency healthcare workers and factors influencing level of burnout: a pilot study. Delhi Psychiatry J. 2016;19(1):36-47.
92. Eelen S, Bauwens S, Baillon C, Distelmans W, Jacobs E, Verzelen A. The prevalence of burnout among oncology professionals: oncologists are at risk of developing burnout. Psychooncology. 2014;23(12):1415-1422. doi:10.1002/pon.3579
109. Hagau N, Pop RS. Prevalence of burnout in Romanian anaesthesia and intensive care physicians and associated factors. J Rom Anest Ter Intensiva. 2012;19:117-124.
111. Hämmig O, Brauchli R, Bauer GF. Effort-reward and work-life imbalance, general stress and burnout among employees of a large public hospital in Switzerland. Swiss Med Wkly. 2012;142:w13577. doi:10.4414/smw.2012.13577
114. Hinami K, Whelan CT, Miller JA, Wolosin RJ, Wetterneck TB; Society of Hospital Medicine Career Satisfaction Task Force. Job characteristics, satisfaction, and burnout across hospitalist practice models. J Hosp Med. 2012;7(5):402-410. doi:10.1002/jhm.1907
120. Kase SM, Waldman ED, Weintraub AS. A cross-sectional pilot study of compassion fatigue, burnout, and compassion satisfaction in pediatric palliative care providers in the United States [published online February 5, 2018]. Palliat Support Care. doi:10.1017/S1478951517001237
122. Kroll HR, Macaulay T, Jesse M. A preliminary survey examining predictors of burnout in pain medicine physicians in the United States. Pain Physician. 2016;19(5):E689-E696.
127. Lee FJ, Stewart M, Brown JB. Stress, burnout, and strategies for reducing them: what’s the situation among Canadian family physicians? Can Fam Physician. 2008;54(2):234-235.
132. Margaryan AG. Burnout in primary health care physicians: a pilot study. New Armen Med J. 2010;4(2):76-79.
133. Martínez de la Casa Muñoz A, del Castillo Comas C, Magaña Loarte E, Bru Espino I, Franco Moreno A, Segura Fragoso A. Study of the prevalence of burnout in doctors in the health area of Talavera de la Reina [in Spanish]. Aten Primaria. 2003;32(6):343-348.
134. Massou S, Doghmi N, Belhaj A, et al. Enquête sur le syndrome d’épuisement professionnel chez les personnels d’anesthésie réanimation de quatre hôpitaux universitaires marocains. Ann Medicopsychol Rev Psychiatr. 2013;171(8):538-542. doi:10.1016/j.amp.2012.02.024
138. Meynaar IA, van Saase J, Feberwee T, Aerts TM, Bakker J, Thijsse W. Burnout among Dutch intensivists—a nationwide survey. Neth J Crit Care. 2016;24(1):12-17.
140. Mikalauskas A, Širvinskas E, Marchertienė I, et al. Burnout among Lithuanian cardiac surgeons and cardiac anesthesiologists. Medicina (Kaunas). 2012;48(9):478-484.
152. Rohland BM, Kruse GR, Rohrer JE. Validation of a single-item measure of burnout against the Maslach Burnout Inventory among physicians. Stress Health. 2004;20(2):75-79. doi:10.1002/smi.1002
153. Ruitenburg MM, Frings-Dresen MHW, Sluiter JK. The prevalence of common mental disorders among hospital physicians and their association with self-reported work ability: a cross-sectional study. BMC Health Serv Res. 2012;12:292-298. doi:10.1186/1472-6963-12-292
154. Sadat-Ali M, Al-Habdan IM, Al-Dakheel DA, Shriyan D. Are orthopedic surgeons prone to burnout? Saudi Med J. 2005;26(8):1180-1182.
159. See KC, Lim TK, Kua EH, Phua J, Chua GS, Ho KY. Stress and burnout among physicians: prevalence and risk factors in a Singaporean internal medicine programme. Ann Acad Med Singapore. 2016;45(10):471-474.
161. Sharma A, Sharp DM, Walker LG, Monson JRT. Stress and burnout in colorectal and vascular surgical consultants working in the UK National Health Service. Psychooncology. 2008;17(6):570-576. doi:10.1002/pon.1269
165. Siu C, Yuen SK, Cheung A. Burnout among public doctors in Hong Kong: cross-sectional survey. Hong Kong Med J. 2012;18(3):186-192.
172. Surgenor LJ, Spearing RL, Horn J, Beautrais AL, Mulder RT, Chen P. Burnout in hospital-based medical consultants in the New Zealand public health system. N Z Med J. 2009;122(1300):11-18.
174. Teixeira C, Ribeiro O, Fonseca AM, Carvalho AS. Burnout in intensive care units—a consideration of the possible prevalence and frequency of new risk factors: a descriptive correlational multicentre study. BMC Anesthesiol. 2013;13(1):38. doi:10.1186/1471-2253-13-38
176. Travado L, Grassi L, Gil F, Ventura C, Martins C; Southern European Psycho-Oncology Study Group. Physician-patient communication among Southern European cancer physicians: the influence of psychosocial orientation and burnout. Psychooncology. 2005;14(8):661-670. doi:10.1002/pon.890
181. Volpe U, Luciano M, Palumbo C, Sampogna G, Del Vecchio V, Fiorillo A. Risk of burnout among early career mental health professionals. J Psychiatr Ment Health Nurs. 2014;21(9):774-781. doi:10.1111/jpm.12137
193. Zafar W, Khan UR, Siddiqui SA, Jamali S, Razzak JA. Workplace violence and self-reported psychological health: coping with post-traumatic stress, mental distress, and burnout among physicians working in the emergency departments compared to other specialties in Pakistan. J Emerg Med. 2016;50(1):167-77.e1. doi:10.1016/j.jemermed.2015.02.049
199. Malakh-Pines A, Aronson E, Kafry D. Burnout: From Tedium to Personal Growth. New York, NY: Free Press; 1981.
201. Shimotsu S, Poplau S, Linzer M. Validation of a brief clinician survey to reduce clinician burnout. In: Abstracts from the 38th Annual Meeting of the Society of General Internal Medicine. J Gen Intern Med. 2015;30(suppl 2):S79-S80. doi:10.1007/s11606-015-3271-0
206. Maslach C, Jackson SE, Leiter MP. Maslach Burnout Inventory Manual. 3rd ed. Menlo Park, CA: Mind Garden Inc; 1996.
209. Schutte N, Toppinen S, Kalimo R, Schaufeli W. The factorial validity of the Maslach Burnout Inventory–General Survey (MBI-GS) across occupational groups and nations. J Occup Organ Psychol. 2000;73(1):53-66. doi:10.1348/096317900166877
213. Wenger N, Méan M, Castioni J, Marques-Vidal P, Waeber G, Garnier A. Allocation of internal medicine resident time in a Swiss hospital: a time and motion study of day and evening shifts. Ann Intern Med. 2017;166(8):579-586. doi:10.7326/M16-2238
216. Hallsten L. Burning out: a framework. In: Professional Burnout: Recent Developments in Theory and Research. Philadelphia, PA: Taylor & Francis; 1993. Series in Applied Psychology: Social Issues and Questions.
220. Golembiewski R, Munzenrider R, Stevenson J. Physical symptoms and burn-out phases. In: Moise LR, ed. Organizational Policy and Development. Louisville, KY: Center for Continuing Education Studies, University of Louisville; 1984:71-86.