Shojania KG, Burton EC, McDonald KM, Goldman L. Changes in Rates of Autopsy-Detected Diagnostic Errors Over Time: A Systematic Review. JAMA. 2003;289(21):2849–2856. doi:10.1001/jama.289.21.2849
Author Affiliations: Department of Medicine, University of California, San Francisco (Drs Shojania and Goldman); Department of Pathology and Laboratory Medicine, Baylor Health Care System, Dallas-Fort Worth, Tex (Dr Burton); and Center for Primary Care and Outcomes Research, Stanford University, Stanford, Calif (Ms McDonald).
Context Substantial discrepancies exist between clinical diagnoses and findings
at autopsy. Autopsy may be used as a tool for quality management to analyze
Objective To determine the rate at which autopsies detect important, clinically
missed diagnoses, and the extent to which this rate has changed over time.
Data Sources A systematic literature search for English-language articles available
on MEDLINE from 1966 to April 2002, using the search terms autopsy, postmortem changes, post-mortem, postmortem, necropsy, and posthumous, identified 45 studies
reporting 53 distinct autopsy series meeting prospectively defined criteria.
Reference lists were reviewed to identify additional studies, and the final
bibliography was distributed to experts in the field to identify missing or
Study Selection Included studies reported clinically missed diagnoses involving a primary
cause of death (major errors), with the most serious being those likely to
have affected patient outcome (class I errors).
Data Extraction Logistic regression was performed using data from 53 distinct autopsy
series over a 40-year period and adjusting for the effects of changes in autopsy
rates, country, case mix (general autopsies; adult medical; adult intensive
care; adult or pediatric surgery; general pediatrics or pediatric inpatients;
neonatal or pediatric intensive care; and other autopsy), and important methodological
features of the primary studies.
Data Synthesis Of 53 autopsy series identified, 42 reported major errors and 37 reported
class I errors. Twenty-six autopsy series reported both major and class I
error rates. The median error rate was 23.5% (range, 4.1%-49.8%) for major
errors and 9.0% (range, 0%-20.7%) for class I errors. Analyses of diagnostic
error rates adjusting for the effects of case mix, country, and autopsy rate
yielded relative decreases per decade of 19.4% (95% confidence interval [CI],
1.8%-33.8%) for major errors and 33.4% (95% CI, 8.4%-51.6%) for class I
errors. Despite these decreases, we estimated that a contemporary US institution
(based on autopsy rates ranging from 100% [the extrapolated extreme at which
clinical selection is eliminated] to 5% [roughly the national average]), could
observe a major error rate from 8.4% to 24.4% and a class I error rate from
4.1% to 6.7%.
Conclusion The possibility that a given autopsy will reveal important unsuspected
diagnoses has decreased over time, but remains sufficiently high that encouraging
ongoing use of the autopsy appears warranted.
Beginning in 19121 and continuing to
have documented substantial discrepancies between clinical diagnoses and findings
at autopsy. Time series from single institutions have examined trends in these
diagnostic discrepancies and, with only 1 exception,7 have
found no significant decreases over time.8-12
A possible explanation for the stability of these rates is increased
selection by clinicians. In 1994, the last year for which national data exist,
the autopsy rate for all nonforensic deaths decreased to less than 6%13 compared with average rates of 30% to 40% in the
1960s.14 With progressively fewer autopsies
performed over time, clinical selection for diagnostically challenging cases
might offset true gains in diagnostic accuracy. However, several prospective
studies have shown clinicians to have little ability to identify cases that
will yield "diagnostic surprises,"15-18 so
clinical selection might exert little effect on rates of autopsy-detected
As part of a broader report on the autopsy as a tool for quality measurement
and improvement,19 we systematically reviewed
the literature to estimate the frequency with which autopsy reveals important,
clinically missed diagnoses. We sought to assess the degree to which this
frequency has changed over time, and the extent to which clinical selection
for diagnostically challenging cases accounts for the substantial error rates
that continue to be reported in autopsy studies.
We searched the MEDLINE database for English-language articles (1966-April
2002) using Medical Subject Heading terms autopsy and postmortem changes, and the title words autopsy, post-mortem, postmortem, necropsy, and posthumous. We then applied terms capturing aspects of study design (eg, epidemiologic studies, clinical trials) and topics relating
to diagnosis (eg, diagnostic errors, diagnostic techniques
and procedures) or error (eg, medical error, iatrogenic disease, safety). Reference
lists from all relevant articles were reviewed to identify additional studies,
and the final bibliography was distributed to experts in the field to identify
missing or unpublished studies.
Included studies met the following criteria:
1. Consecutive autopsies with well-defined selection criteria (eg, all
adults dying after hospital arrival and undergoing autopsy during a specified
period) or random samples from such series; "convenience samples" and consecutive
series missing more than 20% of eligible cases were excluded.
2. Clinical diagnoses derived from autopsy request forms submitted by
clinicians or chart review performed by study investigators; assessments of
clinical diagnoses based primarily on death certificates were excluded.
3. Classification of autopsy-detected errors in clinical diagnoses according
to generally accepted classification schemes8,20;
major errors defined as clinically missed diagnoses involving a principal
underlying disease or primary cause of death; and class I errors, major errors
that, had they been detected during life, "would," "could," "possibly," or
"might" have affected patient prognosis or outcome (at a minimum, discharge
from the hospital alive). Studies that made no distinction between changes
in management and changes in outcome were deemed to be reporting major errors
Studies reporting autopsy data from multiple institutions or observation
periods were analyzed as separate series whenever possible. We did not restrict
our review to studies of inhospital deaths, although an overwhelming majority
of studies involved inpatient autopsies predominantly or, in many cases, exclusively.
Diagnostic error rates were modeled using logistic regression analyses
with country, study period, case mix, and autopsy rate as predictors and including
a random study effect.21 Hospital teaching
status was not included as a predictor, because too few studies involved nonteaching
hospitals and because the nature of the teaching status was often unclear.
Autopsy rates and time were modeled as continuous variables, with the
value for time designated as the midpoint of the study period. Country was
simplified to United States or non-United States, but case mix was modeled
as a nonordinal variable with the categories of (1) general autopsies, (2)
adult medical, (3) adult intensive care, (4) adult or pediatric surgery, (5)
general pediatrics or pediatric inpatients, (6) neonatal or pediatric intensive
care, and (7) other. The first category, which constituted the base case mix
in the regression analysis, included series reporting general autopsies (all
ages, specialties, and settings), general inpatients (all ages and specialties),
and general adult inpatients. We combined these 3 populations because many
studies provided insufficient detail to permit reliable distinctions between
these 3 and because the contribution of adult inpatients dominated samples
of all 3 types.
In anticipation of methodological heterogeneity among the studies, we
abstracted each article for key study features plausibly related to observed
error rates. These study features included (1) cohort design (prospective
vs retrospective); (2) clarity of error definition (whether class I and major
errors were defined using illustrative examples or if the results included
a complete listing of all clinical-autopsy discrepancies designated as class
I or major errors); (3) source of clinical diagnoses (chart review vs autopsy
request forms); and (4) involvement of clinicians in classifying errors. The regression
models for major and class I error rates incorporated each of these methodological
characteristics as categorical variables.
We identified 45 studies2-8,12,17,18,20,22-55 reporting
a total of 53 distinct autopsy series meeting our inclusion criteria (Table 1a). More studies reported major errors
than reported class I errors (42 and 37 series, respectively). Twenty-six
series reported both types of errors and just over half of the series (27)
involved US institutions.
Although numerous studies met our inclusion criteria and offered a wide
range of predictor variables for the regression model (Table 2), many of the studies exhibited methodological limitations.
The vast majority of autopsy series were assembled retrospectively, and only
half performed chart review to obtain clinical diagnoses. Clinicians played
a primary role in classifying diagnostic errors in two thirds of series (Table 3).
The median major error rate was 23.5%, although rates ranged from 4.1%
to 49.8%, with the upper bound reflecting the only series focused on postoperative
deaths.44 The median class I error rate was
9.0%, but rates ranged from 0% to 20.7%, with the upper bound again corresponding
to the series focused on postoperative deaths.44 The
study reporting zero class I errors involved pediatric deaths from an emergency
department.49 The authors attributed the absence
of class I errors to the high proportion of deaths following cardiac arrest,
in which survival depends predominantly on the adequacy of resuscitation rather
than the accuracy of clinical diagnosis.
Compared with US studies, autopsy series from outside the United States
exhibited a slight but statistically significant trend toward higher major
error rates (odds ratio [OR], 1.15; 95% confidence interval [CI], 1.01-1.31; P = .03). For class I errors, the effect was of comparable
magnitude and bordered on statistical significance (OR, 1.26; 95% CI, 0.99-1.59; P = .06) (Table 4).
Autopsy rates ranged from 12% to 100% (median, 37.0%). Relative to the
error rate in 1980 (the midpoint of the 40-year period spanned by the included
studies), major errors decreased at a rate of 12.4% (95% CI, 7.0%-17.6%) for
every 10% increase in autopsy rates. Class I errors decreased at a rate of
17.4% (95% CI, 6.6%-27.1%) for every 10% increase in autopsies.
Autopsy series restricted to surgical patients reported significantly
higher rates of both major errors (OR, 2.16; 95% CI, 1.53-3.06) and class
I errors (OR, 3.01; 95% CI, 1.66-5.43). Series limited to adult medical patients
reported higher class I error rates (OR, 1.84; 95% CI, 1.06-3.20); US series
from adult intensive care units also had higher class I error rates (OR, 2.12;
95% CI, 1.42-3.16). Conversely, pediatric series reported significantly lower
rates of major errors, and series involving pediatric or neonatal intensive
care autopsies reported significantly lower class I error rates (OR, 0.56;
95% CI, 0.32-0.98) (Table 4).
None of the 4 methodological features shown in Table 3 significantly affected major error rates, but 2 methodological
features significantly affected class I error rates (Table 4). Studies conducted prospectively reported higher class
I error rates (OR, 1.63; 95% CI, 1.19-2.23), as did studies in which clinicians
played active roles in classifying errors (OR, 2.09; 95% CI, 1.31-3.34).
Adjusting for the effects of country, case mix, and autopsy rates, major
errors significantly decreased over time, with a relative reduction of 19.4%
per decade (95% CI, 1.8%-33.8%). Adjusting for these same factors as well
as the 2 significant methodological features (prospective study design and
clinician participation in classifying errors), class I error rates also
decreased significantly over time, with a relative reduction of 33.4% per
decade (95% CI, 8.4%-51.6%).
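As a rough illustration of what these per-decade figures mean in a logistic model, the sketch below converts each reported relative reduction into an odds ratio per decade and a log-odds slope per year. It assumes the relative reductions apply on the odds scale and that time enters the model linearly; the function name `per_decade_to_log_odds_slope` is hypothetical and not taken from the original analysis.

```python
import math


def per_decade_to_log_odds_slope(relative_reduction):
    """Return (odds ratio per decade, log-odds change per year).

    Assumes the reported relative reduction is a reduction in odds,
    which is an interpretation, not something the text states outright.
    """
    if not 0 <= relative_reduction < 1:
        raise ValueError("relative_reduction must be in [0, 1)")
    odds_ratio_decade = 1.0 - relative_reduction
    slope_per_year = math.log(odds_ratio_decade) / 10.0
    return odds_ratio_decade, slope_per_year


# Point estimates and 95% CI bounds as reported in the text.
reported = {
    "major": (0.194, 0.018, 0.338),
    "class I": (0.334, 0.084, 0.516),
}

for label, values in reported.items():
    for name, value in zip(("estimate", "CI lower", "CI upper"), values):
        odds_ratio, slope = per_decade_to_log_odds_slope(value)
        print(f"{label:8s} {name:9s} OR/decade={odds_ratio:.3f} slope/yr={slope:+.4f}")
```

Under this reading, a larger relative reduction corresponds to a smaller odds ratio, so the lower bound of each reduction interval maps to the upper bound of the odds ratio interval.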
Despite these decreases, we estimated that a contemporary US institution
with an autopsy rate of 5% (roughly the national average13)
could observe a major error rate of 24.4% (95% CI, 18.8%-31.1%) and a class
I error rate of 6.7% (95% CI, 3.8%-11.4%) (Figure 1 and Figure 2,
respectively). With an autopsy rate of 37% (the median rate in the included
studies), major and class I error rates in the same institution would be estimated
as 17.4% (Figure 1) and 5.8% (Figure 2). This autopsy rate is much higher
than the rates of 15% to 20% typically achieved in contemporary teaching hospitals56 and, therefore, has less clinical selection. Even
with extrapolation to an autopsy rate of 100% (to eliminate the effect of
clinical selection completely), a US institution in 2000 would be estimated
to report a major error rate of 8.4% (95% CI, 5.2%-13.1%) and a class I error
rate of 4.1% (95% CI, 1.6%-9.9%) (Figure 1 and Figure 2, respectively).
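The relationship between these major error estimates and the reported autopsy-rate effect can be sketched as follows. The sketch assumes that the 12.4% decrease per 10% increase in autopsy rate is a reduction in odds per 10 percentage points, and the helper `project_rate` is hypothetical. Starting from the 24.4% major error rate at a 5% autopsy rate, it projects the rate at autopsy rates of 37% and 100%. Class I errors are not attempted here, because that model also adjusts for methodological features that this simple projection ignores.

```python
def project_rate(base_rate, base_autopsy_pct, target_autopsy_pct, reduction_per_10_points):
    """Project an error rate to a different autopsy rate on the odds scale.

    Assumption: each 10-percentage-point rise in autopsy rate multiplies
    the odds of error by (1 - reduction_per_10_points).
    """
    base_odds = base_rate / (1.0 - base_rate)
    steps = (target_autopsy_pct - base_autopsy_pct) / 10.0
    odds = base_odds * (1.0 - reduction_per_10_points) ** steps
    return odds / (1.0 + odds)


# Reported inputs: 24.4% major error rate at a 5% autopsy rate,
# 12.4% relative decrease per 10% increase in autopsy rate.
rate_at_37 = project_rate(0.244, 5, 37, 0.124)
rate_at_100 = project_rate(0.244, 5, 100, 0.124)
print(f"autopsy rate 37%:  {rate_at_37:.1%}")
print(f"autopsy rate 100%: {rate_at_100:.1%}")
```

Under these assumptions the projections come out close to the 17.4% and 8.4% major error rates given in the text, which suggests the reported relative decreases are best read as changes in odds.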
By analyzing the results of 53 distinct autopsy series over a 40-year
period, we have shown statistically significant decreases over time for major
and class I diagnostic errors detected at autopsy. By contrast, individual
studies comparing rates of autopsy-detected diagnostic errors from different
periods have found strikingly unchanged error rates.8,12,57,58 These
previous results almost certainly reflect inadequate power, as well as the
competing effects of improvements over time and increased clinical selection
as autopsy rates decrease. In fact, the only study with high and nearly equal
autopsy rates in all periods examined showed a significant decrease in major
errors over time.7
The present data suggest that, among the approximately 850 000
individuals dying in US hospitals each year,59,60 a
major diagnosis remains clinically undetected in at least 8.4% of cases (71 400
deaths). The data also suggest that approximately 34 850 of these patients
might have survived to discharge had misdiagnosis not occurred, but this estimate
depends on the accuracy of the designation of class I error. Although this
second number is more speculative, given the dependence of class I error estimates
on methodological features of the primary studies, it can be considered in
the context of the Institute of Medicine's estimates of 44 000 to 98 000
preventable deaths per year due to medical error.61 These
latter estimates have been debated,62-64 but
the studies from which they were derived may not have detected many of the
errors reported in our analysis.
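The arithmetic behind these two counts can be checked with a short sketch. It applies the extrapolated error rates at a 100% autopsy rate (8.4% for major errors, 4.1% for class I errors) to the approximate 850 000 annual US hospital deaths cited above. The function name is hypothetical, and rates are passed per thousand so that integer arithmetic avoids rounding noise.

```python
def implied_deaths(annual_deaths, rate_per_thousand):
    """Number of deaths implied by an error rate expressed per 1000 deaths."""
    return annual_deaths * rate_per_thousand // 1000


ANNUAL_US_HOSPITAL_DEATHS = 850_000  # approximate figure cited in the text

major = implied_deaths(ANNUAL_US_HOSPITAL_DEATHS, 84)    # 8.4% major error rate
class_one = implied_deaths(ANNUAL_US_HOSPITAL_DEATHS, 41)  # 4.1% class I error rate

print(f"major errors:   {major:,}")
print(f"class I errors: {class_one:,}")
```

Both counts inherit the uncertainty of the underlying rates, so they are better read as orders of magnitude than as precise totals.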
A major limitation of any systematic review is the possibility of publication
bias. Problems with existing methods of assessing publication bias65,66 are compounded by the opposing directions
in which publication bias might operate. Lack of interest might result in
fewer published reports of low error rates, whereas self-censorship might
reduce reports of high error rates. Regardless of the true net effect of publication
bias on autopsy studies, if this effect has remained reasonably stable, the
observed decrease in published error rates over time would still be meaningful.
Only 5 studies7,31,39,41,52 addressed
the issue of reproducibility for the classification of autopsy-detected diagnostic
errors, and none provided sufficient detail to permit calculation of formal
measures of agreement. The issue of reproducibility is particularly important
for class I errors, as no study used validated criteria to guide reviewers'
judgments about effects on prognosis, which are known to exhibit substantial
Even more fundamental than reproducibility of the error classifications
is the question of the autopsy's characteristics as a diagnostic test. Determining
the sensitivity of any criterion standard, including the autopsy, presents
difficulties. As reviewed in greater detail elsewhere,19 technically
adequate autopsies fail to establish the cause of death in 1% to 5% of cases,
although some studies have reported substantially higher rates of persistent
diagnostic uncertainty after autopsy,43 especially
in perinatal deaths.67-69
Only 1 study70 has assessed agreement
among pathologists in determining principal underlying diseases and causes
of death. Four pathologists independently reviewing 35 autopsies reported
excellent to near perfect agreement for determining the principal disease
(ie, underlying cause of death), with κ values between 0.83 and 0.97
for the different pathologist pairs. For assignments of the immediate cause
of death, however, the pathologists exhibited only moderate to substantial
agreement (κ values ranging from 0.43-0.75).
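For readers unfamiliar with the κ statistic, the sketch below shows how Cohen's κ is computed for one pair of raters: observed agreement is corrected for the agreement expected by chance from each rater's category frequencies. The cause-of-death labels are invented for illustration and are not data from the cited study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters labelling the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical immediate-cause-of-death assignments for 10 autopsies.
pathologist_1 = ["sepsis"] * 5 + ["embolism"] * 5
pathologist_2 = ["sepsis"] * 4 + ["embolism"] * 5 + ["sepsis"]

print(f"kappa = {cohen_kappa(pathologist_1, pathologist_2):.2f}")
```

In this invented example the raters agree on 8 of 10 cases, but half of that agreement would be expected by chance, so κ is well below the raw agreement of 80%.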
We used the term error throughout our analysis because of its ubiquitous
presence in the autopsy literature. However, it remains unclear to what extent
clinically missed diagnoses represent errors per se, rather than acceptable
limits of antemortem diagnosis in the face of atypical clinical presentations.
In fact, because the vast majority of autopsy studies come from teaching hospitals,
published autopsy series may be enriched for atypical cases. Nonetheless,
the autopsy has historically helped define how cases that previously appeared
atypical could more commonly be recognized antemortem. Repeated detection
of certain missed diagnoses may result in the recognition that some patterns
of presentation are more typical than previously appreciated.
For many physicians, interest in the autopsy as a means of detecting
clinically missed diagnoses is undoubtedly offset by concerns over litigation.
Only 1 study34 explicitly addressed the question
of whether autopsy findings influence malpractice claims. In this series of
176 autopsies from the University of Pittsburgh Medical Center (Pittsburgh,
Pa) in 1994, follow-up of all cases after the statute of limitations on malpractice
suits had expired identified only 1 malpractice suit. Review of the hospital
record indicated that the intent to proceed to litigation in that case had
become clear prior to the patient's death.
In addition to their intrinsic clinical interest, missed diagnoses detected
at autopsy may have important implications for research. Health services researchers
are accustomed to the problem that administrative databases contain systematic
errors and biases compared with the medical record.71-73 The
data presented here indicate that the medical record itself contains substantial
inaccuracies regarding the principal diagnoses causing or contributing to
death. Since principal diagnoses and causes of death are determined without
autopsy in the vast majority of cases, vital statistics, clinical registries,
and even randomized trials capture incorrect causes of death at rates comparable
with the major error rates in our analysis. These inaccuracies have important
policy implications, as major funding and policy decisions derive in part
from vital statistics and other estimates of disease burden.74-76
Correcting such inaccuracies would not require substantial increases
in autopsies at all hospitals. Perhaps a small group of hospitals funded to
perform autopsies in a high percentage of deaths and according to a uniform
protocol could generate accurate error rates appropriate for correcting the
information contained in routinely generated death certificates and other
epidemiological databases. Data from such a program would also provide the
opportunity to develop an approach to enhancing the selection of autopsies
likely to reveal important unsuspected diagnoses. Explicit selection of autopsy
cases on the basis of diagnostic uncertainty would represent an advance over
current autopsy selection, which is likely determined in large part by patients'
demographic characteristics (especially age77-79)
and by clinicians' comfort in requesting autopsy.80,81 Most
importantly, further research conducted in centers with high autopsy rates
would permit development of strategies for using autopsy findings to improve
subsequent clinical performance.