Figure. The categories of factors contributing to diagnostic error in 100 patients.
Mark L. Graber, Nancy Franklin, Ruthanna Gordon. Diagnostic Error in Internal Medicine. Arch Intern Med. 2005;165(13):1493–1499. doi:10.1001/archinte.165.13.1493
The goal of this study was to determine the relative contribution of system-related and cognitive components to diagnostic error and to develop a comprehensive working taxonomy.
One hundred cases of diagnostic error involving internists were identified through autopsy discrepancies, quality assurance activities, and voluntary reports. Each case was evaluated to identify system-related and cognitive factors underlying error using record reviews and, if possible, provider interviews.
Ninety cases involved injury, including 33 deaths. The underlying contributions to error fell into 3 natural categories: “no fault,” system-related, and cognitive. Seven cases reflected no-fault errors alone. In the remaining 93 cases, we identified 548 different system-related or cognitive factors (5.9 per case). System-related factors contributed to the diagnostic error in 65% of the cases and cognitive factors in 74%. The most common system-related factors involved problems with policies and procedures, inefficient processes, teamwork, and communication. The most common cognitive problems involved faulty synthesis. Premature closure, ie, the failure to continue considering reasonable alternatives after an initial diagnosis was reached, was the single most common cause. Other common causes included faulty context generation, misjudging the salience of findings, faulty perception, and errors arising from the use of heuristics. Faulty or inadequate knowledge was uncommon.
Diagnostic error is commonly multifactorial in origin, typically involving both system-related and cognitive factors. The results identify the dominant problems that should be targeted for additional research and early reduction; they also further the development of a comprehensive taxonomy for classifying diagnostic errors.
Once we realize that imperfect understanding is the human condition, there is no shame in being wrong, only in failing to correct our mistakes.—George Soros
In his classic studies of clinical reasoning, Elstein1 estimated the rate of diagnostic error to be approximately 15%, in reasonable agreement with the 10% to 15% error rate determined in autopsy studies.2-4 Considering the frequency and impact of diagnostic errors, one is struck by how little is known about this type of medical error.5 Data on the types and causes of errors encountered in the practice of internal medicine are scant, and the field lacks both a standardized definition of diagnostic error and a comprehensive taxonomy, although preliminary versions have been proposed.6-12
According to the Institute of Medicine, the most powerful way to reduce error in medicine is to focus on system-level improvements,13,14 but these interventions are typically discussed in regard to patient treatment issues. The possibility that system-level dysfunction could also contribute to diagnostic errors has received little attention. Typically, diagnostic error is viewed as a cognitive failing.7,15-17 Diagnosis reflects the clinician’s knowledge, clinical acumen, and problem-solving skills.1,18 In everyday practice, clinicians use expert skills to arrive at a diagnosis, often taking advantage of various mental shortcuts known as heuristics.16,18-21 These strategies are highly efficient, relatively effortless, and generally accurate, but they are not infallible.
The goal of this study was to clarify the basic etiology of diagnostic errors in internal medicine and to develop a working taxonomy. To understand how these errors arise and how they might be prevented in the future, we systematically examined the etiology of error using root cause analysis to classify both system-related and cognitive components.
Based on a classification used by the Australian Patient Safety Foundation, we defined diagnostic error operationally as a diagnosis that was unintentionally delayed (sufficient information was available earlier), wrong (another diagnosis was made before the correct one), or missed (no diagnosis was ever made), as judged from the eventual appreciation of more definitive information.
Cases of suspected diagnostic error were collected from 5 large academic tertiary care medical centers over 5 years. To obtain a broad sampling of errors, we reviewed all eligible cases from 3 sources:
Performance improvement and risk management coordinators and peer review committees.
Voluntary reports from staff physicians and resident trainees.
Discrepancies between clinical impressions and autopsy findings.
Cases were included if internists (staff specialists or generalists or trainees) were primarily responsible for the diagnosis and if sufficient details about the case and the decision-making process could be obtained to allow analysis. In all cases, details were gathered from a review of the medical record and, when available, from fact-finding information obtained in the course of quality assurance activities; in 42 cases, the involved practitioners were also interviewed, typically within 1 month of error identification. To minimize hindsight bias, reviews of medical records and interviews used a combination of open-ended queries and a root cause checklist developed by the Veterans Health Administration (VHA).22,23 The VHA instrument24,25 identifies specific flaws in the standard dimensions of organizational performance26,27 and is well suited to exploring system-related factors. We developed the cognitive factors portion of the taxonomy by incorporating and expanding on the categories suggested by Chimowitz et al,11 Kassirer and Kopelman,12 and Bordage.28 These categories differentiate flaws in the clinician’s knowledge and skills, ability to gather data, and ability to synthesize all available information into verifiable hypotheses. Criteria and definitions for each category were refined, and new categories added, as the study progressed.
Case histories were redacted of identifying information and analyzed as a team to the point of consensus by 1 internist and 2 cognitive psychologists to confirm the existence of a diagnostic error and to assign the error type (delayed, wrong, or missed) and both the system-related and the cognitive factors contributing to the error.29 To identify 100 usable cases, 129 cases of suspected error were reviewed, 29 of which were rejected. Definitive confirmation of an error was lacking in 19 cases. In 6 cases, the diagnosis was somewhat delayed but judged to have been made within an acceptable time frame. In 3 cases, the data were inadequate for analysis, and 1 case was rejected because the error reflected an intentional act that violated local policies (wrong diagnosis of hyponatremia from blood drawn above an intravenous line).
Impact was judged by an internist using a VHA scale that multiplies the likelihood of recurrence (1, remote; up to 4, frequent) by the severity of harm (1, minor injury; up to 4, catastrophic injury).24,25 A minor injury with a remote chance of recurrence received an impact score of 1, and a catastrophic event with frequent recurrence received an impact score of 16. Close-call errors were assigned an impact score of 0, and psychological impact was similarly discounted. The relative frequency of error types was compared using the Fisher exact test. The impact scores of different groups were compared by 1-way analysis of variance, and if a significant difference was found, group means were compared by a t test.
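The impact scale described above is a simple product, which a short Python sketch can make concrete. The function name `impact_score` and its argument names are illustrative assumptions and are not part of the VHA instrument; only the 1-to-4 ranges, the multiplication, and the score of 0 for close calls come from the text.

```python
def impact_score(likelihood: int, severity: int, close_call: bool = False) -> int:
    """Hypothetical helper: likelihood of recurrence (1-4) times severity of harm (1-4).

    Close-call errors are scored 0, as described in the text.
    """
    if close_call:
        return 0
    for name, value in (("likelihood", likelihood), ("severity", severity)):
        if value not in (1, 2, 3, 4):
            raise ValueError(f"{name} must be an integer from 1 to 4, got {value!r}")
    return likelihood * severity


# The two anchor points given in the text
print(impact_score(1, 1))  # minor injury, remote chance of recurrence
print(impact_score(4, 4))  # catastrophic injury, frequent recurrence
```

Because the score is a product of two ordinal ratings, it can take only the values 0, 1, 2, 3, 4, 6, 8, 9, 12, and 16, which is worth keeping in mind when reading the group means reported later.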
The study was approved by the institutional review board(s) at the participating institutions. Confidentiality protections were provided by the federal Privacy Act and the Tort Claims Act, by New York State statutes, and by a certificate of confidentiality from the US Department of Health and Human Services.
We analyzed 100 cases of diagnostic error: 57 from quality assurance activities, 33 from voluntary reports, and 10 from autopsy discrepancies. The error was revealed by tissue in 53 cases (19 autopsy specimens and 34 surgical or biopsy specimens), by definitive tests in 44 cases (24 x-ray studies and 20 laboratory investigations), and from pathognomonic clinical findings or procedure results in the remaining 3 cases. The diagnosis was wrong in 38 cases, missed in 34 cases, and delayed in 28 cases.
Ten cases were classified as close calls, and 90 cases involved some degree of harm, including 33 deaths. The clinical impact averaged 3.80 ± 0.28 (mean ± SEM) on the VHA impact scale, indicating substantial levels of harm, on average. The impact score tended to be lower in cases of delayed diagnosis than in cases that were missed or wrong (3.79 ± 0.52 vs 4.76 ± 0.40 and 4.47 ± 0.46; P = .34). Cases with solely cognitive factors or with mixed cognitive and system-related factors had significantly higher impact scores than cases with only system-related factors (4.11 ± 0.46 and 4.27 ± 0.47 vs 2.54 ± 0.55; P = .03 for both comparisons). These 2 effects may be related, as delays were the type of error most likely to result from system-related factors alone.
Our results suggested that diagnostic error in medicine could best be described using a taxonomy that includes no-fault, system-related (Table 1), and cognitive (Table 2) factors.
Masked or unusual presentation of disease
Patient-related error (uncooperative, deceptive)
Technical failure and equipment problems
Faulty data gathering
In 46% of the cases, both system-related and cognitive factors contributed to diagnostic error. Cases involving only cognitive factors (28%) or only system-related factors (19%) were less common, and 7 cases were found to reflect solely no-fault factors, without any other system-related or cognitive factors. Combining the pure and the mixed cases, system-related factors contributed to the diagnostic error in 65% of the 100 cases and cognitive factors contributed in 74% (Figure). Overall, we identified 228 system-related factors and 320 cognitive factors, averaging 5.9 per case.
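The percentages and totals in the preceding paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only counts stated in the text; the variable names are illustrative.

```python
# Case counts by category of contributing factor (from the text; n = 100)
both = 46            # system-related and cognitive factors together
cognitive_only = 28
system_only = 19
no_fault_only = 7

total_cases = both + cognitive_only + system_only + no_fault_only
system_cases = both + system_only        # any system-related factor
cognitive_cases = both + cognitive_only  # any cognitive factor

# Factor counts (from the text)
system_factors = 228
cognitive_factors = 320
total_factors = system_factors + cognitive_factors

# The per-case average excludes the 7 pure no-fault cases
cases_with_factors = total_cases - no_fault_only
factors_per_case = total_factors / cases_with_factors

print(total_cases, system_cases, cognitive_cases)
print(total_factors, cases_with_factors, round(factors_per_case, 1))
```

The overlap of 46 cases explains why the 65% and 74% figures sum to more than 100%.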
The relative frequency of both system-related and cognitive factors varied with the source of the case and the type of error involved. Cases identified from quality assurance reports and from voluntary reports had a similar prevalence of system-related factors (72% and 76%, respectively) and cognitive factors (65% and 85%, respectively). In contrast, cases identified from autopsy discrepancies involved cognitive factors 90% of the time (P>.50) and system-related factors only 10% of the time (P<.001). Cases of delayed diagnosis had relatively more system-related errors (89%) and fewer cognitive errors (36%) on average, and cases of wrong diagnosis involved more cognitive errors (92%) and fewer system-related errors (50%, P<.01 for both pairs).
No-fault factors were identified in 44 of the 100 cases and constituted the sole explanation in 7 cases. Eleven of these cases involved patient-related factors, including 2 instances of deception (surreptitious self-injection of saliva, which mimicked sepsis, and denial of high-risk sexual activity, which delayed diagnosis of Pneumocystis carinii pneumonia) and 9 cases involving delayed diagnoses related to missed appointments or instances in which patient statements were unintentionally misleading or incomplete. By far, the most common no-fault factor was an atypical or masked disease presentation, encountered in 33 cases.
In 65 cases, system-related factors contributed to diagnostic error (Table 1). The vast majority of these (215 instances) were related to organizational problems, and a small fraction (13 instances) involved technical and equipment problems. The factors encountered most often related to policies and procedures, inefficient processes, and difficulty with teamwork and communication, especially communication of test results. Many error types were encountered more than twice in the same institution, an event we referred to as clustering.
We identified 320 cognitive factors in 74 cases (Table 2). The most common category of factors was faulty synthesis (264 instances), or flawed processing of the available information. Faulty data gathering was identified in 45 instances. Inadequate or faulty knowledge or skills were identified in only 11 instances.
Inadequate knowledge was identified in only 4 cases, each concerning a rare condition: (1) a case of missed Fournier gangrene; (2) a missed diagnosis of calciphylaxis in a patient undergoing dialysis with normal levels of serum calcium and phosphorus; (3) a case of chronic thrombotic thrombocytopenic purpura; and (4) a wrong diagnosis of disseminated intravascular coagulation in a patient ultimately thought to have clopidogrel-associated thrombotic thrombocytopenic purpura. The 7 cases involving inadequate skills involved misinterpretations of x-ray studies and electrocardiograms by nonexperts.
The dominant cause of error in the faulty data-gathering category lay in the subcategory of ineffective, incomplete, or faulty workup (24 instances). For example, the diagnosis of subdural hematoma was missed in a patient who was seen after a motor vehicle crash because the physical examination was incomplete. Problems with ordering the appropriate tests and interpreting test results were also common in this group.
Faulty information synthesis, which includes a wide range of factors, was the most common cause of cognitive-based errors. The single most common phenomenon was premature closure: the tendency to stop considering other possibilities after reaching a diagnosis. Other common synthesis factors included faulty context generation, misjudging the salience of a finding, faulty perception, and failed use of heuristics. Faulty context generation and misjudging the salience of a finding often occurred in the same case (15 of 25 instances). Perceptual failures most commonly involved incorrect readings of x-ray studies by internists and emergency department staff before official reading by a radiologist. Of the 23 instances related to heuristics, 14 reflected the bias to assume that all findings were related to a single cause when a patient actually had more than 1 condition. In 7 cases, the most common condition was chosen as the likely diagnosis, although a less common condition was responsible.
Cognitive and system-related factors were found to often co-occur, and these factors may have led, directly or indirectly, to each other. For example, a mistake relatively early on (eg, an inadequate history or physical examination) is likely to lead to subsequent mistakes (eg, in interpreting test results, considering appropriate candidate diagnoses, or calling in appropriate specialists). We examined the patterns of factors identified in these 100 cases to identify clusters of cognitive factors that tended to co-occur. Using Pearson r tests and correcting for the use of multiple pairwise analyses, we found several such clusters of cognitive factors. The more common clusters of 3 factors, all of which have significant pairwise correlations within a cluster, were as follows:
Incomplete/faulty history and physical examination; failure to consider the correct candidate diagnosis; and premature closure
Incomplete/excessive data gathering; bias toward a single explanation; and premature closure
Underestimating the usefulness of a finding; premature closure; and failure to consult
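The co-occurrence analysis behind these clusters can be illustrated with a minimal Python sketch. Several parts are assumptions: the per-case indicator data are invented for illustration, the factor names are shortened labels, the Bonferroni adjustment stands in for the unspecified multiple-comparison correction, and a permutation test is swapped in for the parametric significance test so that the example needs only the standard library.

```python
from itertools import combinations
from math import sqrt
import random


def pearson_r(x, y):
    """Pearson correlation; on 0/1 indicators this equals the phi coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Two-sided permutation p-value (assumed substitute for a parametric test)."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


# Invented data: 1 = factor present in a case, 0 = absent (12 hypothetical cases)
factors = {
    "incomplete_history_exam": [1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0],
    "correct_dx_not_considered": [1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0],
    "premature_closure": [1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0],
}

pairs = list(combinations(factors, 2))
alpha = 0.05
bonferroni_alpha = alpha / len(pairs)  # assumed correction method

results = {}
for a, b in pairs:
    r = pearson_r(factors[a], factors[b])
    p = permutation_p_value(factors[a], factors[b])
    results[(a, b)] = (r, p)
    print(f"{a} vs {b}: r={r:.2f}, p={p:.4f}, significant={p < bonferroni_alpha}")
```

A cluster of 3 factors in the sense used above would require all 3 pairwise correlations to pass the corrected threshold.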
In classifying the underlying factors contributing to error, 3 natural categories emerged: no fault, system-related, and cognitive. This classification validates the cognitive and no-fault distinctions described by Chimowitz et al,11 Kassirer and Kopelman,12 and Bordage28 and adds a third major category of system-level factors.
A second objective was to assess the relative contributions of system-related and cognitive root cause factors. The results allow 3 major conclusions regarding diagnostic error in internal medicine settings.
Excluding the 7 cases of pure no-fault error, we identified an average of 5.9 factors contributing to error in each case. Reason’s6 “Swiss cheese” model of error suggests that harm results from multiple breakdowns in the series of barriers that normally prevent injury. This phenomenon was identified in many of our cases, in which the ultimate diagnostic failure involved separate factors at multiple levels of both the system-related and the cognitive pathways.
A second reason for encountering multiple factors in a single case is the tendency for one type of error to lead to another. For example, a patient with retrosternal and upper epigastric pain was given a diagnosis of myocardial infarction on the basis of new Q waves in his electrocardiogram and elevated levels of troponin. The clinicians missed a coexisting perforated ulcer, illustrating that if a case is viewed in the wrong context, clinicians may miss relevant clues and may not consider the correct diagnosis.
System-related factors were identified in 65% of cases. This finding supports a previous study linking diagnostic errors to system issues30 but contrasts with the prevailing belief that diagnostic errors overwhelmingly reflect defective cognition. The system flaws identified in our study reflected far more organizational issues than technical problems. Errors related to suboptimal supervision of trainees occurred, but uncommonly.
Faulty data gathering was much less commonly encountered than faulty synthesis, and defective knowledge was rare. These results are consistent with conclusions from earlier studies28,30 and from autopsy data almost 50 years ago: “ . . . mistakes were due not so much to lack of knowledge of factual data as to certain deficiencies of approach and judgment.”31 This finding may distinguish medical diagnosis from other types of expert decision making, in which knowledge deficits are more commonly encountered as the cause of error.32
As predicted by other authors,1,9,15,33 premature closure was encountered more commonly than any other type of cognitive error. Simon34 described the initial stages of problem solving as a search for an explanation that best fits the known facts, at which point one stops searching for additional explanations, a process he termed satisficing. Experienced clinicians are as likely as more junior colleagues to exhibit premature closure,15 and elderly physicians may be particularly predisposed.35
This study has a variety of limitations that restrict the generality of the conclusions. First, because the types of error are dependent on the source of the cases,36-38 a different spectrum of case types would be expected outside internal medicine. Also, selection bias might be expected in cases that are reported voluntarily. Distortions could also result from our nonrandomized method of case selection if the selected cases are not representative of the errors that actually occurred.
A second limitation is the difficulty in discerning exactly how a given diagnosis was reached. Clinical reasoning is hidden from direct examination, and may be just as mysterious to the clinician involved. A related problem is our limited ability to identify other factors that likely affect many clinical decision-making situations, such as stress, fatigue, and distractions. Clinicians had difficulty recalling such factors, which undoubtedly existed. Their recollections might also be distorted because of the unavoidable lag time between the experience and the interview, and by their knowledge of the clinical outcomes.
A third weakness is the subjective assignment of root causes. The field as it evolves will benefit from further clarification and standardization for each of these causes. A final concern is the inevitable bias that is introduced in a retrospective analysis in which the outcomes are known.23,39,40 With this in mind, we did not attempt to evaluate the appropriateness of care or the preventability of adverse events, judgments that are highly sensitive to hindsight bias.
Although diagnostic error can never be eliminated,41 our results identify the common causes of diagnostic error in medicine, ideal targets for future efforts to reduce the incidence of these errors. The high prevalence of system-related factors offers the opportunity to reduce diagnostic errors if health care institutions accept the responsibility of addressing these factors. For example, errors could be avoided if radiologists were reliably available to interpret x-ray studies and if abnormal test results were reliably communicated. Institutions should be especially sensitive to clusters of errors of the same type. Although these institutions may simply excel at error detection, clustering could also indicate misdirected resources or a culture of tolerating suboptimal performance.
Devising strategies for reducing cognitive error is a more complex problem. Our study suggests that internists generally have sufficient medical knowledge and that errors of clinical reasoning overwhelmingly reflect inappropriate cognitive processing and/or poor skills in monitoring one’s own cognitive processes (metacognition).42 Croskerry43 and others44 have argued that clinicians who are oriented to the common pitfalls of clinical reasoning would be better able to avoid them. High-fidelity simulations may be one way to provide this training.45,46 Elstein1 has suggested the value of compiling a complete differential diagnosis to combat the tendency to premature closure, the most common cognitive factor we identified. A complementary strategy for considering alternatives is the technique of prospective hindsight, also called the crystal ball experience: the clinician is told to assume that his or her working diagnosis is incorrect and is asked, “What alternatives should be considered?”47 A final strategy is to augment a clinician’s inherent metacognitive skills by using expert systems, an approach currently under active research and development.48-50
Correspondence: Mark L. Graber, MD, Medical Service 111, Veterans Affairs Medical Center, Northport, NY 11768 (firstname.lastname@example.org).
Accepted for Publication: February 21, 2005.
Financial Disclosure: None.
Funding/Support: This work was supported by a research support grant honoring James S. Todd, MD, from the National Patient Safety Foundation, North Adams, Mass.
Acknowledgment: We thank Grace Garey and Kathy Kessel for their assistance with the manuscript and references.