Frequency histogram of clinical certainty scores. The managing clinician at the time of presentation gave his or her estimate for the likelihood of acutely destabilized heart failure in dyspneic patients at the end of standard clinical evaluation.
Cumulative hazard rates among dyspneic emergency department patients associated with the presence or absence of diagnostic uncertainty. Results are expressed as a function of 1-year events of mortality (A) and the composite of mortality or hospital representation among all subjects (B) as well as mortality (C) and mortality or hospital representation (D) as a function of diagnosis.
Associations between clinical certainty and 1-year rates of death and death/representation as a function of diagnosis at presentation.
Accuracy of clinical judgment vs amino-terminal pro-B-type natriuretic peptide (NT-proBNP) testing for the diagnosis of acutely destabilized heart failure in dyspneic emergency department patients. Amino-terminal proBNP testing had a superior area under the receiver operating characteristic curve for the diagnosis of acutely destabilized heart failure in those judged with clinical certainty (A) and those judged with clinical uncertainty (B).
Green SM, Martinez-Rumayor A, Gregory SA, Baggish AL, O’Donoghue ML, Green JA, Lewandrowski KB, Januzzi JL. Clinical Uncertainty, Diagnostic Accuracy, and Outcomes in Emergency Department Patients Presenting With Dyspnea. Arch Intern Med. 2008;168(7):741–748. doi:10.1001/archinte.168.7.741
Dyspnea is a common complaint in the emergency department (ED) and may be a diagnostic challenge. We hypothesized that diagnostic uncertainty in this setting is associated with adverse outcomes, and amino-terminal pro-B-type natriuretic peptide (NT-proBNP) testing would improve diagnostic accuracy and reduce diagnostic uncertainty.
A total of 592 dyspneic patients were evaluated from the ProBNP Investigation of Dyspnea in the Emergency Department (PRIDE) study. Managing physicians were asked to provide estimates from 0% to 100% of the likelihood of acutely destabilized heart failure (ADHF). A certainty estimate of either 20% or lower or 80% or higher was classified as clinical certainty, while estimates between 21% and 79% were defined as clinical uncertainty. Associations between clinical uncertainty, hospital length of stay, morbidity, and mortality were examined. The diagnostic value of clinical judgment vs NT-proBNP measurement was compared across categories of clinical certainty.
Clinical uncertainty was present in 185 patients (31%), 103 (56%) of whom had ADHF. Patients judged with clinical uncertainty had longer hospital length of stay and increased morbidity and mortality, especially those with ADHF. Receiver operating characteristic analysis of clinical judgment yielded an area under the curve (AUC) of 0.88 in the clinical certainty group and 0.76 in the uncertainty group (P < .001); NT-proBNP testing alone in these same groups had AUCs of 0.96 and 0.91, respectively. The combination of clinical judgment with NT-proBNP testing yielded improvements in AUC.
Among dyspneic patients in the ED, clinical uncertainty is associated with increased morbidity and mortality, especially in those with ADHF. The addition of NT-proBNP testing to clinical judgment may reduce diagnostic uncertainty in this setting.
Dyspnea is a common complaint among patients presenting to the emergency department (ED), yet differentiating between the many potential causes of dyspnea is a complicated process. One of the most important causes of dyspnea is acutely destabilized heart failure (ADHF), which is common and associated with a high risk of morbidity and mortality when not detected in a timely fashion.1,2 Accordingly, careful clinical judgment is necessary when evaluating the patient with dyspnea3-5; however, in this context, correct diagnoses are frequently difficult to ascertain and clinical uncertainty is common. In such cases, supplementing clinical judgment with new diagnostic technologies may reduce this clinical uncertainty.6-15 Recently, biomarkers such as amino-terminal pro-B-type natriuretic peptide (NT-proBNP) and BNP have been shown to improve clinician accuracy for the diagnosis of ADHF in the ED setting.6,7,13 However, the relationship of natriuretic peptide testing and clinical uncertainty has not been fully explored, and the characteristics of patients in whom clinical uncertainty is more likely have not been elucidated. We hypothesized that clinical uncertainty is associated with adverse outcomes and that NT-proBNP testing improves clinical accuracy when clinical uncertainty is present.
Data from the ProBNP Investigation of Dyspnea in the Emergency Department (PRIDE) study were retrospectively reviewed.7 The PRIDE study was a prospective, blinded study of 599 dyspneic subjects presenting to the ED of the Massachusetts General Hospital, Boston, and was performed for the purpose of validation of the diagnostic and prognostic use of NT-proBNP testing (Elecsys ProBNP; Roche Diagnostics, Indianapolis, Indiana). In the PRIDE study, the gold standard for the diagnosis of ADHF was based on the impression of reviewing cardiology physicians, blinded to NT-proBNP values, who had all available clinical information for each subject from presentation through 60 days of follow-up. As reported, 209 subjects (35%) in the PRIDE study were adjudicated to have dyspnea due to ADHF. At the end of 1 year, the managing physician for each patient was contacted for the purposes of ascertainment of vital status and/or rehospitalization rate. Data were complete in 99% of patients.
At the end of a standard clinical evaluation, including full access to any and all diagnostic studies available as standard of care (other than unblinded natriuretic peptide levels), the managing clinicians were asked by a researcher or research assistant to provide an estimate from 0% to 100% of the likelihood for ADHF (a clinical certainty “score,” with a score of 0% representing absolutely no chance of ADHF and 100% representing absolute certainty for the presence of ADHF). Clinical certainty data were available in 99% of subjects.
For the purposes of this analysis, to remain consistent with prior studies,3 a clinician estimate of 20% or lower was classified as “clinician certain the patient does not have ADHF” and 80% or higher as “clinician certain the patient does have ADHF.”3 These 2 groups were considered to have high clinical certainty and were categorized as having judgments in the “clinical certainty” range, while those subjects with intermediate clinical certainty scores (between 21% and 79%) were identified as being in the “clinical uncertainty” range.
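The classification rule above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's analysis code: the function name and category labels are hypothetical, and the handling of non-integer estimates (for example, 20.5%) is an assumption, since the text defines the ranges only in whole percentages.

```python
def classify_certainty(estimate_pct: float) -> str:
    """Map a clinician's 0-100% estimate of ADHF likelihood to a certainty category.

    Thresholds follow the text: <=20% and >=80% are "clinical certainty";
    intermediate scores are "clinical uncertainty".
    """
    if not 0 <= estimate_pct <= 100:
        raise ValueError("estimate must be between 0 and 100")
    if estimate_pct <= 20:
        return "certain_no_adhf"   # clinician certain the patient does not have ADHF
    if estimate_pct >= 80:
        return "certain_adhf"      # clinician certain the patient does have ADHF
    # Assumption: any value strictly between 20 and 80 is treated as uncertain.
    return "uncertain"


if __name__ == "__main__":
    for score in (0, 20, 21, 50, 79, 80, 100):
        print(score, classify_certainty(score))
```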
χ² tests were used to compare categorical data between those in the clinical certainty and uncertainty groups, while the Wilcoxon rank sum test was used to compare continuous variables between these groups. Hospital length of stay (LOS) for each group was evaluated using the Mann-Whitney test. Differences in rates of rehospitalization and 1-year mortality were assessed using the log-rank test. Age-adjusted Cox proportional hazards analyses evaluated the impact of clinical uncertainty on the risk for adverse outcomes, including death and rehospitalization, and hazard ratios (HRs) and 95% confidence intervals (CIs) were generated.
Decision statistics were computed from 2 × 2 tables and reported as sensitivity, specificity, and positive and negative predictive values. The NT-proBNP levels were analyzed using an age-stratified cutpoint approach of 450 pg/mL, 900 pg/mL, and 1800 pg/mL (to convert to nanograms per liter, multiply by 1.0) for ages younger than 50 years, between 50 and 75 years, and older than 75 years.15 In addition, the negative predictive value of an age-independent rule-out cutpoint of 300 pg/mL was calculated.15
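As a rough illustration of these definitions, the following Python sketch applies the age-stratified cutpoints and computes the four decision statistics from the cells of a 2 × 2 table. The function names are hypothetical, the counts in the usage example are invented for illustration and are not study data, and treating a value at or above the cutpoint as a positive test is an assumption (the text does not state whether the comparison is inclusive).

```python
def age_adjusted_cutpoint(age_years: float) -> int:
    """Return the NT-proBNP rule-in cutpoint (pg/mL) for a given age."""
    if age_years < 50:
        return 450
    if age_years <= 75:
        return 900
    return 1800


def nt_probnp_positive(age_years: float, nt_probnp_pg_ml: float) -> bool:
    """Assumption: a concentration at or above the cutpoint counts as positive."""
    return nt_probnp_pg_ml >= age_adjusted_cutpoint(age_years)


def decision_statistics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity, PPV, and NPV from the cells of a 2 x 2 table.

    tp/fn: subjects with the diagnosis who test positive/negative.
    fp/tn: subjects without the diagnosis who test positive/negative.
    Raises ZeroDivisionError if a row or column of the table is empty.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the arithmetic.
    print(decision_statistics(tp=90, fp=10, fn=10, tn=90))
    print(nt_probnp_positive(age_years=62, nt_probnp_pg_ml=1200))
```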
Receiver operating characteristic (ROC) curves examined the relationship between clinical judgment and the final diagnosis of ADHF by the generation of an area under the curve (AUC). The ROC curves were also used to examine the diagnostic accuracy of NT-proBNP testing. To better understand the potential value of combining NT-proBNP testing with clinical variables, a logistic model was generated that included NT-proBNP testing and clinical variables predictive of ADHF, as previously described.7 The AUC of this model was compared with that of both NT-proBNP testing and of clinical judgment alone.
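For readers unfamiliar with the AUC, one way to compute the empirical value is as the proportion of (ADHF, non-ADHF) subject pairs in which the subject with ADHF has the higher score, with ties counted as one-half. The Python sketch below illustrates that calculation under stated assumptions: the function name and the scores in the usage example are hypothetical, and this is not a description of how the statistical software used in the study computes the AUC or its confidence interval.

```python
from typing import Sequence


def roc_auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """Empirical AUC via pairwise comparison.

    scores_pos: scores (eg, clinician estimate or NT-proBNP) for subjects with the diagnosis.
    scores_neg: scores for subjects without the diagnosis.
    Each pair contributes 1 if the positive subject scores higher, 0.5 on a tie, else 0.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


if __name__ == "__main__":
    # Hypothetical clinician estimates (%), not study data.
    with_adhf = [90, 80, 60, 40]
    without_adhf = [10, 20, 50, 5]
    print(roc_auc(with_adhf, without_adhf))
```

The pairwise loop is quadratic in the number of subjects, which is adequate for a cohort of a few hundred; a rank-based calculation gives the same value more efficiently for larger data sets.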
A 2-sided P value of <.05 was considered statistically significant. The ROC analyses were performed using Analyse-it software (Analyse-it Ltd, Leeds, England); all other statistical analyses were performed using SPSS software (SPSS Inc, Chicago, Illinois).
A total of 592 subjects (99%) had an available “clinical certainty” score, 202 (34%) of whom were judged to have ADHF. The frequency histogram showing the distribution of the clinical certainty scores is depicted in Figure 1. The largest percentage of patients was judged to have a “0% chance” of having ADHF; however, managing clinicians gave a wide range of clinical certainty scores.
Considering subjects as a function of clinical certainty vs uncertainty as previously defined, 407 (69%) had a score of 20% or lower or 80% or higher (defined as in the “clinical certainty” range) and 185 (31%) had a clinical certainty score from 21% to 79% (in the “clinical uncertainty” range).
The demographics and clinical characteristics of subjects divided by the presence or absence of clinical uncertainty are given in Table 1. Patients in whom clinical uncertainty was present were more likely to be older (mean [SD] age, 69  years vs 59  years; P < .001), to have slightly lower left ventricular ejection fractions (mean [SD], 55% [17%] vs 58% [16%]; P = .05), and to have atrial fibrillation on presentation (20% vs 9%; P = .01). Notably, 99 (24%) of those patients in the clinical certainty group had ADHF, while 103 (56%) of those in the clinical uncertainty group were found to have ADHF at final adjudication. Further division of patients as a function of uncertainty and final diagnosis is given in Table 2.
Overall, fewer dyspneic patients judged with high clinical certainty (either ≤20% or ≥80% certain for the diagnosis of ADHF) were admitted to the hospital compared with those judged less certainly (71% vs 86%; P < .001). Similarly, among those admitted to the hospital, patients judged with high clinical certainty were found to have a significantly shorter median index hospital LOS (5.4 days; interquartile range [IQR], 2-7 days) than those judged uncertainly (6.6 days; IQR, 3-9 days [P = .02]). Furthermore, the majority of hospitalizations among those judged confidently were shorter, with 90% of discharges occurring within 9 days or less vs 14 days or less for those in the clinical uncertainty group.
In addition to more hospital use and longer hospital LOS, we found significant associations between clinical uncertainty and adverse outcomes. Compared with those judged with certainty, the clinical uncertainty group had higher rates of mortality and the composite end point of mortality or representation with dyspnea within the first year after index presentation (Figure 2), a finding that was most pronounced among those with acute heart failure (Figure 2 and Figure 3) but was also evident among those without acute heart failure. For the group as a whole, age-adjusted Cox proportional hazards analyses demonstrated uncertainty to be an independent predictor of death (HR, 1.88; 95% CI, 1.02-2.25 [P = .05]) as well as death or rehospitalization (HR, 2.18; 95% CI, 1.71-2.49 [P = .01]) by 1 year.
Overall, patients in the clinical uncertainty group (regardless of final diagnosis) had higher median NT-proBNP concentrations (1336 pg/mL [IQR, 242-4860 pg/mL] vs 269 pg/mL [IQR, 60-1714 pg/mL]; P < .001), consistent with the higher prevalence of ADHF in these subjects. Median NT-proBNP concentrations among those with ADHF were higher among those in the clinical certainty group than among those in the clinical uncertainty group (4686 pg/mL [IQR, 1917-12 126 pg/mL] vs 3297 pg/mL [IQR, 1360-9250 pg/mL]; P < .001). Importantly, despite this observation, median values for NT-proBNP were significantly higher in those with ADHF than in other subjects, whether such patients were judged confidently or not (221 pg/mL [IQR, 82-551 pg/mL] vs 118 pg/mL [IQR, 43-393 pg/mL], respectively).
In the “clinical certainty” group (n = 407), 308 (76%) were found to have noncardiac dyspnea, and the remainder (n = 99) were diagnosed as having ADHF. The ROC analysis (Figure 4A) demonstrated clinical estimates for ADHF in the clinical certainty group to have an AUC of 0.88 (95% CI, 0.83-0.92) (P < .001).
Of the 308 patients without ADHF in this group, 306 (99%) had a “clinical certainty” score of 20% or lower, with clinicians accurately excluding the diagnosis of ADHF in all but 2 patients. Of the 99 patients with ADHF in this group, only 62 (63%) had a “clinical certainty” score of 80% or higher leading to an accurate diagnosis of ADHF. Therefore, when confident of the presence or absence of ADHF, clinicians had an overall sensitivity of 63% (95% CI, 54%-68%), a specificity of 99% (95% CI, 95%-100%), a positive predictive value of 69%, and a negative predictive value of 97%.
Comparatively, in subjects judged with high clinical certainty, age-adjusted NT-proBNP cutpoints15 had 92% sensitivity (95% CI, 85%-96%), 86% specificity (95% CI, 82%-90%), and an 80% positive predictive value for diagnosing ADHF, while an age-independent cutpoint of 300 pg/mL for excluding ADHF was found to have a 100% negative predictive value. Receiver operating characteristic curve analysis (Figure 4A) for NT-proBNP in this group of subjects demonstrated an AUC for ADHF of 0.96 (95% CI, 0.94-0.97; P < .001), which was significantly higher than that of clinical estimates for ADHF (P < .001).
The median NT-proBNP concentration of the 62 subjects correctly predicted by clinicians to have ADHF was 4686 pg/mL, while in the 37 subjects in whom the clinician missed the diagnosis of ADHF, the median NT-proBNP value was 3849 pg/mL.
In the “clinical uncertainty” group (n = 185), 103 (56%) had ADHF. The value of clinical judgment in this setting is depicted in the form of an ROC analysis (Figure 4B), which shows relatively lower AUC for clinical estimates for ADHF, reflecting a significant effect of uncertainty on the diagnostic accuracy of clinical judgment (AUC, 0.76; 95% CI, 0.69-0.83 [P < .001]) (P < .001 for difference with the AUC for the “no uncertainty” group).
In these same subjects, NT-proBNP testing had an overall 90% sensitivity (95% CI, 81%-94%), 84% specificity (95% CI, 72%-88%), and a positive predictive value of 86% for the diagnosis of ADHF, while the age-independent cutpoint of less than 300 pg/mL had a 96% negative predictive value. In the ROC analysis in the “clinical uncertainty” group, NT-proBNP testing had a significantly superior AUC compared with clinical judgment for ADHF (AUC, 0.91; 95% CI, 0.87-0.96 [P < .001]).
When combining clinical variables predictive of ADHF together with NT-proBNP testing into a logistic model, significant improvement in the AUC was made in the diagnosis of ADHF: when applied to the “no uncertainty” group, the combination of NT-proBNP testing plus clinical judgment yielded an AUC of 0.98 (improved from 0.88 for clinical judgment and 0.96 for NT-proBNP testing alone; P < .05 for the differences), and when applied to the “clinical uncertainty” group, this approach yielded a model with an AUC of 0.94 (improved from 0.76 and 0.91, respectively; P < .05 for both).
Clinical uncertainty for the correct diagnosis occurred among 1 in 3 dyspneic ED patients in our study, even though the patients were evaluated by experienced ED physicians. In this study, we have identified characteristics associated with clinical uncertainty and have established an association between clinical uncertainty and adverse outcomes in these patients. Patients judged uncertainly with respect to the presence or absence of ADHF were more likely to be admitted to the hospital, had a longer index hospital LOS, and had higher rates of 1-year morbidity and mortality, especially in those ultimately diagnosed as having ADHF. Notably, NT-proBNP values were useful for the diagnosis or exclusion of ADHF across all levels of clinical certainty, while clinical judgment (even when clinicians judged patients confidently) was lacking in accuracy, particularly in those subjects in whom clinicians were inclined toward the diagnosis of ADHF. Consistent with the primary results of the PRIDE study7 and the Breathing Not Properly Multinational Study,3 the combination of clinical judgment and natriuretic peptide values provided the highest accuracy for identifying or excluding ADHF, yielding incremental value over either modality alone, even in those judged confidently by the managing clinicians in the ED. These results add further confirmation to the suggestion for universal evaluation of dyspneic patients with natriuretic peptide testing,16 irrespective of diagnostic confidence.
In the ED setting, ADHF is a commonly encountered cause of dyspnea and is thought to account for approximately 1.5 million hospital admissions per year.17-19 Furthermore, it is thought to be one of the most costly diagnoses in modern medicine, accounting for nearly 60 billion in total health care costs or 6% of all health care expenditures in the United States alone.17-19 Unfortunately, diagnosis and triage of ADHF still leave much to be desired both in outcomes and hospital LOS.1,2 A major reason for this is the fact that ADHF may present with protean manifestations, rendering clinical history and physical examination less accurate.
One recent advance in the diagnostic and prognostic evaluation of patients with suspected ADHF is the use of natriuretic peptides as an adjunct to clinical judgment. Randomized and nonrandomized prospective blinded studies suggest that both BNP and NT-proBNP testing are superior to clinical judgment alone without natriuretic peptide testing for the diagnosis, triage, and treatment of patients with ADHF. Furthermore, there are considerable potential cost-savings associated with their use,20,21 which are confirmed in “real-world” implementation studies.22 Despite these advances, the effects of clinical certainty on the outcomes of patients presenting with dyspnea and the potential value of biomarkers such as NT-proBNP in patients with or without clinical certainty have not been fully explored. Our data reveal that there is a clear association between the phenomenon of “clinical uncertainty” and adverse outcomes, which may be more deleterious in those who have ADHF.
Those subjects judged with lower clinical certainty (typically older, with medical histories compatible with subtle forms of heart disease) were more likely to have longer hospital LOS (implying ongoing diagnostic uncertainty following ED triage), as well as higher rates of representation, mortality, or the composite of the two. Furthermore, although identified in a retrospective analysis of subgroups within the PRIDE study (which was not necessarily powered for such an analysis), the deleterious association between uncertainty and worse outcomes appeared to be especially marked in patients with a final diagnosis of ADHF, an observation with ramifications given the potential value of NT-proBNP testing in this group. Diagnostically, in those with ADHF in the uncertainty group, NT-proBNP testing alone was superior to clinical judgment alone. However, in this group, as well as in all others, the combination of clinical judgment and NT-proBNP testing was superior to either modality alone for correctly identifying ADHF. These results suggest that NT-proBNP testing added to clinical judgment would reduce clinical uncertainty, increase the likelihood for correct diagnosis in dyspneic subjects, lead to fewer missed diagnoses of ADHF, and consequently result in superior outcomes to those evaluated without NT-proBNP testing. In addition, this would potentially result in significant financial savings in the process, as predicted in prior analyses.12,20,21
There are many potential reasons why those diagnosed with clinical uncertainty have worse outcomes. These include delays in diagnosis, incorrect therapies, and inappropriate hospital discharge before a correct diagnosis is secured. Furthermore, the increased complexity of those patients judged with uncertainty also in part explains the association between uncertainty and poor outcomes. Unfortunately, the limitations of our database preclude clarity as to the mechanism associating clinical uncertainty and adverse outcome in our subjects.
One might argue that NT-proBNP testing would be less useful for the evaluation of patients judged with clinical certainty for the presence (in this analysis, ≥80% likely) or absence (≤20% likely) of ADHF. Our data argue otherwise, with poor sensitivity for the detection of ADHF in those patients judged with strong clinical certainty. On the other hand, clinical judgment was outstanding for excluding ADHF. These results are very reminiscent of those generated from studies of natriuretic peptide testing in the primary care setting, where clinicians were adept at excluding rather than identifying heart failure.23,24 Considering the inability of clinicians to accurately identify ADHF, the addition of NT-proBNP testing in the group of “highly certain” patients would have improved the accuracy of the detection of ADHF over that of clinically certain judgment, detecting 31% more patients with ADHF (improving the overall detection rate to 94% of all patients with ADHF present in the “clinical certainty” group). In the “uncertainty” group, the addition of NT-proBNP testing to clinical judgment would have detected a similar 30% more subjects with ADHF. This consistent increase in diagnostic accuracy would come at little added cost when the biomarker is used appropriately and without additional morbidity or mortality risk over the standard of care.20,22
There are several important limitations regarding the data we have gathered. The clinical judgment of the skilled ED staff at our urban medical center may not reflect that of other venues; nonetheless, the potential value of NT-proBNP testing was present even in this experienced group of clinicians and would thus be expected to have even more impact in venues with less experience in the evaluation of ADHF. In addition, many of the data for these analyses, including the hospital LOS and mortality data, were gathered in a retrospective manner from the PRIDE study, and our study is thus likely to be underpowered. Despite this, results of randomized trials, decision-analytic framework analyses, and real-world implementation studies argue for the validity of our data.12,20,22
In conclusion, within the context of sound clinical judgment, including an excellent history taking and physical examination and judicious use of adjunctive testing, NT-proBNP testing reduces clinical uncertainty during evaluation of the dyspneic patient, with projected favorable parallel reductions in hospital cost and ultimately improvements in the considerable rates of morbidity and mortality currently seen in these patients.
Correspondence: James L. Januzzi Jr, MD, Massachusetts General Hospital, Yawkey 5984, 55 Fruit St, Boston, MA 02114 (JJanuzzi@Partners.org).
Accepted for Publication: October 17, 2007.
Author Contributions: Dr Januzzi had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. Study concept and design: S. M. Green, Martinez-Rumayor, Gregory, and Januzzi. Acquisition of data: S. M. Green, Baggish, O’Donoghue, Lewandrowski, and Januzzi. Analysis and interpretation of data: Martinez-Rumayor, Gregory, Baggish, J. A. Green, and Januzzi. Drafting of the manuscript: S. M. Green, Gregory, O’Donoghue, and Januzzi. Critical revision of the manuscript for important intellectual content: S. M. Green, Martinez-Rumayor, Gregory, Baggish, O’Donoghue, J. A. Green, Lewandrowski, and Januzzi. Statistical analysis: Gregory, O’Donoghue, and Januzzi. Obtained funding: Januzzi. Administrative, technical, and material support: Baggish, J. A. Green, Lewandrowski, and Januzzi. Study supervision: S. M. Green, Baggish, and Januzzi.
Financial Disclosure: Dr Januzzi has received grants greater than $20 000 from Roche and Dade Behring and speaking honoraria from Roche, Dade Behring, and Ortho-Clinical Diagnostics. Dr Lewandrowski has received grants less than $20 000 from Roche and Dade Behring and speaking honoraria from Roche and Dade Behring.
Funding/Support: This study was supported by a grant from the Ed and Maureen Lewi Fund for Cardiology Research, as well as by the Balson Scholar Fund for Cardiovascular Research.
Additional Information: Dr Januzzi is the principal investigator for the NT-ProBNP Investigation of Dyspnea in the Emergency Department (PRIDE) Study.