Figure. Box-plots (median, interquartile range [boxes], 95% confidence interval [error bars], and outliers [open circles]) of the probabilities attributed to specific words for the presence of a hypothetical disease in a sample consisting of patients, medical students, residents, and practicing physicians (n = 153).
Foppa M, Schneider de Araujo B, Macari A, Reichert R, Goldim JR. Limitations in the Use of Qualitative Terms to Inform Diagnoses. Arch Intern Med. 2011;171(14):1291–1292. doi:10.1001/archinternmed.2011.307
Author Affiliations: Hospital de Clinicas de Porto Alegre, School of Medicine and Cardiology Graduate Program, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil.
The use of qualitative terms to describe the probability of disease is a potential source of misunderstanding and inaccuracy,1-6 and the use of probabilities has been a main supporting tool for dealing with uncertainty in evidence-based diagnosis. Considering this, we investigated how patients, medical students, and physicians quantify as probabilities the meaning of common terms used to indicate the presence of a disease.
In a public teaching hospital, volunteers who consented were invited to fill in a form, marking on a metric ruler (0% to 100%) the probability they would attribute to having a hypothetical medical condition for each of a series of randomly ordered words that represent probabilities (eg, “never,” “almost never,” “possible,” “likely”). The terms were checked by back translation to English. Additional covariate data were collected. Comparisons among subgroups were tested using the t test or analysis of variance and appropriate nonparametric tests. The survey was approved by a research ethics committee.
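As an illustration of the kind of subgroup comparison described above, the following is a minimal sketch of a two-sample t statistic. It assumes the unequal-variance (Welch) form, which the letter does not specify; the function name `welch_t` and the response values are hypothetical and are not study data.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Return Welch's t statistic and its approximate degrees of freedom.

    Assumption: unequal-variance (Welch) form; the letter only says "t test".
    """
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    # Sample variances (n - 1 denominator) divided by group size
    se2_a = statistics.variance(group_a) / len(group_a)
    se2_b = statistics.variance(group_b) / len(group_b)
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(group_a) - 1) + se2_b ** 2 / (len(group_b) - 1)
    )
    return t, df


# Hypothetical probabilities (percent) marked for one term; not study data.
patients = [30, 40, 50, 60]
clinicians = [60, 70, 80, 90]
t_stat, dof = welch_t(patients, clinicians)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

The statistic would then be compared against a t distribution with the computed degrees of freedom to obtain a P value; a statistical package would normally handle that step.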
During a period of 90 days, 167 participants (mean [SD] age, 36  years; 52% male) were interviewed: 45 patients, 44 medical students, 41 medical residents, and 37 hospital practicing physicians, all from radiology, cardiology, and internal medicine wards. Of these, 14 patients were not able to adequately make the proposed quantitative transformation to fill in the form and so were excluded from the analysis.
The distribution of probabilities for each word in the valid sample (n = 153) is shown in the Figure. It is noteworthy that while words conveying ideas related to both extremes of probabilities showed narrower ranges of results, those representing intermediate probabilities showed a marked variability among responders. Moreover, no single term covered adequately the range of probabilities between 20% and 50%.
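For readers who want to reproduce this kind of summary on their own data, the sketch below computes the median and interquartile range for one term and flags outliers. It assumes the common 1.5 × IQR rule for outliers, which is not necessarily the rule used for the Figure; the function name `box_plot_summary` and the response values are hypothetical.

```python
import statistics


def box_plot_summary(values):
    """Return median, quartiles, IQR, and outliers for one term's responses.

    Assumption: outliers are values beyond 1.5 * IQR from the quartiles.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {"median": median, "q1": q1, "q3": q3, "iqr": iqr, "outliers": outliers}


# Hypothetical probabilities (percent) marked for a single term; not study data.
responses = [40, 50, 55, 60, 60, 65, 70, 75, 80, 5]
summary = box_plot_summary(responses)
print(summary)
```

A wider interquartile range for a term corresponds to the greater variability among responders described above for the intermediate-probability words.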
The mean (SD) probability of all answers was lower in the patient subgroup than in the other subgroups (45% [11%] vs 49% [4%]; P < .01). Patients’ answers tended to be closer to 50%, ie, they attributed higher probabilities to “never,” “almost never,” and “unlikely,” and lower probabilities to “compatible with,” “likely,” “very likely,” and “certainly” (all P < .05). We found no significant differences when the sample was stratified by sex, age, self-attributed health status, patient origin (inpatient/outpatient), or medical specialty.
We found a high degree of variability in the way language is used and interpreted to attribute probabilities, particularly in the intermediate range, potentially affecting health care provider–patient communication. This finding could, in some respects, correctly represent the range of indeterminate results of diagnostic tests or, in the worst case, show a lack of formal diagnostic reasoning in common practice.
Patients' answers tended to be closer to 50% when compared with those of other groups, which could reflect patients' feelings and fears associated with the presence of disease. Furthermore, the very concept of probability of disease was flawed for some of them, representing a real barrier to communication.
Some study limitations should be addressed. Despite the back translation exercise, differences in results among countries and institutions could emerge from native language use and local practices. Subgroup analysis should also be viewed with caution owing to the limited sample size. Unfortunately, we could not explore additional questions relating to specific cut-offs for each term, multicenter variability, or the reliability of answers.
Although findings such as ours have been described for decades, no real improvement has been detected yet. We suggest testing a more restrictive categorization for the presence of a clinical condition, such as low, intermediate, and high probability. This would simplify the interpretation of results for both patients and physicians and would make explicit the importance of indeterminate test results, representing a more appropriate approach in the light of evidence-based medical diagnosis and decision making.
Correspondence: Dr Foppa, Cardiology Division, Hospital de Clinicas de Porto Alegre, Rua Ramiro Barcelos 2350, Room 2061, Porto Alegre 90035-903, Brazil (email@example.com).
Author Contributions: Dr Foppa had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. Study concept and design: Foppa, Schneider de Araujo, Macari, Reichert, and Goldim. Acquisition of data: Schneider de Araujo, Macari, and Reichert. Analysis and interpretation of data: Foppa. Drafting of the manuscript: Foppa. Critical revision of the manuscript for important intellectual content: Schneider de Araujo, Macari, Reichert, and Goldim. Statistical analysis: Foppa. Obtained funding: Foppa. Administrative, technical, and material support: Schneider de Araujo, Macari, and Reichert. Study supervision: Foppa and Goldim.
Financial Disclosure: None reported.
Funding/Support: This study was supported by an unrestricted grant from FIPE/HCPA (Hospital de Clinicas de Porto Alegre Institutional Research Fund).