Halvorsen PA, Kristiansen IS. Decisions on Drug Therapies by Numbers Needed to Treat: A Randomized Trial. Arch Intern Med. 2005;165(10):1140–1146. doi:10.1001/archinte.165.10.1140
The number needed to treat (NNT) has been promoted as the preferred effect measure when patients and physicians share decision making. Our aim was to explore the impact of the NNT on laypeople’s decisions about preventive drug therapies.
Two thousand subjects were selected for the survey; 1201 (60%) responded, forming a sample broadly representative of the Norwegian population. Respondents were allocated to scenarios with random combinations of a disease to be prevented, drug treatment costs, and effect size in terms of NNT. They were interviewed about their hypothetical consent to the therapy, then randomized to different interpretations of NNT and asked to reconsider their initial responses.
The proportions consenting varied from 76% when the NNT was 50 to 67% when the NNT was 1600 (P for trend = .06). When faced with the prospect of avoiding lethal disease, stroke, myocardial infarction, or hip fracture, the proportions consenting were 84%, 76%, 68%, and 53%, respectively (P<.01). Across different treatment costs ($37, $68, $162, and $589) the proportions consenting varied from 78% to 61% (P for trend <.01). Twenty-four percent of the respondents changed their decision when informed about how to interpret the NNT, and 93% of those switched from positive to negative decisions, regardless of the magnitude of NNT.
Respondents’ decisions were influenced by the type of disease to be prevented and the cost of the intervention, but not by the effect size in terms of NNT. This suggests that NNT is difficult to understand and that other effect formats should be considered for shared decision making.
Since its introduction in 1988,1 the number needed to treat (NNT) has gained wide acceptance as a cognitively useful effect measure for clinical practice.2 Its popularity is probably based on the belief that the NNT conveys both clinical and statistical significance to physicians and their patients in a single, easily comprehended measure.3,4 Somewhat inconsistently with this belief, emerging empirical evidence suggests that laypeople are insensitive to the magnitude of NNT when making decisions about hypothetical interventions. When presented with different NNTs in the interval from 10 to 400, 80% of the respondents stated that they would accept a drug to prevent heart attacks,5 whereas 60% would accept a drug therapy to protect against hip fracture,6 in both cases irrespective of the magnitude of NNT. However, these previous studies have been criticized on the grounds that the scope of effectiveness was too narrow (NNT at a maximum of 400), that the study samples were not entirely representative, or that respondents were not properly randomized.
The NNT is defined as the inverse value of absolute risk reduction,3 but what does an NNT of 50, for example, really mean? A possible answer is that for every patient who benefits from therapy, 49 patients do not.1,7,8 This interpretation implies that the NNT provides a direct measure of the individual’s likelihood of having benefit from a therapy. This is reasonable for lotterylike interventions in which the events to be prevented occur in a truly random fashion. However, for interventions that postpone adverse events rather than completely prevent them, an NNT of 50 may be consistent with the possibility that several or even all of the 50 patients will have some benefit.5 In that case, an NNT of 50 simply means that adverse events are postponed to such an extent that 1 fewer patient (of 50) has had adverse outcomes at the specific point in time when the NNT was measured. Examples of lotterylike interventions might be the use of seat belts and hip protectors, whereas antihypertensives or lipid-lowering drug therapies seem more like postponing interventions. Somewhere between is the use of bisphosphonates or estrogens to protect against hip fractures. These can be regarded as postponing interventions to the extent that the process of osteoporosis is halted, but because hip fractures usually involve an accidental fall, the lottery aspect is also relevant. Unfortunately, we cannot know if or when an individual will experience an adverse event. Consequently, we can never know what would happen to an individual with and without a preventive drug therapy; therefore, postponements and proportions that benefit from an intervention cannot be observed directly.5 Valid interpretations of NNTs for different types of interventions thus seem to be a matter of judgment.
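As a small illustration of the definition above (NNT as the inverse of the absolute risk reduction), the following sketch computes an NNT from two event risks. The function name and the example risks are hypothetical and are not taken from the study.

```python
def number_needed_to_treat(control_risk: float, treated_risk: float) -> float:
    """Return NNT = 1 / ARR, where ARR = control_risk - treated_risk."""
    absolute_risk_reduction = control_risk - treated_risk
    if absolute_risk_reduction <= 0:
        raise ValueError("this sketch only handles a positive absolute risk reduction")
    return 1.0 / absolute_risk_reduction


# Hypothetical risks over the measurement period (illustrative only):
# 10% of untreated and 8% of treated patients have the event.
nnt = number_needed_to_treat(control_risk=0.10, treated_risk=0.08)
print(round(nnt))  # 50
```

Note that the calculation only yields the NNT at one point in time; it says nothing about whether 1 patient in 50 benefits fully or many patients benefit slightly, which is the interpretive question raised in the text.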
Although it seems important that clinicians be aware of different interpretations of NNTs, the issue has attracted little attention in the medical literature and its significance for communicating the benefits of therapy to potential patients has hardly been studied empirically. The objective of this study was to explore whether the previously observed insensitivity to the magnitude of NNT could be extended to a broader range of NNTs and reproduced for different diseases to be prevented, treatment costs, self-reported risk factors, and different interpretations of the NNT.
Statistics Norway, Oslo, regularly performs surveys of the Norwegian population to assess living conditions, demographic variables, and people’s habits, attitudes, and opinions. Specific topics of interest are selected by Statistics Norway or chosen by external institutions (eg, research institutions, governmental departments, public or commercial organizations) that purchase survey questions. In addition, each survey collects information on a fixed set of background variables.
As part of their regular population surveys, Statistics Norway invited a random sample of 2000 individuals for a personal interview between May 6 and June 29, 2002. To ensure a representative sample, the Norwegian population was divided into 109 different strata based on geographic and demographic characteristics. One geographic area from each stratum was randomly selected, and individuals were drawn from these areas with a probability proportional to the size of the population within each area. Noninstitutionalized individuals in the group aged 16 to 79 years were eligible, but otherwise there were no eligibility criteria. Written invitations with general information about the survey were mailed a few weeks before the interview, and the respondents were subsequently contacted by telephone for consent. Face-to-face interviews at the respondents’ homes were encouraged, but telephone interviews were allowed. Data were collected by 126 interviewers with special training. There was no pilot study, but a small proportion of the interviewers tested the questionnaire by simulated interviews, which resulted in minor adjustments of some of the questions. The interviewers used portable computers with preprogrammed questionnaires and registered responses electronically during the interview (ie, computer-assisted interviewing).
For our study, the respondents were presented with a hypothetical clinical scenario with the following wording:
Suppose your physician tells you that you have an increased risk of getting disease x. The physician offers you a drug therapy to prevent it. The drug is to be taken daily and has no serious adverse effects. You need to visit your physician twice a year for follow-up, and the drug therapy will cost you $y per year. The physician informs you that to prevent 1 case of disease x, NNT patients must adhere to the drug therapy for 3 years.
The computer was programmed to assign random values to x (type of disease), y (treatment costs), and NNT, and these values were not known to the interviewer until each interview started. Possible values for disease x were hip fracture, myocardial infarction, stroke, or lethal disease. The diseases were chosen to reflect a spectrum of disease severity and diseases for which preventive drug therapies are offered in clinical practice. The yearly treatment costs were set to represent common preventive drug therapies such as aspirin, hydrochlorothiazide, metoprolol, and finally alendronate sodium or simvastatin and could thus take the values of 250, 460, 1100, or 4000 NKr, respectively ($37, $68, $162, or $589, respectively). The NNTs were set at 50, 100, 200, 400, 800, or 1600. Each respondent was thus presented with a random combination of a disease to be prevented, treatment cost, and NNT. They were then asked the following question: “How likely is it that you would choose to take such a drug?” Possible response categories were certainly, probably, probably not, and certainly not. After the initial response, the interviewer first emphasized that it might be difficult to comprehend the effectiveness of the drug and then offered 1 of the following 3 possible interpretations of the NNT: “(NNT − 1) of the treated patients would have no benefit from the treatment,” “it is unknown whether (NNT − 1) would benefit or not,” and finally “most of the treated patients would benefit in terms of a slight postponement of the disease, but after 3 years of therapy only 1 case of the disease would be prevented.” The respondents were then asked the same question about consent to the drug therapy. The choice of the NNT interpretation was made randomly by the computer during the interview. Possible response categories were the same as for the initial question.
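The random allocation of scenario values could be sketched roughly as follows. The lists of diseases, costs, and NNTs come from the description above; the function names, the use of independent uniform draws, and the seed are assumptions, since the text does not describe how the interview software actually generated the values.

```python
import random

# Values listed in the scenario description.
DISEASES = ["hip fracture", "myocardial infarction", "stroke", "lethal disease"]
YEARLY_COST_USD_BY_NKR = {250: 37, 460: 68, 1100: 162, 4000: 589}
NNT_VALUES = [50, 100, 200, 400, 800, 1600]


def draw_scenario(rng: random.Random) -> dict:
    """Draw one scenario. Uniform, independent draws are an assumption."""
    cost_nkr = rng.choice(list(YEARLY_COST_USD_BY_NKR))
    return {
        "disease": rng.choice(DISEASES),
        "cost_nkr": cost_nkr,
        "cost_usd": YEARLY_COST_USD_BY_NKR[cost_nkr],
        "nnt": rng.choice(NNT_VALUES),
    }


def effect_sentence(scenario: dict) -> str:
    """Fill the final sentence of the scenario wording with the drawn values."""
    return (
        f"The physician informs you that to prevent 1 case of "
        f"{scenario['disease']}, {scenario['nnt']} patients must adhere "
        f"to the drug therapy for 3 years."
    )


scenario = draw_scenario(random.Random(2002))  # arbitrary seed for repeatability
print(effect_sentence(scenario))
```

With 4 diseases, 4 cost levels, and 6 NNT values, this design yields 96 possible scenario combinations.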
We tested the hypotheses that increasing the NNT will reduce the proportion consenting to therapy; that the association between the magnitude of NNT and consent to therapy, if any, is dependent on the type of disease to be prevented, treatment costs, or the presence of self-reported risk factors; and that change in preference for the drug therapy, if any, is dependent on the kind of NNT interpretation provided. The primary outcome was the individuals’ stated consent to therapy. Consent was prespecified as present if respondents answered certainly or probably, and it was absent if the response was otherwise. The secondary outcome, change in decision about drug therapy, was considered to be present if an initial consent or refusal was withdrawn after an interpretation of NNT was provided. Age, sex, place of residence, educational status, and income were selected as secondary independent variables possibly associated with the outcomes of interest.
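The outcome definitions above can be expressed compactly. This is a minimal sketch with hypothetical function names; it assumes responses are stored as the four category labels and that missing responses have already been excluded.

```python
# Response categories offered to respondents.
RESPONSES = ("certainly", "probably", "probably not", "certainly not")
CONSENT_RESPONSES = {"certainly", "probably"}


def consents(response: str) -> bool:
    """Primary outcome: consent is present for 'certainly' or 'probably'."""
    if response not in RESPONSES:
        raise ValueError(f"unknown response category: {response!r}")
    return response in CONSENT_RESPONSES


def changed_decision(initial: str, after_interpretation: str) -> bool:
    """Secondary outcome: an initial consent or refusal was withdrawn."""
    return consents(initial) != consents(after_interpretation)


print(changed_decision("probably", "probably not"))  # True
print(changed_decision("certainly", "probably"))     # False
```

Under this coding, a move between "certainly" and "probably" does not count as a changed decision, because both fall on the consent side of the dichotomy.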
We assessed differences between proportions with χ² tests, including χ² tests for trend when appropriate. First-order interactions between NNT and the other independent variables were tested in multivariate logistic regression models, including NNT, one other variable, and their product term. Also, we used logistic regression analysis to explore the association between consent to therapy and the independent variables. All of these analyses were prespecified in the protocol. Because the number of planned interviews was predetermined by Statistics Norway, no formal power calculation was performed. We used SPSS version 10.0 software (SPSS Inc, Chicago, Ill) for data analysis.
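For readers unfamiliar with a test for trend in proportions, the sketch below implements the Cochran-Armitage form of the statistic in plain Python. It is not the SPSS procedure used in the study, whose exact variant and group scores are not stated here. The counts are illustrative only: equal groups of 200 are an assumption, with consent counts chosen to mimic the rounded percentages reported for the six NNT levels, and the scores 1 to 6 are likewise an assumption.

```python
def chi2_trend(successes, totals, scores):
    """Cochran-Armitage chi-square statistic for trend (1 degree of freedom)."""
    n_all = sum(totals)
    r_all = sum(successes)
    sum_nx = sum(n * x for n, x in zip(totals, scores))
    sum_nxx = sum(n * x * x for n, x in zip(totals, scores))
    sum_rx = sum(r * x for r, x in zip(successes, scores))
    numerator = n_all * (n_all * sum_rx - r_all * sum_nx) ** 2
    denominator = r_all * (n_all - r_all) * (n_all * sum_nxx - sum_nx ** 2)
    return numerator / denominator


# Illustrative counts, NOT the study data (actual group sizes are in Table 1).
totals = [200] * 6
consenting = [152, 142, 140, 142, 136, 134]  # 76%, 71%, 70%, 71%, 68%, 67%
scores = [1, 2, 3, 4, 5, 6]                  # assumed equally spaced scores

print(chi2_trend(consenting, totals, scores))
```

With only two groups scored 0 and 1, this statistic reduces to the ordinary Pearson χ² for a 2 × 2 table, which is a convenient sanity check.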
Of the initial study sample (n = 2000), 464 refused to participate before they had any knowledge about our study questions; 215 could not be reached during the field period (May 6 to June 29, 2002); 14 had emigrated or died; and 106 persons could not participate for other reasons. Thus, 1201 individuals (60%) were randomly assigned to the different NNT groups. Most of the respondents (66.6%) preferred to be interviewed by telephone rather than face-to-face. The proportion female was 48%. Compared with the general population aged 16 to 79 years, the group of individuals 67 years or older was underrepresented (9.6% in the net sample vs 11%), whereas the group aged 25 to 44 years was overrepresented (43% vs 39%).9 Also, the central part of Norway, including the capital city of Oslo, was underrepresented (20% vs 22%), whereas the southwest part of Norway was slightly overrepresented (15% vs 14%).9 Twenty-three respondents did not answer the initial question about consent to the drug therapy, whereas 28 refused to answer this question after an interpretation of NNT was provided. For unknown reasons, 1 respondent was not allocated to any of the interpretations of NNT after the initial question about consent to drug therapy. Participants with missing responses were excluded from the analysis. There were no major imbalances between the different NNT groups (Table 1).
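The participant flow reported above can be verified with simple arithmetic. The variable names in this sketch are hypothetical; the counts are those given in the text.

```python
invited = 2000
non_participants = {
    "refused before seeing the study questions": 464,
    "not reached during the field period": 215,
    "emigrated or died": 14,
    "could not participate for other reasons": 106,
}

respondents = invited - sum(non_participants.values())
response_rate = respondents / invited

print(respondents)                   # 1201
print(round(100 * response_rate))    # 60
```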
The proportion consenting to the drug therapy was greater when the disease to be prevented was more serious, when the treatment costs were lower, or when at least 1 self-reported risk factor was present (Table 2). This was the case before and after an interpretation of the NNT was provided. A weak, nonsignificant trend toward decreasing consent to therapy with increasing NNT was observed (76%, 71%, 70%, 71%, 68%, and 67% for NNTs of 50, 100, 200, 400, 800, and 1600, respectively; χ² test for trend, 3.5; P = .06). After the interpretation of NNT was given, the overall proportion consenting to therapy declined from 70% to 49%, and the trend toward lower consent with increasing NNT disappeared (χ² test for trend, 0.4; P = .54). Of 1172 respondents, 282 (24%) changed their opinion about the drug therapy when provided with an interpretation of the NNT. Two hundred sixty-three (93%) of 282 withdrew their initial consent, whereas 19 (7%) of 282 changed their decision in the opposite direction.
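The reported P values and percentages can be checked approximately from the figures in the text. This sketch relies on the fact that a test for trend has 1 degree of freedom, so the upper-tail probability can be computed with the complementary error function; the function name is hypothetical, and because the published statistics are rounded to one decimal, small differences from the published P values are to be expected.

```python
import math


def p_value_chi2_1df(statistic: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Trend statistics as reported (rounded) before and after the interpretation.
print(round(p_value_chi2_1df(3.5), 2))  # 0.06
print(p_value_chi2_1df(0.4))

# Changes of decision among respondents with complete answers.
answered, changed, withdrew, reversed_to_consent = 1172, 282, 263, 19
print(round(100 * changed / answered))            # 24
print(round(100 * withdrew / changed))            # 93
print(round(100 * reversed_to_consent / changed)) # 7
```

The second P value comes out slightly below the published .54, which is consistent with the input statistic having been rounded to 0.4.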
In logistic regression analysis, male sex was an additional significant predictor of consent to therapy (Table 3). No significant interactions between NNT and the other independent variables were detected; consent to therapy by NNT in different subgroups of treatment costs and diseases to be prevented is provided in Table 4. The magnitude and interpretation of the NNT were not significant predictors for changing opinion about the drug therapy (Table 3).
When considering long-term preventive drug therapies, respondents were insensitive to the magnitude of NNT, even after they were informed about its interpretation, whereas they were sensitive to the type of adverse events and treatment costs. These findings suggest that laypeople have difficulties in understanding the concept of NNT. Although the general population may not be entirely representative of patients, clinicians should probably observe that information about effect measures solely in terms of NNTs may have limited impact on patients’ decisions.
The statistical10-12 and clinical1,3,4,13 properties of the NNT have been extensively described and debated on a theoretical basis, but empirical evidence from clinical practice is sparse. Fahey et al14 compared the use of NNT to absolute risk reduction in a clinical guideline for cardiovascular risk management, but no effect on short-term patient surrogate end points was detected. Other empirical evidence stems from surveys of laypeople,5 patients,15 physicians,7,16-18 and health administrators.19 A consistent finding is the lower proportion of consent to therapy when treatment effects are presented as NNT or absolute risk reduction rather than relative risk reduction. Insensitivity to the magnitude of NNT is present among laypeople,5 but not among physicians.7 The present study adds to and extends the evidence of laypeople’s low sensitivity to the magnitude of NNT, which is reproduced across different diseases, treatment costs, and interpretations of NNT. Similar findings have recently been demonstrated across different adverse effects of preventive drug therapies.20 To our knowledge, there are no similar studies of patients facing real decisions.
Work in the field of cognitive psychology has emphasized heuristics, ie, techniques people use to simplify complex decisions. The availability heuristic implies that people tend to overemphasize issues that are easily brought to mind.21 When exposed to affect-rich outcomes, people tend to be sensitive to deviations from probabilities of zero and one, but insensitive to nonzero probabilities,22 ie, the affect heuristic. Such heuristics might explain why our respondents were sensitive to diseases and costs but not to NNT.
Another important issue seems to be the basic skills of laypeople with numbers (numeracy), which may be quite poor even among well-educated people.23 Positive correlation between numeracy and accuracy of risk perception has been shown.23,24 A survey of medical students demonstrated a high proportion of good numeracy, yet only 25% of them interpreted the NNT correctly compared with 75% for other risk-reduction formats.24 Among patients at a university internal medicine clinic, only 7% made accurate risk estimates on the basis of NNT.25 These works represent a more direct approach to the assessment of people’s understanding of NNTs.
We acknowledge that our study has several limitations. First, the study design did not directly address people’s comprehension of the meaning of NNT, leaving open the possibility that people were unwilling rather than unable to make use of NNTs in their decisions. Unfortunately, we did not include measures of numeracy and literacy as possible covariates. Because respondents were randomized to different NNTs, it is unlikely that such factors confounded our results. Others have shown that, when asked directly, substantial proportions of laypeople are uncertain about how to interpret NNTs20 and that their numerical understanding of NNTs is poor.24,25 Our study added an attempt to explain NNTs to the respondents. In general they became more skeptical of the intervention in question but did not give more weight to the magnitude of NNTs in their decisions.
About 40% of the initial sample did not participate in this study. There were only minor differences, however, between the responders and the general Norwegian population regarding age, sex, and place of residence. Thus, we believe that our respondents are fairly representative of the general population. About 2 of every 3 respondents preferred to be interviewed by telephone. As a consequence, the interviewers could not rely on such factors as eye contact and facial expressions as possible cues to poor understanding of the questions. However, the decision to be interviewed by telephone or at home was made before randomization to the different scenarios. Therefore the mode of interview should not bias our results. Because our finding of insensitivity to NNTs essentially was a null result, formal calculation of sample size in advance would have been helpful in interpreting the results. The confidence intervals (Table 3 and Table 4), however, indicate that NNTs had at most a very modest effect on the respondents’ decisions. The scenarios we used were not extensively tested in pilot studies. Important cues present in all the scenarios might thus have influenced the responses so as to mask the real effect of NNT, eg, that the drug therapy was proposed by the physician or that adverse effects were not specified beyond the notion that they were not serious. The high proportion consenting could thus reflect laypeople’s trust in their physicians21 or the perception that there was not much to lose. However, because the respondents discriminated between different diseases and treatment costs, we find it unlikely that such factors can explain the insensitivity to the magnitude of NNT.
In our opinion, the body of empirical evidence suggests limited ability rather than limited willingness to make use of NNTs. Previous studies have shown that patients may be more able to understand risks in terms of natural frequencies or visual risk representations than in terms of probabilities or percentages such as absolute and relative risks.26 Expressing treatment effects in terms of natural frequencies might thus be a better option. When the benefit of an intervention is judged to be in terms of postponement rather than complete prevention, informing people directly about these postponements might be a promising strategy. One might say, eg, “On average, this drug therapy postpones heart attacks by x months.” Emerging empirical evidence indicates that laypeople are more sensitive to such effect measures than to NNTs.6,27
Notwithstanding the limitations, in this study laypeople gave almost no weight to effect size in terms of NNTs when considering long-term preventive drug therapies. In this context, the NNT may have limited value as a communication tool. Therefore, clinicians may do well to use NNT with caution when informing patients about the benefits of medical interventions.
Correspondence: Peder A. Halvorsen, MD, Svartaksveien 15, 9516 Alta, N-Norway (firstname.lastname@example.org).
Accepted for Publication: December 31, 2004.
Financial Disclosure: None.
Funding/Support: This study was supported by governmental funds held by the University of Tromsø, Tromsø, Norway, and dedicated to medical research in Northern Norway.
Previous Presentation: This study was presented at the 13th Nordic Congress of General Practice; September 2, 2003; Helsinki, Finland; and at the 9th Biennial Conference of the European Society for Medical Decision Making; June 8, 2004; Rotterdam, the Netherlands.