Figure 1.
Percentile distribution of modified Rankin Scale (mRS) scores at day 90 among 312 patients in the tissue plasminogen activator (tPA)–treated group and 312 patients in placebo group, combined data from National Institute of Neurological Disorders and Stroke–Tissue Plasminogen Activator trials 1 and 2.

Figure 2.
Joint outcome distribution tables for model 100–patient population. Outcome under placebo therapy is indicated in rows, under thrombolytic therapy in columns. A, Distribution at start of expert session, with all patients along diagonal in placebo outcome array. B, Distribution at end of one expert's session, with individual patients redistributed to yield thrombolytic therapy outcome distribution. Patients shifted left, in cells shaded green, have improved because of therapy; patients shifted right, in cells shaded red, have worsened because of therapy. For example, values in the modified Rankin Scale (mRS) score row 4 indicate that of 20 patients destined for mRS outcome stratum 4 under placebo therapy, 3 attain mRS outcome stratum 1 with thrombolysis (cell row mRS 4, column mRS 1), 1 attains mRS outcome stratum 2 (cell row mRS 4, column mRS 2), 4 attain mRS outcome stratum 3 (cell row mRS 4, column mRS 3), 11 attain mRS outcome stratum 4 (cell row mRS 4, column mRS 4) and 1 attains mRS outcome stratum 6 (cell row mRS 4, column mRS 6). Adding all left-shifted (green cell) patients indicates that 35 of 100 patients had better outcome as a result of treatment, yielding individual expert estimate for the number needed to treat for benefit of 2.9. Adding all right-shifted (red cell) patients indicates that 4 per 100 patients have worsened because of therapy, yielding an individual expert estimate for the number needed to harm of 25. tPA indicates tissue plasminogen activator.

Table 1. 
Modified Rankin Scale
Table 2. 
Tissue Plasminogen Activator Under 3 Hours—NNT to Achieve Benefit or Harm
Original Contribution
July 2004

Number Needed to Treat Estimates Incorporating Effects Over the Entire Range of Clinical Outcomes: Novel Derivation Method and Application to Thrombolytic Therapy for Acute Stroke

Author Affiliations

From the Stroke Center and Department of Neurology, UCLA School of Medicine, Los Angeles, Calif.

Arch Neurol. 2004;61(7):1066-1070. doi:10.1001/archneur.61.7.1066
Abstract

Background  Number needed to treat (NNT) is a useful measure of a treatment's clinical benefit or harm. However, NNT estimates for treatments for neurologic conditions have previously been generated only for dichotomized functional outcomes, which may underestimate clinically relevant treatment effects.

Objectives  To develop a method for estimating NNTs for nonbinary outcomes from parallel design clinical trials and to illustrate its application to outcomes of fibrinolytic stroke therapy across the full range of the modified Rankin Scale (mRS) of disability.

Methods  Expert generation of joint distribution outcome tables in a model population affords a novel means to derive NNTs for nonbinary end points. Using mRS distributions from the National Institute of Neurological Disorders and Stroke–Tissue Plasminogen Activator trials, 10 neurologist and emergency physician acute stroke care experts independently specified the joint distribution of outcomes in model samples of 100 patients assigned to placebo and active therapy.

Results  The average estimated NNT for 1 additional patient to have a better outcome by 1 or more grades on the mRS as a result of treatment was 3.1 (95% confidence interval, 2.6-3.6). The estimated number needed to harm was 30.1 (95% confidence interval, 25.1-36.0). Expert estimates were robust across alternative stratifications of the mRS, with the NNT for benefit on 6- and 5-rank versions of 3.3 and 3.7 and the number needed to harm of 56.6 and 100.0, respectively.

Conclusions  Expert generation of joint distribution outcome tables enables NNT estimation across a full spectrum of nonbinary outcomes. For every 100 patients with acute stroke treated with tissue plasminogen activator, approximately 32 have a better final outcome and 3 have a worse final outcome as a result of treatment.

Number needed to treat (NNT) is a widely accepted, statistically valid, and clinically useful measure of treatment effect.1,2 However, methods for deriving NNT estimates were initially described only for binary outcomes. When outcomes are ordinal or continuous, rather than binary, dichotomizing end points reduces outcome information and may lead to underestimation of clinically relevant treatment effects.3-5

Treatments for brain injury are generally not curative. Rather, successful therapies improve a patient's final functional status along a broad continuum from fully normal through symptomatic but independent, dependent but ambulatory, dependent and nonambulatory, persistently vegetative, and dead. In numerous neurologic conditions, including multiple sclerosis (Kurtzke Expanded Disability Status Scale), traumatic brain injury (Glasgow Outcome Scale), and acute stroke (modified Rankin Scale [mRS]), standard outcome measures assign patients to 1 of several strata in an ordered hierarchy of functional outcomes. Interest in assessing health-related quality of life has recently further increased the use of nonbinary outcome measures, both within neurology and across a wide range of general medical conditions. To inform clinician and patient decision making, methods for estimating NNTs across the full range of clinically salient outcomes are urgently needed.

Outcome from acute stroke is a prototypical example. Previous analyses of NNT for benefit from treatment with tissue plasminogen activator (tPA) within 3 hours of acute ischemic stroke have been calculated for dichotomized outcomes only. For instance, based on data from the National Institute of Neurological Disorders and Stroke–Tissue Plasminogen Activator (NINDS-tPA) trials,6,7 the NNT for tPA treatment to avert 1 case of dependence or death after stroke, defined as an mRS score of 2 or more, is 8.4. However, these data for a simple, dichotomized mRS end point are likely to underestimate the beneficial effect of tPA treatment, failing to capture more fine-grained, but still clinically meaningful, improvements. Consideration of the full range of mRS outcomes (Table 1) shows that several clinically worthwhile improvements are missed in this analysis, such as achieving no symptoms at all rather than slight disability (mRS score, 0 vs 2), or moderate disability rather than death (mRS score, 3 vs 6).

The objective of this study was to develop a method for estimating NNTs for ordinal outcomes in parallel-group design clinical trials and to use this method to calculate NNTs for benefit and harm from tPA as assessed by treatment-related improvement or worsening at least 1 functional grade on the mRS of global disability.

Methods
GENERAL STRATEGY FOR DERIVING NNTs

Methods to calculate exact NNTs for ordinal or continuous end points have recently been developed.3,5 However, a key variable in the required formulas is the within-patient correlation—the degree to which the rank order of patient outcomes is similar under control vs active therapy. The within-patient correlation is specified precisely by observational data in any paired design, including crossover design clinical trials, thereby allowing exact calculation of NNTs. However, in parallel-group trials, within-patient correlation is not fully specified by study data. Accordingly, estimation of the within-patient correlation must be made or, equivalently, the joint distribution of the outcome score under control vs active therapy must be specified.

Techniques for estimating the within-patient correlation in parallel-group design, randomized, controlled clinical trials have not been previously well developed. This lack has been a major barrier to estimating NNTs for many therapies, as most pivotal clinical trials use a parallel-group design rather than a crossover design. One approach has been to make the simplifying assumption that the within-patient correlation is nil,3 but this assumption is biologically implausible for most neurologic conditions, where the outcome a patient would have receiving placebo is often related to the outcome that patient would have receiving active therapy. Another approach has been to take an observation of within-patient variance available from previous paired or crossover trials in a particular condition and apply it to parallel-group design trials enrolling patients with the same condition.5 This approach is sound when relevant data are available, but it is rare for any preceding paired trial data to be available for many neurologic conditions.

Using disease experts to estimate within-patient correlation is an appealing strategy. Knowledgeable clinicians are familiar from extensive practice experience with numerous individual patient outcomes under control and active therapy. However, translating this experience into an informed estimate of within-patient correlation is not straightforward. It is difficult for experts to simply state as a global judgment an estimated correlation coefficient value for within-patient correlation.

The alternative approach developed in this study is to ask experts to complete a joint distribution table of individual patient outcomes for a model population of 100 patients. Expert population of the joint distribution table automatically specifies the within-patient correlation. The table is completed by iterative redistribution of individual patients from their destined outcomes under control therapy to their destined outcomes under active therapy, judgments that accord with traditional bedside experience.

The NNTs may be calculated straightforwardly from the resulting joint distribution table. For a given joint distribution table of X, the score under placebo, and Y, the score under active treatment, the distribution of D = X − Y is determined nonparametrically as Pr(D = d) = Σ [Pr(X = j)] [Pr(Y = d − j)] for j equals 0 to d. By definition, NNT = 1/[Pr(X ≥ d) − Pr(Y ≥ d)] = 1/Pr(D ≥ d), where Pr(D ≥ d) is the proportion of the differences greater than or equal to a specified difference d. In this study, d equals 1. This approach does not require X, Y, or D to be continuous or to follow any parametric distribution; in this study, X and Y are integers of 0 or larger. Similarly, number needed to harm (NNH) is defined as NNH = 1/Pr(D ≤ −d).
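As a rough illustration of this calculation, the Python sketch below tallies the cells of a joint distribution table in which rows index the outcome under placebo (X) and columns index the outcome under active therapy (Y), with lower scores being better. The function name nnt_nnh and the 3-category table are hypothetical and chosen only for illustration; the table is not trial data.

```python
def nnt_nnh(joint, d=1):
    """Tally benefit and harm from a joint distribution table.

    joint[x][y] is the number of model patients whose outcome score is x
    under placebo and y under active therapy (lower score = better).
    With D = X - Y, cells with D >= d count as benefit and cells with
    D <= -d count as harm.
    """
    total = sum(sum(row) for row in joint)
    better = sum(n for x, row in enumerate(joint)
                 for y, n in enumerate(row) if x - y >= d)
    worse = sum(n for x, row in enumerate(joint)
                for y, n in enumerate(row) if x - y <= -d)
    nnt = total / better if better else float("inf")
    nnh = total / worse if worse else float("inf")
    return better, worse, nnt, nnh


# Hypothetical 3-category table for 100 model patients (NOT trial data).
toy_table = [
    [20, 0, 0],    # placebo outcome 0: all unchanged
    [10, 25, 5],   # placebo outcome 1: 10 improve, 5 worsen
    [5, 15, 20],   # placebo outcome 2: 20 improve
]

better, worse, nnt, nnh = nnt_nnh(toy_table)
print(better, worse, round(nnt, 1), round(nnh, 1))  # 30 5 3.3 20.0
```

The same arithmetic underlies the single-expert example in Figure 2: 35 improved patients per 100 gives 100/35, or about 2.9, and 4 worsened patients per 100 gives 100/4, or 25.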

APPLICATION TO INTRAVENOUS tPA STROKE THERAPY

Treatment and placebo outcomes for all mRS strata from NINDS-tPA Study trials 1 and 2 were combined into 1 data set for analysis (Figure 1). Ten neurologist and emergency physician experts in acute stroke care independently specified the joint distribution of outcomes in a model sample of 100 patients assigned to placebo and active therapy. Each panel member was given a spreadsheet (Excel; Microsoft Corp, Seattle, Wash) displaying the following: (1) definitions of each mRS outcome category, (2) the distribution of mRS outcomes in the placebo and tPA treatment groups in the NINDS-tPA studies, rounded to the nearest integer, and (3) the rates of symptomatic intracerebral hemorrhage in the placebo and tPA treatment groups. In the center of the spreadsheet was a joint distribution table of outcomes, initially with all 100 model patients arrayed in placebo distribution cells (Figure 2A). The panel member redistributed patients iteratively to complete the joint distribution table, under the instruction to specify the joint distribution most likely to occur among a typical group of 100 patients who are treated with tPA and who match the NINDS-tPA study population (Figure 2B). Members first filled cells via improved outcomes to achieve the observed NINDS-tPA study distribution, then filled cells to capture worsened outcomes, and then added to cells via improved outcomes to reachieve the target-observed NINDS-tPA study distribution. Members were then asked to globally reexamine all cells and readjust the joint distribution, if needed, to achieve maximum biological plausibility, constrained by maintaining the observed trial group outcomes (the marginal distributions). 
Each expert's estimate of the proportion of patients per 100 experiencing benefit or harm from tPA treatment compared with placebo across the entire mRS was calculated by adding all off-diagonal cells in the appropriate direction (d≥1 for NNT or d≤−1 for NNH) from the expert-specified joint distribution table.

A clear gradient of desirability distinguishes mRS strata 0, 1, 2, 3, and 6. In contrast, utility preference studies suggest that a minority of individuals consider a severely disabled outcome from stroke to be an equal or even worse result than death.8,9 Therefore, 2 additional NNT and NNH calculations were performed on the expert-specified joint outcome distribution tables: (1) a 6-rank analysis, collapsing mRS strata 5 and 6 together into a single worst-outcome category, and (2) a 5-rank analysis, collapsing mRS strata 4, 5, and 6 together into a single worst-outcome category.
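The collapsing step can be sketched in Python as follows. The helper collapse and the constants SIX_RANK and FIVE_RANK are hypothetical names. Only the mRS 4 row is filled in, using the counts quoted in the Figure 2 legend for one expert; the remaining rows are left at zero because their cell values are not given in the text.

```python
def collapse(joint, groups):
    """Merge outcome strata of a square joint distribution table.

    groups lists the original stratum indices that form each new
    category, ordered from best to worst outcome.
    """
    new_index = {s: g for g, members in enumerate(groups) for s in members}
    k = len(groups)
    merged = [[0] * k for _ in range(k)]
    for x, row in enumerate(joint):
        for y, n in enumerate(row):
            merged[new_index[x]][new_index[y]] += n
    return merged


SIX_RANK = [[0], [1], [2], [3], [4], [5, 6]]   # mRS 5 and 6 merged
FIVE_RANK = [[0], [1], [2], [3], [4, 5, 6]]    # mRS 4, 5, and 6 merged

# mRS 4 row from the Figure 2 legend: of 20 patients, 3 reach mRS 1,
# 1 reaches mRS 2, 4 reach mRS 3, 11 stay at mRS 4, and 1 reaches mRS 6.
# All other rows are zero placeholders, not data.
table = [[0] * 7 for _ in range(7)]
table[4] = [0, 3, 1, 4, 11, 0, 1]

six = collapse(table, SIX_RANK)
five = collapse(table, FIVE_RANK)
print(six[4])   # [0, 3, 1, 4, 11, 1]
print(five[4])  # [0, 3, 1, 4, 12]
```

In the 5-rank version the patient who moved from mRS 4 to mRS 6 falls on the diagonal and is no longer counted as harmed, which is one way collapsing strata can raise the NNH.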

Number needed to treat and NNH values were obtained from each of the 10 experts. The geometric mean and corresponding 95% confidence interval (CI) for NNT and NNH across these 10 experts were calculated, using the sample standard deviation. (The logarithms of NNT and NNH are better modeled as a gaussian distribution because the untransformed NNT and NNH become large as the risk difference gets small.)
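One plausible reading of this summary step is sketched below in Python: the mean and sample standard deviation are computed on the log scale, a t-based interval is formed, and the limits are back-transformed. The use of a standard error with a t critical value (2.262 for 9 degrees of freedom) is an assumption, as the text does not state the exact interval formula. The 10 NNT values are hypothetical placeholders, not the experts' actual estimates.

```python
import math
from statistics import mean, stdev


def geometric_mean_ci(values, t_crit):
    """Geometric mean and back-transformed CI computed on the log scale.

    Assumes the interval is mean +/- t_crit * SD / sqrt(n) on the logs,
    where SD is the sample standard deviation.
    """
    logs = [math.log(v) for v in values]
    center = mean(logs)
    half_width = t_crit * stdev(logs) / math.sqrt(len(logs))
    return (math.exp(center),
            math.exp(center - half_width),
            math.exp(center + half_width))


# Hypothetical per-expert NNT values (placeholders, NOT the study data).
hypothetical_nnts = [2.9, 2.5, 3.0, 3.3, 3.6, 2.8, 3.2, 3.4, 3.1, 3.7]
T_CRIT_9DF = 2.262  # two-sided 95% critical value of Student t, 9 df

gm, lower, upper = geometric_mean_ci(hypothetical_nnts, T_CRIT_9DF)
print(f"geometric mean {gm:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

Working on the log scale keeps the interval positive and makes it asymmetric around the geometric mean after back-transformation.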

Results

The distributions of mRS outcomes in placebo and treatment groups of the NINDS-tPA trials 1 and 2 are shown in Figure 1. The mean (SD) mRS score in the tPA treatment group was 2.66 (2.13) and in the placebo group 3.19 (2.00). The mean (SD) difference in the mRS score was 0.53 (2.92).

Results of NNT to benefit and NNH calculations are given in Table 2. For the full, 7-category mRS, the NNT for 1 additional patient to have a better outcome by 1 or more grades than he or she would have had with placebo was 3.1 (95% CI, 2.6-3.6). This estimate was robust across alternative stratifications of the mRS, with the NNT for benefit on 6- and 5-rank versions of 3.3 and 3.7, respectively. The estimated NNH was 30.1 (95% CI, 25.1-36.0; SD, 9.0). In the alternative 6- and 5-grade stratifications of the mRS, the NNH estimates were 56.6 and 100, respectively.

Comment

Patients with brain disease and their families commonly value a wide range of transitions in outcome states as desirable. They can make the most informed treatment decisions when provided with risk-benefit data regarding therapeutic options that reflect treatment effects across the entire range of outcomes they value. Dichotomizing end points, while computationally convenient, artificially privileges a single transition in outcome states as the only clinically meaningful potential effect of treatment and typically underestimates the true, clinically relevant treatment effect. The method for estimating NNTs for nonbinary end points used in this study is widely applicable to parallel-group design trials evaluating treatments for any medical condition in which key end points are ordinal. Expert panel members without extensive statistical training reported no difficulty in engaging in the redistribution of outcomes task, which required no mathematical formula and was analogous to typical clinical reasoning. Sessions with expert panel members proceeded promptly, generally lasting only 15 to 20 minutes.

The NNT and its inverse, the absolute risk difference, are particularly useful indices of treatment effect, as they express risk and benefit in a manner that accords with natural clinical decision making.1,2 The application to fibrinolytic therapy for acute stroke illustrates the additional perspective afforded by this type of NNT calculation. The full range of outcomes analysis indicated that the expected number of patients with acute stroke needed to treat with tPA to achieve 1 additional beneficial outcome is 3.1, in contrast to an NNT of about 8 in dichotomized analyses. Almost one third of all patients receiving tPA treatment have an improvement in outcome as a result. Clinicians, policy makers, and authors of treatment guidelines should be aware that prior estimates of NNT for tPA treatment in acute stroke, based on dichotomized outcomes, have substantially underestimated the benefits of this therapy.

Estimates of the NNH, in terms of producing worse final outcome from stroke, have not previously been advanced for fibrinolytic stroke therapy. The primary mechanism by which thrombolytic stroke therapy may cause individual patients to have worse outcomes is hemorrhagic transformation of cerebral infarction. How frequently hemorrhagic transformation alters final outcome, however, has not been explicitly defined by clinical trial data. Most occurrences of hemorrhagic transformation are asymptomatic. Other patients have a mild transient worsening in their neurologic deficit caused by hemorrhagic transformation, but their final functional outcome is unaffected.10 Patients who have a mildly worse final functional outcome as a result of hemorrhagic transformation may not be captured by dichotomized analyses of end points. While 1 in 17 patients in the NINDS-tPA cohort had hemorrhagic transformation temporally associated with some degree of early neurologic worsening, the expert panel judged that the NNH for tPA treatment in acute stroke is 30.1 for the more clinically salient outcome of worse final global disability grade 3 months after stroke.

An advantage of this full range of outcomes analysis is that it allows more direct comparison of the NNT to yield benefit and the NNT to yield harm along the same functional outcome scale. Prior risk-benefit analyses of thrombolytic stroke therapy required clinicians and patients to contrast dissimilar outcome measures. In contrast, the results of the expert panel analysis provide directly comparable benefit and harm indices. For patients matching the populations of the NINDS-tPA trials, the NNT with tPA for 1 patient to have a better global disability outcome is 3.1 and the NNT for 1 patient to have a worse global disability outcome is 30.1. For every 100 patients treated with tPA, approximately 32 will have a better final outcome and 3 a worse final outcome as a result of treatment.
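The per-100 figures follow from inverting the NNT and NNH, as the short Python sketch below shows. The function name per_100 is a hypothetical helper.

```python
def per_100(number_needed):
    """Convert an NNT or NNH into expected patients affected per 100 treated."""
    return 100.0 / number_needed


benefit = per_100(3.1)    # NNT for a better global disability outcome
harm = per_100(30.1)      # NNH for a worse global disability outcome
print(round(benefit), round(harm))  # 32 3
```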

Several precautions were taken in this analysis to ensure that the NNTs calculated were for outcome differences that are clinically salient. The mRS outcome measure was used. As a global measure of disability, the mRS offers the most comprehensive measure of functional outcome among the several outcome measures routinely used in clinical trials of acute stroke. For this reason, it has been frequently used as a primary end point in stroke trials and has been adopted by the Cochrane Collaboration as the most important measure for analysis when performing meta-analyses of results across trials. The mRS assigns patients to 7 broad functional ranks. With extremely fine-grained scales, such as the 42-rank National Institutes of Health Stroke Scale or the 20-rank Barthel Index, differences between adjacent rank outcomes may not be clinically important for the patient or their family. In contrast, differences among the 7 ranks in the mRS have clear and substantial clinical importance. Moreover, to ensure the clinical meaningfulness of the findings, alternative stratifications of the mRS were analyzed, merging strata that a minority of patients does not recognize as differentially desirable, with little resulting alteration in the NNT results.

Approaches to determining NNTs that reflect treatment effects across the entire spectrum of clinically relevant outcomes merit widespread application to neurologic diseases to facilitate more informed decision making by patients, patient families, and physicians. The expert panel method delineated here provides a means to ascertain NNTs from the parallel-group design trials using ordinal measures of outcome that provide the foundation for many therapies in neurologic practice.

Article Information

Correspondence: Jeffrey L. Saver, MD, UCLA Stroke Center, 710 Westwood Plaza, Los Angeles, CA 90095 (jsaver@ucla.edu).

Accepted for publication March 3, 2004.

This study was supported in part by award K24 NS 02092-01 from the National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, Md.

I thank the NINDS-tPA Study Trialists for making detailed mRS outcome data available for this analysis; Jeffrey Gornbein, PhD, for statistical consultation; and the members of the beta test and final expert panel—Greg Albers, MD; Stanley Cohen, MD; Phil Gorelick, MD (beta test); James Grotta, MD; Steven Levine, MD; David Liebeskind, MD; Helmi Lutsep, MD; Phil Scott, MD; Sidney Starkman, MD; and Janet Wilterdink, MD.

I have received speaking honoraria for talks on acute stroke therapy from Genentech Inc, South San Francisco, Calif (none in the past 3 years); serve on a scientific advisory board on secondary stroke prevention for Boehringer Ingelheim, Ridgefield, Conn; have served as a site investigator in National Institutes of Health–funded trials of fibrinolysis for which Genentech Inc supplied study agent; have served as a site investigator in nonfibrinolytic trials sponsored by Boehringer Ingelheim; and have served as a medicolegal expert on acute stroke care.

An Excel and a Word (Microsoft Inc) file containing a more detailed, step-by-step example of the process of expert specification of a joint distribution table of outcomes is available from me on request.

References
1. Cook RJ, Sackett DL. The number needed to treat: a clinically useful measure of treatment effect. BMJ. 1995;310:452-454.
2. McAlister FA, Straus SE, Guyatt GH, Haynes RB; for the Evidence-Based Medicine Working Group. Users' guides to the medical literature, XX: integrating research evidence with the care of the individual patient. JAMA. 2000;283:2829-2836.
3. Guyatt GH, Juniper EF, Walter SD, Griffith LE, Goldstein RS. Interpreting treatment effects in randomised trials. BMJ. 1998;316:690-693.
4. Duncan PW, Jorgensen HS, Wade DT. Outcome measures in acute stroke trials: a systematic review and some recommendations to improve practice. Stroke. 2000;31:1429-1438.
5. Walter SD. Number needed to treat (NNT): estimation of a measure of clinical benefit. Stat Med. 2001;20:3947-3962.
6. NINDS rt-PA Stroke Group. Tissue plasminogen activator for acute ischemic stroke. N Engl J Med. 1995;333:1581-1587.
7. Wardlaw JM, del Zoppo G, Yamaguchi T. Thrombolysis for acute ischaemic stroke [Cochrane Review on CD-ROM]. Oxford, England: Cochrane Library, Update Software; 2000:CD000213.
8. Samsa G, Matchar D, Goldstein L, et al. Utilities for major stroke: results from a survey of preferences among persons at increased risk for stroke. Am Heart J. 1998;136:703-713.
9. Lai SM, Duncan PW. Stroke recovery profile and the modified Rankin assessment. Neuroepidemiology. 2001;20:26-30.
10. von Kummer R. Brain hemorrhage after thrombolysis: good or bad? Stroke. 2002;33:1446-1447.