Effect of adding extra risk indicators on the attributable fraction (AF), exposure rate (ER), and number needed to be treated (NNT).
Smit F, Ederveen A, Cuijpers P, Deeg D, Beekman A. Opportunities for Cost-effective Prevention of Late-Life Depression: An Epidemiological Approach. Arch Gen Psychiatry. 2006;63(3):290-296. doi:10.1001/archpsyc.63.3.290
Context
Clinically relevant late-life depression has a prevalence of 16% and is associated with substantial societal costs through its disease burden and unfavorable prognosis. From the public health perspective, depression prevention may be an attractive, if not imperative, means to generate health gains and reduce future costs.
Objective
To target high-risk groups for depression prevention such that maximum health gains are generated against the lowest cost.
Design
Population-based cohort study over 3 years.
Setting
General population in the Netherlands.
Participants
Twenty-two hundred community residents aged 55 to 85 years. Of these, 1925 were not depressed at baseline.
Main Outcome Measure
The onset of clinically relevant depression was measured with the Center for Epidemiological Studies Depression Scale. For each of the risk factors (and their combinations), we calculated indices of potential health gain and the effort (costs) required to generate those health gains.
Results
One in every 5 cases of clinically relevant late-life depression is a new case. Consequently, depression prevention has to play a key role in reducing the influx of new cases. This is best done by directing prevention efforts toward elderly people who have depressive symptoms, experience functional impairment, and have a small social network, in particular women, as well as people who have attained only a low educational level or who suffer from chronic diseases.
Conclusions
Directing prevention efforts toward selected high-risk groups could help reduce the incidence of depression and is likely to be more cost-effective than alternative approaches. This article further shows that we have the methodology at our disposal to conduct ante hoc cost-benefit analysis in preventive psychiatry. This helps set a rational research and development agenda before testing the cost-effectiveness of interventions in time-consuming and expensive trials.
Late-life depression is characterized by high prevalence, unfavorable prognosis, reduced quality of life, and excess mortality.1-8 It is also associated with substantial societal costs.9-12 Late-life depression is further characterized by a large annual influx of new cases because we have found that 1 in every 5 cases is a new case. From the public health perspective, depression prevention may thus be an attractive, if not imperative, means to generate health gains in the population and to reduce future costs.13
In this context, it should be noted that depression is a treatable condition.14-16 However, according to a recent estimate, the total disease burden associated with depression can only be reduced about 50%, even under a hypothetical regime of optimal evidence-based treatment.17 This is another reason why prevention has to play an important role in public health. Recently, a meta-analysis of randomized trials of preventive interventions has shown that the incidence of depressive disorder can be reduced by 30%, and this may indicate that prevention is a viable option.18
However, developing preventive interventions and testing their cost-effectiveness in randomized trials is time-consuming and expensive.19 Therefore, one would like to be able to estimate the cost-effectiveness of future interventions at the earliest possible stage in the development and evaluation cycle and target research efforts where they are likely to generate optimal yields. The aim of this article is to describe a methodology that could help identify cost-effective preventive interventions at the earliest possible stage and apply this methodology to the case of late-life depression.
The methodology of identifying high-risk groups for prevention is not new,20,21 but in the field of psychiatric epidemiology and prevention research, it has rarely been applied. The reason for this omission is that this methodology requires longitudinal data on the incidence of the disorder and its putative risk indicators in the general population. These data are often not available, but once there, they offer a wealth of information and can be employed to set a rational research agenda in the field of preventive psychiatry.
The analyses were based on the data of the first 2 waves of the Longitudinal Aging Study Amsterdam. The sampling and procedures of this study have been described in the ARCHIVES in detail.22 At baseline, we interviewed 3056 community residents aged 55 to 85 years. Participating subjects gave their informed consent and underwent face-to-face interviews in their homes. The random sample was stratified by age and sex. The older age strata and men were oversampled in anticipation of higher attrition rates among these groups during the course of the study. After 3 years (mean ± SD time, 1115 ± 59 days), 2200 subjects (72%) were successfully reinterviewed. Loss to follow-up had occurred among 856 subjects, mainly because subjects were too ill or were no longer alive at the time of the first follow-up. Predictors of loss to follow-up were older age, male sex, lower education, functional limitations, chronic diseases, and cognitive decline, but not depression status at baseline.22 Corrective weights were used to account for the joint effect of intentional oversampling and accidental attrition (see the subsection “Analysis”).
Depression was ascertained with the Center for Epidemiological Studies Depression Scale (CES-D).23 It consists of 20 items, and its total score ranges from 0 to 60. Scores greater than or equal to 16 indicate clinically significant levels of depressive symptoms.24 In the remainder of this article, CES-D scores of 16 or greater will be referred to as “depression.” At this cutoff, sensitivity is 100% and specificity is 88% for major depressive disorder in the elderly Dutch population.25 Measurements were taken at baseline (t0) and at first follow-up (t1). A person was deemed to be an incident case when 3 criteria were met: (1) absence of depression at t0 (CES-D < 16), (2) presence of depression at t1 (CES-D ≥ 16), and (3) significant change between t0 and t1 (change of CES-D score ≥ 5).
Criterion 1 was used to ensure that the analysis was restricted to the group at risk, criterion 2 to ascertain depression status at t1, and criterion 3 to prevent false-positive cases due to measurement error in the CES-D. For the third criterion, we chose a minimum change of 5 CES-D points because it represents, in clinical terms, a medium to large change26 and has the advantage that it has also been used in other studies.22 In addition, a change of 5 scale points on the CES-D is greater than 3.5, which on this scale corresponds to the threshold for statistically reliable change.27 In short, a person was deemed to be an incident case when there was a change of 5 points or more, thereby crossing the cutoff of 16.
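The three criteria can be expressed as a small check. This is a minimal sketch, not the study's code: the function and constant names are hypothetical, and it assumes that the "change" in criterion 3 means an increase in the CES-D score from t0 to t1.

```python
CUTOFF = 16      # CES-D score at or above which a person counts as depressed
MIN_CHANGE = 5   # minimum increase between t0 and t1 (criterion 3)


def is_incident_case(cesd_t0: int, cesd_t1: int) -> bool:
    """Return True when all three incidence criteria from the text are met."""
    not_depressed_at_t0 = cesd_t0 < CUTOFF               # criterion 1
    depressed_at_t1 = cesd_t1 >= CUTOFF                  # criterion 2
    significant_change = (cesd_t1 - cesd_t0) >= MIN_CHANGE  # criterion 3
    return not_depressed_at_t0 and depressed_at_t1 and significant_change


if __name__ == "__main__":
    print(is_incident_case(10, 17))  # rise of 7 points across the cutoff
    print(is_incident_case(14, 17))  # crosses the cutoff, but rises only 3 points
    print(is_incident_case(18, 25))  # already depressed at t0, so not at risk
```

Only the first call satisfies all three criteria; the second fails criterion 3 and the third fails criterion 1.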
Following the vulnerability-stress theory28 and a recently published review on risk indicators of late-life depression,29 the following putative risk indicators were included. The demographics were female sex (1 = female, 0 = male); age older than 65 years, the age at which 30% of the sample makes a significant transition in their life because of retirement (1 = older than 65 years, 0 = younger); low education (dichotomized into 1 = elementary school, 0 = high school and higher); and living in an urban environment (1 = living in Amsterdam, 0 = living elsewhere).
We also included chronic illnesses30 (dichotomized, 1 = 2 or more, 0 = 1 or none), such as diabetes mellitus, chronic obstructive lung disease, cardiac disease, arthritis of knee or hip, and cancer, and cognitive impairment31 (1 = Mini-Mental State Examination score < 24, 0 = Mini-Mental State Examination score of 24-30). Earlier studies have indicated that it is not so much the presence of chronic medical conditions that predicts the onset of depression, but rather the functional limitations that may stem from them, the subjective appraisal of one's health, and the degree to which one's sense of mastery (locus of control) is affected.4,32,33 Therefore, the following measures were also included: functional limitations34 (1 = 1 or more, 0 = none), self-rated poor health35 (1 = poor health; 0 = sometimes good/sometimes bad, fair, good, or excellent health), and low mastery36 (1 = score below the 50th percentile on the scale, 0 = score above 50th percentile). Depressive symptoms (1 = CES-D scores between 5 and 15; 0 = CES-D < 5, ie, below 50th percentile) at baseline were also relevant because they can act as precursors of a CES-D score of 16 or greater. The distribution of the CES-D is as follows: 25% of the population falls in the range of 0 to 2, 50% in 0 to 5, 75% in 0 to 10, and 90% in the range of 0 to 16.
Finally, social vulnerability was assessed by 2 additional measures: small social network (1 = below, 0 = above the median social network size of 13 persons) and widowhood (1 = ever widowed, 0 = other). All risk indicators were measured at t0 and were coded 1 as the index category for the (presumably) elevated risk status and 0 for the reference category. Dichotomization was carried out prior to the analysis.
Analyses took into account that the data were generated by a sampling design with intentional oversampling of men and the older age strata and loss to follow-up. This was done by weighting the data such that the multivariate distribution over sex and age in the sample was exactly the same as in the general Dutch population in the age range of 55 to 85 years as reported by Statistics Netherlands for the year 2002. To obtain correct 95% confidence intervals and P values under weighting, all variance-related statistics were obtained with the help of the first-order Taylor-series linearization method as implemented in Stata SE version 7.0.37 Weighted numbers are reported, rounded to the nearest integer, throughout the remainder of this article. The (weighted) analyses were based on the 1925 people at risk of becoming depressed, ie, the group without depression at the baseline. The subsequent analyses were carried out in several steps.
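The idea behind such corrective weights can be sketched as a simple post-stratification. This is an illustration under assumptions, not the study's weighting procedure: the cell labels, sample counts, and population shares below are made-up values, and `poststratification_weights` is a hypothetical helper name.

```python
from typing import Dict, Tuple

Cell = Tuple[str, str]  # (sex, age band); labels are illustrative only


def poststratification_weights(
    sample_counts: Dict[Cell, int],
    population_shares: Dict[Cell, float],
) -> Dict[Cell, float]:
    """Weight per cell = population share / sample share of that cell."""
    n = sum(sample_counts.values())
    return {
        cell: population_shares[cell] / (count / n)
        for cell, count in sample_counts.items()
    }


# Made-up numbers: men and the older stratum are oversampled.
sample_counts = {
    ("male", "55-69"): 250,
    ("male", "70-85"): 300,
    ("female", "55-69"): 250,
    ("female", "70-85"): 200,
}
population_shares = {
    ("male", "55-69"): 0.30,
    ("male", "70-85"): 0.15,
    ("female", "55-69"): 0.32,
    ("female", "70-85"): 0.23,
}

weights = poststratification_weights(sample_counts, population_shares)
for cell, weight in weights.items():
    print(cell, round(weight, 2))
```

Oversampled cells receive weights below 1 and undersampled cells receive weights above 1, so the weighted sample reproduces the assumed population distribution over sex and age.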
The exposure rate (ER) of each risk indicator was calculated on the basis of the weighted data. The ER gives the percentage of the elderly population exposed to the risk indicator (Table 1).
For each risk indicator, we obtained the incidence rate ratio (IRR) by regressing the outcome (1 = incident case, 0 = not an incident case) on the risk indicator in a weighted Poisson regression model. The IRRs were based on person-time data to account for the small differences in follow-up time between t0 and t1 across the subjects. The effect of each of the risk indicators was evaluated while adjusting for all other variables in the risk set. The IRR describes how much larger the incidence rate is in the exposed group relative to the incidence rate in the unexposed group, controlling for competing risks. Incidence rate ratio values larger than 1 signify an increased risk level in the exposed group and values smaller than 1 indicate a risk reduction.
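To make the meaning of the IRR concrete, the sketch below computes a crude (unadjusted) rate ratio from person-time data. It is not the weighted, adjusted Poisson regression described above, and the case counts and person-years are hypothetical values chosen for illustration.

```python
def incidence_rate(cases: float, person_years: float) -> float:
    """Incident cases per person-year of follow-up."""
    return cases / person_years


def crude_irr(
    cases_exposed: float,
    person_years_exposed: float,
    cases_unexposed: float,
    person_years_unexposed: float,
) -> float:
    """Ratio of the incidence rate in the exposed to that in the unexposed."""
    rate_exposed = incidence_rate(cases_exposed, person_years_exposed)
    rate_unexposed = incidence_rate(cases_unexposed, person_years_unexposed)
    return rate_exposed / rate_unexposed


if __name__ == "__main__":
    # Hypothetical data: 60 cases in 1500 person-years among the exposed,
    # 40 cases in 4000 person-years among the unexposed.
    irr = crude_irr(60, 1500, 40, 4000)
    print(f"crude IRR = {irr:.2f}")
```

A value above 1, as in this made-up example, signifies a higher incidence rate among the exposed.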
A maximum-likelihood estimate of the population attributable fraction (AF) was obtained with the aflogit procedure in Stata for each of the risk indicators under a Poisson regression while adjusting for competing risks.38 When converted into a percentage, the AF denotes by how many percentage points the current incidence rate of depression in the population would be reduced if the adverse effect of the risk indicator is completely blocked.20,39 This equals the maximum possible impact of a completely successful preventive intervention. Because it cannot be realistically assumed that preventive interventions are completely successful in containing the adverse effect of a risk indicator, it follows that the AF statistic represents the upper limit of the potential health gain in the population.
Although it is possible to adjust the AF statistic for interventions that are not completely effective,21 it is also understood that we need not correct the AF statistic for the purpose of this article: a measure of relative performance is good enough for ranking risk indicators by their utility for prevention. We return to the interpretation of the AF later.
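As a rough illustration of what the AF expresses, the sketch below uses the textbook case-based formula AF = p_c × (IRR − 1) / IRR, where p_c is the proportion of incident cases that is exposed. This is an assumption made for illustration: it does not reproduce the maximum-likelihood estimate of the aflogit procedure, and the input values are hypothetical.

```python
def attributable_fraction(prop_cases_exposed: float, irr: float) -> float:
    """Case-based population attributable fraction.

    prop_cases_exposed: proportion of incident cases exposed to the risk indicator.
    irr: (adjusted) incidence rate ratio for that risk indicator.
    """
    if irr <= 0:
        raise ValueError("irr must be positive")
    return prop_cases_exposed * (irr - 1.0) / irr


if __name__ == "__main__":
    # Hypothetical inputs: 60% of cases exposed, rate ratio of 2.5.
    af = attributable_fraction(0.60, 2.5)
    print(f"AF = {af:.1%} of the incidence is attributable to the exposure")
```

With an IRR of 1 the fraction is zero, which matches the intuition that blocking a risk indicator without effect yields no health gain.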
Finally, the number needed to be treated (NNT) of each risk indicator was calculated as the inverse of the absolute risk difference. The latter was obtained by regressing the incidence on a risk indicator in a linear probability model while adjusting for all other competing risks in the model. The NNT denotes how many people should receive a preventive intervention to avoid 1 new case of late-life depression. Again we do not expect that preventive interventions are completely successful, so it is understood that the NNT represents the lower limit of the effort that is required to generate a health gain in the population.
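The NNT calculation itself is a one-line inversion, sketched below. The risk values are hypothetical, and the sketch takes the two risks as given rather than estimating the adjusted risk difference from a linear probability model as the study did.

```python
def number_needed_to_treat(risk_exposed: float, risk_unexposed: float) -> float:
    """Inverse of the absolute risk difference between exposed and unexposed."""
    risk_difference = risk_exposed - risk_unexposed
    if risk_difference <= 0:
        raise ValueError("exposed risk must exceed unexposed risk")
    return 1.0 / risk_difference


if __name__ == "__main__":
    # Hypothetical 3-year risks of onset: 30% if exposed, 5% if unexposed.
    nnt = number_needed_to_treat(0.30, 0.05)
    print(f"NNT is about {nnt:.1f}")
```

Because the calculation assumes that the intervention removes the entire excess risk, a partially effective intervention would need more recipients per avoided case.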
To summarize, we calculate the ER, the strength of association between risk indicator and outcome (IRR), the maximum achievable health gain (AF), and the minimum effort to generate that health gain (NNT). Together, these indices of impact and effort allow us to select high-risk groups for which depression prevention is likely to be associated with the highest health benefit in the population against the lowest cost.
This selection process was carried out as follows. First, we computed ER, IRR, AF, and NNT for all risk indicators simultaneously (Table 2). Then, using conventional backward-stepping procedures, we selected the smallest set of risk indicators in which each risk indicator has a unique and significant contribution to the prediction of depression (Table 3). From this list, the most promising risk indicator was then selected (with the highest IRR and AF and lowest ER and NNT). This was followed by consecutively selecting and adding risk indicators in such a way that the values for the potential health benefit (IRR and AF) were kept as high as possible and the values for effort and cost (ER and NNT) as low as possible. This process of maximizing and minimizing is depicted in the Figure.
Finally, when the economic costs of late-life depression are known, the cost figures can be combined with the AF and the NNT. This gives an indication of the dollar value of both the costs and savings of a future preventive intervention. The method of this ante hoc health-economic evaluation is straightforward but best illustrated with real data.
One hundred fifty-eight people (weighted N) became incident cases of depression in a sum total of 5643 weighted person years. This translates into 2.8 incident cases per 100 person years. This weighted estimate is very close to the unweighted estimate of 170 cases per 5861 person years, which is equal to 2.9 incident cases per 100 person years.
At t0, the (weighted) study cohort consisted of 2200 people. Of these, 1925 (87.5%) were at risk of becoming depressed, of whom 158 (8.2%) became incident cases after 3 years at t1. Table 1 describes the exposure status of both the group at risk and the incident group. As expected, the exposure rates are often significantly elevated in the incident group as compared with the group at risk (Table 1).
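The reported rates and percentages follow directly from the counts given in the text, as this short check shows. Only the helper name `per_100_person_years` is introduced here; all numbers are taken from the two paragraphs above.

```python
def per_100_person_years(cases: float, person_years: float) -> float:
    """Incidence rate expressed per 100 person-years."""
    return 100.0 * cases / person_years


weighted_rate = per_100_person_years(158, 5643)
unweighted_rate = per_100_person_years(170, 5861)
percent_at_risk = 100.0 * 1925 / 2200
percent_incident = 100.0 * 158 / 1925

print(f"weighted incidence:   {weighted_rate:.1f} per 100 person-years")
print(f"unweighted incidence: {unweighted_rate:.1f} per 100 person-years")
print(f"at risk at t0:        {percent_at_risk:.1f}%")
print(f"incident cases at t1: {percent_incident:.1f}%")
```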
Table 2 shows the ER, the IRR, the population AF, and the NNT for each of the risk indicators after adjusting for the effects of all other risks in the model. The factors female sex, having 2 or more chronic diseases, experiencing functional limitations, and having an above-average number of depressive symptoms are associated with significant IRR, AF, and NNT values (Table 2).
In a next step, we obtained a more parsimonious multivariate model with fewer risk indicators. This model is based on the smallest subset of statistically significant risk indicators (at α<.05) and was obtained using the backward-stepping selection method in the respective regression equations. The rationale of this approach is that risk indicators are competing with each other and we need to select only the most competitive risk indicators. The results are presented in Table 3.
In the parsimonious model, a smaller number of risk indicators was retained (Table 3). These were female sex, low education, having 2 or more chronic diseases, experiencing functional limitations, having an above-average number of depressive symptoms, and having a small social network.
With this risk profile, 82.8% of the future cases of clinically relevant depressive disorder can be predicted. In the complete model with all risk indicators (Table 2), the percentage was 86.2%. The implication is that the parsimonious risk profile is nearly as good for predictive purposes as the one that contained all available risk indicators.
In a next step, we assessed the potential health benefits when prevention efforts target people who are exposed to combinations of risk indicators (Table 4). We took depressive symptoms as a starting point because this risk indicator is associated with the highest IRR and AF values and has the lowest NNT. From Table 4, it is clear that when these subsyndromatically depressed people also experience impairment, then the IRR and AF values rise and ER and NNT values drop even further (row 5 in Table 4). In other words, the joint exposure to both depressive symptoms and functional limitations is associated with better values overall.
The indices of impact and effort can be optimized further when these people also have a smaller than average social network. This group represents 11.7% of the population and has a risk of becoming depressed that is higher by a factor of 4.5; if the adverse effect of the exposure to all 3 risk indicators could be blocked completely, then the incidence rate of depression in the population would drop by 32.3%.
Preventive interventions are likely to become even more cost-effective when they target people who are in addition suffering from chronic diseases (row 15), have attained only low education levels (row 14), or are women (row 13). It is of note that the latter risk profiles correspond to relatively small groups (6.5%-8.0% of the population). Smaller target groups will reduce the effort that has to be put into the logistics of the future interventions. The interventions are also likely to be more efficient, judging by the low NNT value of only 4.
This minimization/maximization process is depicted graphically in the Figure. It shows how consecutively adding a risk indicator to the risk profile impacts the AF, ER, and NNT. In general, one would like to have the AF curve as high as possible (to maximize the health benefit in the population) while, at the same time, it is better to have the ER and NNT curves as low as possible (to target smaller population segments with greater efficiency). Thus, in the first step, the risk indicator “presence of depressive symptoms” yielded an AF of 40.3%, indicating that the incidence rate of depression in the population would be reduced by as much as 40.3% when the occurrence of full-blown depressive disorder is prevented in all people with depressive symptoms. The ER indicates that this is a formidable task because 40.3% of the population should then receive the intervention. Furthermore, the efficiency of such an approach is not optimal because, as the NNT indicates, 16 people of this target group must receive the intervention to avoid 1 new case of depressive disorder. This can be improved. In a next step, there are several risk indicators to choose from, but “functional limitations” keeps the level of AF still high (now 37.5%) while a smaller segment of 20.8% of the population has to be targeted and the efficiency is better (NNT = 7). This process of adding new risk indicators continues until a target group is selected that is associated with the best IRR, AF, ER, and NNT values.
Consider the risk profile in row 13 of Table 4: the risk of becoming depressed is a factor 4.6 higher in women who have some depressive symptoms, experience functional limitations, and have a small social network. If the adverse effect of the joint exposure could be completely blocked, then the incidence rate in the population would be reduced by 24%. The current incidence rate of 2.8 cases per 100 person-years would then become 2.8 × (1−0.24) = 2.1 cases per 100 person-years. In every 1 million older adults, this would represent roughly 5950 prevented cases per year. Of course, this number represents an upper limit of the achievable health gain because preventive interventions are unlikely to be totally effective. Assuming that an intervention is 30% successful in avoiding new onsets,18 then 1785 onsets will be avoided. This still represents a substantial health gain in a source population of 1 million.
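The arithmetic of this example can be retraced as follows. The sketch takes the figure of 5950 prevented cases per million per year from the text as given rather than deriving it, and the variable names are illustrative.

```python
incidence_per_100_py = 2.8          # current incidence rate, from the text
attributable_fraction = 0.24        # AF of the row 13 risk profile, from the text
max_prevented_per_million = 5950    # upper limit per year, taken from the text as given
assumed_effectiveness = 0.30        # share of onsets avoided by an intervention (reference 18)

# Incidence rate if the joint exposure were completely blocked.
reduced_incidence = incidence_per_100_py * (1 - attributable_fraction)

# Onsets avoided per year in 1 million older adults under partial effectiveness.
avoided_onsets = round(max_prevented_per_million * assumed_effectiveness)

print(f"{reduced_incidence:.1f} cases per 100 person-years")
print(f"{avoided_onsets} avoided onsets per million older adults per year")
```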
Avoiding new onsets has economic ramifications. Katon and colleagues40 computed the excess costs of minor and major depression as at least $1045 per person per half year. They called this a conservative estimate. Preventing 1785 onsets would thus result in a cost saving of at least $1.9 million for every million elderly people in the population.
It is also clear that the costs of the intervention will be balanced by its savings when the costs of the intervention do not exceed $1045 per avoided case. In the same vein, the NNT value can be used to calculate the costs per recipient of the intervention where the break-even point is reached with the costs and benefits of an intervention. This type of ante hoc cost-benefit calculation can be used to select interesting preventive interventions for further cost-effectiveness research.
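A minimal sketch of this ante hoc calculation is given below. The savings figure follows from the numbers in the text. The break-even cost per recipient is one possible reading of how the NNT enters the calculation, namely the cost per avoided case divided by the NNT, and it uses the NNT of 4 reported for the combined risk profiles; both choices are assumptions made for illustration.

```python
avoided_onsets = 1785          # per million older adults, from the text
excess_cost_per_case = 1045    # US dollars per person per half year (Katon and colleagues)
nnt = 4                        # NNT reported for the combined risk profiles

# Savings from the avoided onsets, valued at the excess costs of one half year.
total_savings = avoided_onsets * excess_cost_per_case

# Assumed reading: an intervention breaks even when treating NNT people
# costs no more than the excess costs of the one case that is avoided.
break_even_cost_per_recipient = excess_cost_per_case / nnt

print(f"savings: ${total_savings:,} per million older adults")
print(f"break-even cost: ${break_even_cost_per_recipient:.2f} per recipient")
```

Because the NNT is a lower limit that assumes a completely successful intervention, the realistic break-even cost per recipient would be lower than this figure.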
We aimed to answer the question as to whether it would be possible to target depression prevention where it generates the best health benefits against the lowest cost. This would also guide research efforts toward more promising research areas in preventive psychiatry.
This study showed that it is possible to use longitudinal epidemiological data to select “cost-effective” risk indicators. These are risk indicators that are associated with a substantial population AF and a low NNT. When the costs of the disorder are known from a cost-of-illness study, then it is also possible to combine the AF and NNT with the known costs into an ante-hoc cost-benefit analysis.
In short, we have the methodology at our disposal that could help identify cost-effective preventive interventions at a very early stage of the costly and time-consuming cycle of development and evaluation of preventive interventions. Having said this, we need to add that ultimately the cost-effectiveness of preventive interventions has to be established in proper cost-effectiveness studies.
The methodology of identifying interesting risk indicators for prevention is not new,20,21 but in the field of psychiatric epidemiology and prevention research, it has rarely been applied. The reason for this omission is that this methodology requires unselected population-based longitudinal data on the incidence of the disorder and its putative risk indicators. These data are often not available, but once there, they could be used to set a rational research and development agenda for preventive psychiatry. We applied this to late-life depression and came up with the following key findings.
First, the incidence of clinically relevant late-life depression is 2.8 new cases per 100 person-years. This must be seen as a conservative estimate because rigorous criteria were used to avoid false-positive onsets. Therefore, the true incidence rate is likely to be higher.
Second, starting from a list of putative risk indicators, only a few were identified as interesting from the prevention perspective when the effects of the risk indicators are adjusted for competing risks. These are female sex; low education; having 2 or more chronic illnesses; experiencing functional limitations; having above-average symptom levels of depression not exceeding the threshold for clinically relevant depression, ie, CES-D scores above 5 and below 16; and, finally, having a small social network.
Third, the combined effect of being exposed to 3 or 4 selected risk indicators yields statistically significant and substantively interesting values on measures of potential health benefit (IRR, AF) and effort (ER, NNT). It is also worth noting that the joint exposure to more risk indicators is limited to a small fraction of the older population. The intervention thus has a narrow focus and the corresponding number of people is manageable from the prevention perspective at a regional level. Even based on conservative assumptions, a preventive intervention could prevent 1785 new onsets in every 1 million older adults.
Fourth, avoiding new onsets also has economic consequences. A study carried out in the United States shows that elderly people with minor or major depression generate on average $1045 in excess costs per person per half year. Again, this is a conservative estimate. Avoiding the aforementioned 1785 onsets would thus create substantial savings. These savings are unlikely to completely offset the costs of a preventive intervention, but they form a good vantage point for cost-effective prevention of late-life depression.
The findings have to be placed in the context of the strengths and limitations of this study. The strengths are the use of population-based data; the prospective design, which enables the study of incidence and facilitates etiological inference; and the measurement of exposures that is not biased due to post hoc rationalization on the part of the respondents, because at t0 they could not have any knowledge about their future health state at t1. Furthermore, this study is among the first to show how a statistical technique could be applied to quantify potential health benefits and the effort required to generate these health benefits in the field of preventive psychiatry. It thus supplies the sort of methodology that is important for setting a rational research and development agenda for depression prevention.
This study also has several limitations. The exposures were not measured in great detail: we do not know for how long and how intensely subjects were exposed. Moreover, the number of studied risk indicators is limited in that, for example, genetic and other biological risk indicators were not included. Another limitation is the measurement of depression with the CES-D, which is not a diagnostic instrument, although it has good psychometric properties. People who are exposed to several risk factors may well form a population segment that is not very responsive to health-oriented interventions, and this may affect the health gain that can be delivered by prevention. This important issue needs more research. Finally, it is not safe to generalize the economic findings of this study to countries other than the United States because excess costs are likely to differ from one country to another.
Conceptually, it would be useful to distinguish risk indicators that are amenable to change, such as depressive symptoms, from those that are not. It is also worth noting that some risk indicators are not modifiable, such as chronic illnesses, but their adverse psychological effects might be contained. Finally, there are risk indicators, such as female sex, that are not modifiable but are valuable from the perspective of identifying groups at risk, which was the principal aim of this article.
In future research, thought should be given to the quantification of success rates of preventive interventions; the quantification of impacts on the incidence not only of major depressive disorder, but of the whole spectrum of anxiety and mood disorders; and the accommodation of cost-benefit considerations in this sort of analysis. It would also be valuable to cross-validate the methodology presented in this article by comparing it with alternative methodologies, such as Classification and Regression Tree analysis.41 We are currently conducting such a study. It is recommended that in the future, the methodology presented in this article or related methodologies17,42 should be further developed and employed to direct both research and prevention where they are likely to be most cost-effective.
Correspondence: Filip Smit, MSc, Trimbos Institute, Netherlands Institute of Mental Health and Addiction, PO Box 725, 3500 AS, Utrecht, The Netherlands (firstname.lastname@example.org).
Submitted for Publication: March 11, 2004; final revision received August 6, 2004; accepted April 27, 2005.
Funding/Support: This study is financially supported by grant 5753 from the Netherlands Fund of Mental Health, Utrecht. It is based on the data that were collected in the context of the Longitudinal Aging Study Amsterdam, which is financed by the Netherlands Ministry of Welfare, Health, and Sports, the Hague.
Acknowledgment: We acknowledge, with many thanks, Jan Poppelaars, MSc, and Mariette Westendorp, MSc, for their contributions to data collection.