Receiver operating characteristic curves comparing self-rated, spouse-rated, and combined spouse-rated and self-rated health as predictors of mortality. For this analysis, self-rated and spouse-rated health were recoded as 1 (excellent), 2 (very good), 3 (good), 4 (fair), or 5 (poor). χ²₁ = 0.36, P = .54 for self-rated health vs spouse-rated health. χ²₁ = 6.72, P = .009 for self-rated health vs combined spouse-rated and self-rated health. χ²₁ = 2.91, P = .08 for spouse-rated health vs combined spouse-rated and self-rated health.
Ayalon L, Covinsky KE. Spouse-Rated vs Self-rated Health as Predictors of Mortality. Arch Intern Med. 2009;169(22):2156-2161. doi:10.1001/archinternmed.2009.386
The Health and Retirement Study is a national sample of Americans older than 50 years and their spouses. The present study evaluated cross-sectional and longitudinal data from January 2000 through December 2006. The objective of the study was to evaluate the roles of spouse-rated vs self-rated health as predictors of all-cause mortality among adults older than 50 years.
A total of 673 dyads of married couples were randomly selected to participate in a Health and Retirement Study module examining spouse-rated health. For each couple, one member was asked to rate his or her overall health status, and his or her spouse was asked to report the partner's overall health status. Mortality data were available through 2006.
Our findings demonstrate that spouse-rated health (area under the curve, 0.75) is as strong a predictor of mortality as self-rated health (area under the curve, 0.73) (χ²₁ = 0.36, P = .54). Combining spouse-rated and self-rated health predicts mortality better than using self-rated health alone (area under the curve, 0.77) (χ²₁ = 6.72, P = .009).
Spouse ratings of health are at least as strongly predictive of mortality as self-rated health. This suggests that, when self-rated health is elicited as a prognostic indicator, spouse ratings can be used when self-ratings are unavailable. Both measures together may be more informative than either measure alone.
There is ample research demonstrating that self-rated health is an excellent predictor of mortality that often outperforms objective indicators of health and mental health status.1 Even after controlling for age, sex, and other demographic variables, the predictive power of self-rated health remains stable over time.1,2 To account for the strong predictive value of self-rated health, it has been hypothesized that self-rated health either measures health status data that cannot be captured by existing technology (such as respondents' inner biologic and physiologic health processes) or serves as a self-fulfilling prophecy.1,3 Other investigators have suggested that maintaining positive health habits leads one to perceive his or her health more positively than expected based on objective measures alone or that optimism, as manifested in more positive self-rated health, is beneficial in itself.2 Finally, self-rated health has also been proposed as a proxy of emotional status (such as depression or other emotional problems), which is known to account for mortality.2
To date, no research has been performed on the usefulness of spouse-rated health as a predictor of all-cause mortality. Nevertheless, it is possible that the way individuals perceive the health of their spouse is of prognostic value. One study4 evaluated the usefulness of proxy report against mortality data. This study found that, after adjustment for health and sociodemographic data, spouse-rated limitations and life expectancy were predictive of husbands' mortality but not of wives' mortality, arguing that wives are more astute judges of their husbands' mortality risk than the other way around.
Most research concerning proxy reports has instead focused on concordance between proxy and respondent with regard to reports of quality of life5-11 and activities of daily living.12-14 These studies have shown that the degree of concordance between proxy and respondent is moderate at best, with proxies tending to view respondents' quality of life as lower and functional impairment as greater than respondents do.5,7,15 Few of these studies also evaluated proxy-respondent concordance concerning medical status, with most studies12,14,15 (but not a study by Lerner-Geva et al16) reporting reasonable agreement.
The present study evaluated spouse vs respondent reports of health status as predictors of mortality among adults older than 50 years. Given past research on the degree of concordance between respondent and proxy, we expected to find moderate agreement with regard to respondents' health status. We further expected spouse-rated health to capture not only self-rated health but also other sociodemographic and clinical characteristics of both respondent and spouse. Finally, we expected both spouse-rated health and self-rated health to serve as independent risk factors for respondents' mortality.
The Health and Retirement Study (HRS) is a nationally representative sample of individuals older than 50 years living in the United States.17 The HRS is sponsored by the National Institute on Aging, Bethesda, Maryland, and is conducted by the University of Michigan, Ann Arbor. The study was reviewed and approved by the University of Michigan's Health Sciences Institutional Review Board. Participants take part in a biennial interview that covers a range of topics, including income, wealth, work, retirement, health, health care use, and other factors. Most interviews are conducted over the telephone.
The present study evaluated cross-sectional and longitudinal data from January 2000 through December 2006. Baseline data were collected in 2000. Overall, 19 580 individuals responded to the 2000 HRS questionnaire. In addition to the core interview, each wave of the HRS includes additional modules on selected topics that are administered to randomly selected participants. Randomization is computerized and is conducted by the University of Michigan. The potential analytic sample for this study was 747 married participants who were randomly selected to participate in a module designed to compare respondent and spouse responses to questions about respondents' health status. Of 747 proxies, 66 were excluded because the spouse was expected to be unable to provide data on his or her own health status based on low cognitive performance demonstrated in past waves or known health problems, and 8 were excluded because the spouse was unable to provide data on his or her own health status at the current wave. Therefore, baseline data were available on 673 respondent-spouse dyads. These pairs are representative of all coupled respondents in the 2000 HRS sample. Follow-up mortality data were available through 2006.
Mortality status (alive or dead) was available through 2006 based on HRS tracking efforts. Respondents were classified by HRS into 1 of the following 5 categories: (1) alive in 2006, (2) presumed alive as of 2006, (3) death reported in 2006, (4) death reported in past waves, or (5) vital status unknown. In the present study, the first 2 categories were collapsed as alive, the next 2 categories were collapsed as dead, and the fifth category was classified as missing (1.1% of the sample).
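The collapsing of the 5 HRS vital status categories into a binary outcome can be sketched as follows. This is only an illustration: the numeric codes 1 through 5 follow the order of the list above and are an assumption, as is the function name; the actual HRS tracker variables may be coded differently.

```python
from typing import Optional

# Category numbers follow the order given in the text (assumed coding,
# not the actual HRS tracker variable values).
ALIVE_CODES = {1, 2}   # alive in 2006; presumed alive as of 2006
DEAD_CODES = {3, 4}    # death reported in 2006; death reported in past waves
UNKNOWN_CODE = 5       # vital status unknown


def recode_vital_status(code: int) -> Optional[int]:
    """Return 0 for alive, 1 for dead, and None for missing vital status."""
    if code in ALIVE_CODES:
        return 0
    if code in DEAD_CODES:
        return 1
    if code == UNKNOWN_CODE:
        return None
    raise ValueError(f"unexpected vital status code: {code}")
```

Respondents recoded to None would be the ones treated as missing in the mortality analyses.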
Proxies were asked to rate the health status of their spouse. The following 5-point scale was used: 5 (excellent), 4 (very good), 3 (good), 2 (fair), and 1 (poor).
All participants were asked to rate their own health status. The same 5-point scale was used.
The number of chronic medical conditions among 6 common conditions (eg, cancer, diabetes mellitus, and stroke) was gathered based on self report. We dichotomized subjects as those with 0 to 1 medical conditions vs multiple conditions.
Participants were asked to indicate the presence or absence of impairment in 9 activities (eg, difficulties in stooping and difficulties in sitting). Participants were then asked whether they provide assistance with activities of daily living or instrumental activities of daily living to their spouse.
The Center for Epidemiologic Studies Depression Scale is a common measure of depressive symptoms that has been used in various population-based studies.18 The HRS uses 8 items from the scale. A cutoff score of at least 3 was used to represent depression.19 However, this is only a measure of depressive symptoms and not of clinical depression.
We used the HRS Cognitive Scale, a test of overall cognitive functioning that includes subtests of immediate and delayed word recall, subtraction, and backward count. The subtests were modeled after the Mini-Mental State Examination, a standard geriatric dementia screen.20,21 Because scores are highly correlated with each other, we used a composite score ranging from 0 to 26, with 26 representing perfect performance.
Participants were asked whether they participate in any vigorous physical activity, smoke cigarettes, or drink alcohol. Answers were recorded as yes or no.
Sociodemographic data were based on self-report. Data recorded included sex, age (<65, 65-74, or ≥75 years), education status (0-12 or ≥13 years), and race/ethnicity (white, black, Latino, or other).
We performed χ² analyses to compare spouses and respondents on various categorical demographic and clinical characteristics. t Tests for dependent samples were performed to compare the 2 parties on continuous variables. We also used the Wilcoxon signed rank test to compare spouses' and respondents' reports of health status.22 We then evaluated the degree of concordance between spouse-rated health (of the respondent) and self-rated health using χ² analysis and the weighted κ statistic23 via a linear set of weights (eg, 1.0, 0.75, 0.50, 0.25, and 0). The κ statistic examines the degree of agreement beyond what would occur by chance. A κ statistic of 0 indicates that the level of agreement is no more than would be expected by chance alone, while a κ statistic of 1.0 indicates perfect agreement. Next, we identified the unique association of the various covariates with spouse-rated health over and above their associations with self-rated health. We first ensured that all variables complied with the proportional odds assumption and then performed a series of proportional odds regression analyses with spouse-rated health as an outcome variable, each sociodemographic and clinical correlate as a potential predictor, and self-rated health as a control variable. We then performed logistic regression analyses comparing spouse-rated health with self-rated health as potential predictors of mortality. We repeated the same logistic regression analyses with adjustment for various demographic and clinical characteristics (eg, respondents' age, sex, education status, medical status, functional impairment, depression, cognitive status, physical activity, smoking, alcohol drinking, and cognitive functioning, as well as spouses' depression, cognitive status, and caregiving status).
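The linearly weighted κ statistic described above can be illustrated with a short sketch. It assumes ratings coded as integers 1 through 5 and uses the weights 1 − |i − j|/(k − 1), which for 5 categories gives 1.0, 0.75, 0.50, 0.25, and 0. The function name is hypothetical, and this is not the software used in the study.

```python
def linear_weighted_kappa(ratings_a, ratings_b, n_categories=5):
    """Linearly weighted kappa for two raters using categories 1..n_categories."""
    k = n_categories
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("rating lists must be non-empty and of equal length")

    # Observed joint proportions of (rater A category, rater B category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a - 1][b - 1] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Weighted observed and chance-expected agreement.
    p_obs = 0.0
    p_exp = 0.0
    for i in range(k):
        for j in range(k):
            weight = 1.0 - abs(i - j) / (k - 1)
            p_obs += weight * observed[i][j]
            p_exp += weight * row[i] * col[j]

    # Undefined (division by zero) if both raters use a single category.
    return (p_obs - p_exp) / (1.0 - p_exp)
```

With identical ratings the function returns 1.0, and when observed weighted agreement equals the chance-expected value it returns 0, matching the interpretation given in the text.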
Finally, we calculated receiver operating characteristic curves24 to compare the predictive ability of spouse-rated health against self-rated health, as well as their combined predictive ability (ie, a sum of both health ratings).25,26 An area under the curve of 1.0 represents perfect predictive ability, whereas an area under the curve of 0.5 represents worthless predictive ability. The statistical comparison between the 2 curves was performed using commercially available software (roccomp command in STATA; StataCorp LP, College Station, Texas) that relies on logistic models for estimating the curves.26
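As an illustration of the area under the curve and of the combined rating, the sketch below computes the area as the proportion of decedent-survivor pairs in which the decedent has the worse (higher) rating, counting ties as one-half, and forms the combined predictor as the sum of the 2 ratings. The function names are hypothetical and ratings are assumed to be coded 1 (excellent) to 5 (poor). This sketch does not reproduce the roccomp χ² comparison of curves.

```python
def auc(scores, died):
    """Area under the ROC curve via pairwise comparison.

    scores: health ratings, higher meaning worse health (1 = excellent, 5 = poor).
    died:   1 if the respondent died during follow-up, 0 otherwise.
    """
    dead = [s for s, d in zip(scores, died) if d == 1]
    alive = [s for s, d in zip(scores, died) if d == 0]
    if not dead or not alive:
        raise ValueError("need at least one death and one survivor")

    concordant = 0.0
    for s_dead in dead:
        for s_alive in alive:
            if s_dead > s_alive:
                concordant += 1.0   # decedent rated in worse health
            elif s_dead == s_alive:
                concordant += 0.5   # tie counts as half
    return concordant / (len(dead) * len(alive))


def combined_score(self_rated, spouse_rated):
    """Sum of self-rated and spouse-rated health for each respondent."""
    return [a + b for a, b in zip(self_rated, spouse_rated)]
```

Under this definition, a value of 0.5 means the rating orders decedent-survivor pairs no better than chance, and 1.0 means every decedent was rated in worse health than every survivor.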
Overall, 673 dyads participated in the study. Most spouses (54.0%) were female. Most spouses (87.7%) and respondents (86.4%) were white (Table 1). Although differences between respondents and spouses were statistically significant for some characteristics, the magnitude of the difference was small in most cases.
As summarized in Table 2, the degree of concordance between spouse-rated health and self-rated health was moderate (weighted κ = 0.48, P < .001). This suggests that the level of agreement between respondents and spouses was at about the midpoint between no better than chance and perfect. The Wilcoxon signed rank test showed that spouses tended to rate respondents' health slightly worse than respondents did (mean [SD] score, 3.22 [0.05] by spouses and 3.37 [0.05] by respondents; z = 4.3, P < .001).
Using P < .01 as the level of statistical significance to account for the large sample size and the multiple comparisons,27 all variables met the proportional odds assumption. In a series of proportional odds regression analyses, we found several respondent-level and spouse-level variables to be associated with spouse-rated health even after controlling for self-rated health (Table 3). Respondents of younger age, higher education, fewer medical conditions, higher cognitive functioning, and less functional impairment were more likely to be rated by their spouse as enjoying better health, as were those who engaged in physical activity. In addition, nondepressed spouses and spouses with better cognitive functioning were more likely to rate their partner's health as better independent of their partner's self-rated health. Hence, these various demographic and clinical characteristics of both respondent and spouse contribute to spousal perception of respondents' health even after respondents' self-rated health is considered.
Overall, 94 of 673 respondents (12.3%) died during the 6-year period. As summarized in Table 4, better self-rated health is associated with lower mortality risk (F₄,₄₇ = 12.5, P < .001) based on logistic regression analysis. Similar results were obtained for spouse-rated health (F₄,₄₇ = 18.0, P < .001). Both spouse-rated and self-rated health remained significant predictors of mortality even after adjusting for various demographic and clinical characteristics (eg, respondents' age, sex, education status, medical status, functional impairment, depression, cognitive status, health behaviors, and cognitive functioning, as well as spouses' depression, cognitive status, and caregiving status).
Areas under the curve indicated moderate predictive ability for both spouse-rated and self-rated health. Spouse-rated health (area under the curve, 0.75) was not significantly better than self-rated health (area under the curve, 0.73) (χ²₁ = 0.36, P = .54). Combining spouse-rated and self-rated health provided the best predictive value of mortality (area under the curve, 0.77), which was significantly better than self-rated health alone (χ²₁ = 6.72, P = .009) but not significantly better than spouse-rated health alone (χ²₁ = 2.91, P = .08) (Figure).
Several studies1,2,28-30 to date have evaluated the role of self-rated health as a predictor of all-cause mortality, and its predictive ability has been demonstrated in various epidemiologic studies. To our knowledge, this is the first study to show that among older adults spouse-rated health is as predictive of mortality as self-rated health, despite the fact that the 2 are not synonymous. Our findings indicate only moderate levels of concordance between spouse-rated health and self-rated health. Furthermore, the study demonstrates that spouse-rated health is correlated with other respondent-level and spouse-level variables in addition to self-rated health; spouse-rated health captures respondents' medical status, demographic characteristics, cognitive status, health behaviors, and mental status. As demonstrated in past research,31 spouse-rated health also is correlated with proxies' cognitive status and mental status, so that more depressed or cognitively impaired proxies tend to report their spouse's health as poorer even after controlling for self-rated health.
Similar to the case of self-rated health, it remains unclear what accounts for the strong predictive value of spouse-rated health. Are spouses more attuned to certain biologic and physiologic processes in their partner that remain otherwise unnoticed? Is it the mental condition of the spouse that has such an important effect on respondent mortality prospects? Or is it the relationship between the 2 partners and the expectations that spouses hold regarding their partner that affect their partner's mortality prospects? At this point, the exact mechanism behind the predictive ability of spouse-rated health remains unclear. Nevertheless, our study shows that combining spouse-rated health with self-rated health provides a better prognostic indicator of all-cause mortality than self-rated health alone.
The present study has several limitations that should be acknowledged. First, the study was limited to married couples. Research has shown that the quality of the relationship between proxy and respondent has a major role in determining the degree of concordance between respondent-rated and proxy-rated variables.8 Hence, the present results may not be generalizable to other types of proxies. Furthermore, the HRS is a representative sample of individuals older than 50 years; hence, results may not be representative of individuals 50 years or younger. Second, the study was limited to cognitively intact proxies and respondents. Therefore, results cannot be generalized to cognitively impaired participants. Third, no blinding measures were in place, and it is possible that spouses were present during their partner's interview. Although we cannot be sure how this might have affected the results, we think it is unlikely that it had a major effect. The administration of the HRS interview takes a fairly long time, and many of the questions have similar response choices. Although a spouse may have heard his or her partner providing answers, it is unlikely that he or she would have known which specific question was being answered. Perhaps most important, our results emphasize that spouse ratings and self-ratings of health are often discordant. As a result, any bias resulting from awareness of partner response would bias our results toward the null, suggesting that our results are even more conservative. Fourth, the study did not evaluate the predictive ability of spouse-rated health for purposes other than mortality. Fifth, we acknowledge that there are no standards for how to recognize the clinical significance of differences in receiver operating characteristic curves. However, many investigators would view the difference of 0.04 in area under the receiver operating characteristic curve (self-rated health compared with combined spouse-rated and self-rated health) as clinically meaningful.
Consider a situation in which you have pairs of patients, one of whom survived longer than the other. Using self-rated health only, you would correctly identify the longer surviving patient 73% of the time. Using the combined spouse-rated and self-rated health model, you would correctly identify the longer surviving patient 77% of the time. We believe that this is meaningful and that most would choose to incorporate both spouse-rated and self-rated health based on this information.
Nevertheless, this is the first study to date to evaluate the role of spouse-rated health as a predictor of mortality. Our findings demonstrate that spouse-rated health is at least as strong a predictor of mortality as self-rated health, although the 2 measure different entities: spouse-rated health not only correlates with self-rated health but also captures the sociodemographic and medical status of respondents, as well as spouses' own cognitive and mental status. Spouse-rated health can be used as a predictor of mortality when self-rated health is unavailable or as an additional source of data that complements self-rated health. Health care practitioners working with older adults should attempt to obtain not only patients' self-report of their health status but also their spouses' ratings of the patients' health whenever available, as the combination of spouse-rated and self-rated health provides a more accurate estimate of respondents' mortality risk.
Correspondence: Liat Ayalon, PhD, The Louis and Gabi Weisfeld School of Social Work, Bar-Ilan University, Ramat Gan, Israel 52900 (firstname.lastname@example.org).
Accepted for Publication: August 18, 2009.
Author Contributions: Study concept and design: Ayalon and Covinsky. Analysis and interpretation of data: Ayalon and Covinsky. Drafting of the manuscript: Ayalon. Critical revision of the manuscript for important intellectual content: Ayalon and Covinsky. Statistical analysis: Ayalon and Covinsky. Obtained funding: Covinsky. Administrative, technical, or material support: Covinsky.
Financial Disclosure: None reported.
Funding/Support: This study was supported by grants 5K24 AG029812 and 5R01 AG023626 from the National Institute on Aging (Dr Covinsky).