Figure 1. Description of study sample.
Figure 2. Survey questions assessing stroke. Stroke on self-report was defined as an affirmative answer to 1 of these questions.
Reitz C, Schupf N, Luchsinger JA, Brickman AM, Manly JJ, Andrews H, Tang MX, DeCarli C, Brown TR, Mayeux R. Validity of Self-reported Stroke in Elderly African Americans, Caribbean Hispanics, and Whites. Arch Neurol. 2009;66(7):834-840. doi:10.1001/archneurol.2009.83
The validity of a self-reported stroke remains inconclusive.
To validate the diagnosis of self-reported stroke using stroke identified by magnetic resonance imaging (MRI) as the standard.
Design, Setting, and Participants
Community-based cohort study of nondemented, ethnically diverse elderly persons in northern Manhattan.
High-resolution quantitative MRIs were acquired for 717 participants without dementia. Sensitivity and specificity of stroke by self-report were examined using cross-sectional analyses and the χ2 test. Putative relationships between factors potentially influencing the reporting of stroke, including memory performance, cognitive function, and vascular risk factors, were assessed using logistic regression models. Subsequently, all analyses were repeated, stratified by age, sex, ethnic group, and level of education.
In analyses of the whole sample, sensitivity of stroke self-report for a diagnosis of stroke on MRI was 32.4%, and specificity was 78.9%. In analyses stratified by median age (80.1 years), the validity between reported stroke and detection of stroke on MRI was significantly better in the younger than the older age group (for all vascular territories: sensitivity and specificity, 36.7% and 81.3% vs 27.6% and 26.2%; P = .02). Impaired memory, cognitive skills, or language ability and the presence of hypertension or myocardial infarction were associated with higher rates of false-negative results.
Using brain MRI as the standard, specificity and sensitivity of stroke self-report are low. Accuracy of self-report is influenced by age, presence of vascular disease, and cognitive function. In stroke research, sensitive neuroimaging techniques rather than stroke self-report should be used to determine stroke history. Published online May 11, 2009 (doi:10.1001/archneurol.2009.83).
Self-administered questionnaires are frequently used to obtain information about a history of stroke, but the validity of self-reported stroke remains inconclusive. In general, self-reports on medical conditions that are well defined and relatively easy to diagnose often have a high positive predictive value, in contrast to conditions characterized by complex symptoms.1 Stroke is associated with motor impairment but can also be accompanied by impairments in memory, sensation, and speech or language, diminishing the ability of an individual to accurately report a history of stroke. Although the importance of being aware of these difficulties, particularly in studies among the elderly, has been emphasized,2 most previous studies assessing validity of stroke self-report either used neurological examination or medical record review as the standard or performed brain imaging only for persons who reported having had a stroke.3-9 This likely produces underreporting of patients with ambiguous symptoms or silent strokes and, consequently, increases the rate of false-negative results and diminishes sensitivity estimates.
The Washington/Hamilton Heights–Inwood Columbia Aging Project (WHICAP) is an ongoing, community-based study of aging and dementia that comprises elderly participants from an urban community. A unique aspect of the cohort is its multiethnic composition; white, Caribbean Hispanic, and African American participants are included in the sample, which allows for the examination of diverse cultural, educational, medical, and genetic factors as possible modifiers in aging diseases. We previously observed a higher prevalence and incidence of cerebrovascular disease and white matter hyperintensities (WMHs), as well as a larger relative brain volume in African Americans and Hispanics than in whites.10,11 These observations strongly suggest that it is important to take ethnic differences in vascular disease and brain form and structure into account when assessing the validity of self-reported vascular events.
The objective of the present study was to examine the validity of the self-reported history of stroke across ethnic groups in the large, multiethnic WHICAP cohort by calculating sensitivity and specificity of self-reported stroke using magnetic resonance imaging (MRI) as the standard. We also explored whether the validity of stroke self-report differed by age and whether it was influenced by cognitive function, educational level, or specific concomitant diseases.
Participants were part of the original cohort for a prospective study of aging and dementia among Medicare recipients 65 years and older and residing in northern Manhattan.11 These participants were recruited at 2 time points (1992-1994 and 1999-2000) and followed up at regular 18-month intervals. The sampling strategies and recruitment outcomes have been described in detail.10 Recruitment, informed consent, and study procedures were approved by the institutional review boards of Columbia Presbyterian Medical Center and Columbia University Health Sciences and the New York State Psychiatric Institute.
The WHICAP MRI imaging project was concurrent with the second follow-up visit of the cohort recruited in 1999 and the sixth follow-up visit of the cohort recruited in 1992. Participants were deemed eligible for MRI if they did not meet criteria for dementia at their last research assessment (Figure 1). At the conclusion of the first follow-up period, 2113 participants were considered for MRI eligibility; 2053 of these individuals (97.2%) had been seen at the first follow-up visit, and, for 60 of these participants (2.8%), their most recent visit was at baseline (ie, they were not seen during the first follow-up period). Dementia was diagnosed in 272 of these 2113 participants (12.9%). Of the remaining 1841 participants, 769 (41.8%) received MRI scans. Of the 1072 participants who did not receive MRI scans, 407 (38.0%) refused to participate, 166 (15.4%) died before they were able to be scheduled for imaging, 191 (17.8%) were unavailable for follow-up, 283 (26.3%) had MRI contraindications, and 25 (2.3%) were unable to be scheduled. Compared with persons who received MRI scans, those who refused to participate in the MRI study but otherwise met inclusion criteria were a year older, more likely to be women, and less likely to be African American. There were no differences in educational level between the 2 groups.
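The participant flow described above can be checked with simple arithmetic. The following is a minimal sketch that uses only the counts reported in this paragraph; the variable and function names are illustrative and are not part of the study's analysis.

```python
# Participant flow for MRI eligibility, using counts reported in the text.
considered = 2113                  # participants considered for MRI eligibility
dementia = 272                     # diagnosed with dementia, hence ineligible
eligible = considered - dementia   # participants without dementia

scanned = 769                      # received MRI scans
not_scanned = eligible - scanned   # eligible but not scanned

# Reasons for not receiving an MRI scan, as reported in the text.
reasons = {
    "refused": 407,
    "died before scheduling": 166,
    "unavailable for follow-up": 191,
    "MRI contraindications": 283,
    "unable to be scheduled": 25,
}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to 1 decimal place."""
    return round(100 * part / whole, 1)


if __name__ == "__main__":
    print("eligible:", eligible)
    print("not scanned:", not_scanned)
    print("sum of reasons:", sum(reasons.values()))
    for reason, n in reasons.items():
        print(f"{reason}: {n} ({percent(n, not_scanned)}%)")
```

The reasons sum to the number of unscanned participants, so the flow is internally consistent. Percentages recomputed this way may differ from the published ones in the last digit, depending on the rounding convention used.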
At each evaluation, participants underwent an in-person interview about general health and functioning, medical history, a physical and neurological examination, and a neuropsychological battery that included measures of memory, orientation, language, abstract reasoning, and visuospatial ability.12 The neuropsychological test battery and its validity in the diagnosis of dementia have been described previously.12 The diagnosis of dementia was based on standard research criteria13 and was established at a consensus conference of physicians, neurologists, neuropsychologists, and psychiatrists using all available information (except the MRI results) gathered from the initial and follow-up assessments and the participants' medical records.
Stroke was defined according to World Health Organization criteria.14 The presence of stroke was ascertained from an interview with participants and/or their informants (caregivers or family members). A positive response to any 1 of the 8 questions shown in Figure 2 was considered to suggest a history of stroke. Persons who answered yes to at least 1 of the 8 questions were referred to see a board-certified neurologist. In addition, 80.1% of self-reported strokes were confirmed by review of medical records, as described previously in detail.15 All persons, independent of stroke self-report status, underwent MRI.
Scan acquisition was performed on an Intera 1.5T scanner (Philips, Amsterdam, the Netherlands) at Columbia University Medical Center and transferred electronically to the University of California, Davis, for morphometric analysis in the Imaging of Dementia and Aging Laboratory. For measures of total brain volume, ventricular volume, and WMH volume, fluid-attenuated inversion recovery (FLAIR)–weighted images (repetition time [TR], 11 000 milliseconds; echo time [TE], 144.0 milliseconds; inversion time, 2800 milliseconds; field of view, 25 cm; number of excitations, 2; 256 × 192 matrix with 3-mm slice thickness) were acquired in the axial orientation. The T1-weighted images acquired in the axial plane and resectioned coronally were used to quantify hippocampus and entorhinal cortex volumes (TR, 20 milliseconds; TE, 2.1 milliseconds; field of view, 240 cm; 256 × 160 matrix with 1.3-mm slice thickness). The presence or absence of brain infarction on MRI was determined using all available images, including T1-weighted images, FLAIR-weighted images, and proton density and T2-weighted double-echo images. Only lesions 3 mm or larger qualified for consideration as brain infarcts. Signal void, best seen on the T2-weighted images, was interpreted to indicate a vessel. Other necessary imaging characteristics included cerebrospinal fluid density on the T1-weighted image and, if the stroke was in the basal ganglia area, distinct separation from the circle of Willis vessels and perivascular spaces. Scans were further analyzed to determine the number of infarcts, their location (ie, right or left hemisphere, cortical or subcortical, and specific region), and their size. Infarcts of 1 cm or less were defined as small, and infarcts of more than 1 cm were defined as large. Two raters determined the presence of cerebral infarction on MRI. Previously published κ values for agreement among raters have been generally good, ranging from 0.73 to 0.90.16
The WMH volumes were derived from FLAIR-weighted images via a 2-step process. First, an operator manually traced the dura mater within the cranial vault, including the middle cranial fossa but not the posterior fossa and cerebellum. Intracranial volume was defined as the number of voxels contained within the manual tracings, multiplied by voxel dimensions and slice thickness. These manual tracings also defined the border between brain and nonbrain elements and allowed for the removal of the latter. Second, nonuniformities in image intensity were removed, and 2 gaussian probability functions, representing brain matter and cerebrospinal fluid, were fitted to the skull-stripped image. Once brain matter was isolated, a single gaussian distribution was fitted to the image data, and a segmentation threshold for WMH was set a priori at 3.5 SDs in pixel intensity above the mean of the fitted distribution of brain matter. Erosion of 2 exterior image pixels was applied to the brain matter image before modeling to remove the effects of partial volume and ventricular ependyma on WMH determination. The WMH volume was calculated as the number of voxels 3.5 SDs or more above the mean intensity value of the image, multiplied by voxel dimensions and slice thickness, and adjusted for intracranial volume.
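The thresholding step can be illustrated with a short sketch. It is not the laboratory's pipeline: it assumes the brain-matter voxel intensities are already isolated in a flat list, and it uses the sample mean and SD as a stand-in for the parameters of the fitted gaussian distribution. The function name, argument names, and toy values are hypothetical. The adjustment for intracranial volume is omitted because the text does not specify its form.

```python
from statistics import fmean, pstdev
from typing import Sequence


def wmh_volume_mm3(
    intensities: Sequence[float],
    pixel_area_mm2: float,
    slice_thickness_mm: float,
    threshold_sd: float = 3.5,
) -> float:
    """Estimate WMH volume from brain-matter voxel intensities.

    Voxels at or above mean + threshold_sd * SD are counted as WMH,
    and the count is multiplied by the volume of one voxel.
    Assumption: sample mean/SD approximate the fitted gaussian.
    """
    mean = fmean(intensities)
    sd = pstdev(intensities)
    cutoff = mean + threshold_sd * sd
    n_wmh_voxels = sum(1 for value in intensities if value >= cutoff)
    return n_wmh_voxels * pixel_area_mm2 * slice_thickness_mm


if __name__ == "__main__":
    # Toy image: 999 background voxels and 1 hyperintense voxel.
    toy = [100.0] * 999 + [200.0]
    print(wmh_volume_mm3(toy, pixel_area_mm2=1.0, slice_thickness_mm=3.0))
```

A fixed a priori threshold of this kind keeps the segmentation free of operator judgment, at the cost of depending on how well a single gaussian describes the brain-matter intensities.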
The presence of diabetes mellitus and hypertension was defined as a history of either disorder at any time during life. At baseline, all participants were asked whether they had a history of diabetes or hypertension. If they answered affirmatively, they were asked whether they were currently receiving treatment and about the specific type of medication. Heart disease was defined as a history of atrial fibrillation and other arrhythmias, myocardial infarction, congestive heart failure, or angina pectoris at any time during life. A trigger question inquired whether the individual ever smoked at least 1 cigarette per day for 1 year or longer. If the answer to the trigger question was no, the individual was classified as a nonsmoker, and no further questions were asked. Participants who answered the question affirmatively were classified as current smokers if they were still smoking or past smokers if they had quit smoking. In addition, current and past smokers were asked at what age they began smoking and how many cigarettes, on average, they had smoked or still smoked per day. Past smokers were also asked at what age they had stopped smoking.
First, differences in demographic and clinical characteristics between persons reporting stroke correctly and falsely were explored using analysis of variance and the χ2 test. Then, self-report of stroke was classified as follows: true-positive, the person reported a stroke and had a stroke confirmed on MRI; false-positive, a stroke was reported although no stroke was detected on MRI; false-negative, the person reported having no history of stroke but a stroke was detected on MRI; and true-negative, no infarct on MRI and the person reported correctly having no history of stroke. Sensitivity, specificity, and positive and negative predictive values were calculated using the following formulas: sensitivity = true-positive/(true-positive + false-negative); specificity = true-negative/(false-positive + true-negative); positive predictive value = true-positive/(true-positive + false-positive); and negative predictive value = true-negative/(false-negative + true-negative). We first classified a self-reported stroke as a positive answer to any question. Then, to minimize the possibility that we misclassified transient ischemic attacks (TIAs) as strokes, we performed a series of subanalyses. First, we defined self-reported stroke as a positive answer to question 7 or 8 but a negative answer to questions 1 through 6. Then, we defined positive self-report as a positive answer to questions 1 through 6 but negative answers to both questions 7 and 8. The reasoning for this approach is that positive answers to questions 1 through 6 may include TIA, whereas positive answers to questions 7 and 8 likely assess manifest stroke.
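The classification and the 4 formulas above translate directly into code. The sketch below is illustrative only (the study used SPSS). The function names are assumptions, and the only study figures used are the 73 true-positive and 152 false-negative reports given in the Results.

```python
def classify(reported_stroke: bool, infarct_on_mri: bool) -> str:
    """Classify one participant's self-report against the MRI standard."""
    if reported_stroke:
        return "true-positive" if infarct_on_mri else "false-positive"
    return "false-negative" if infarct_on_mri else "true-negative"


def sensitivity(tp: int, fn: int) -> float:
    """Proportion of MRI-confirmed strokes that were self-reported."""
    return tp / (tp + fn)


def specificity(fp: int, tn: int) -> float:
    """Proportion of participants without infarct who reported no stroke."""
    return tn / (fp + tn)


def positive_predictive_value(tp: int, fp: int) -> float:
    """Proportion of self-reported strokes confirmed on MRI."""
    return tp / (tp + fp)


def negative_predictive_value(fn: int, tn: int) -> float:
    """Proportion of negative self-reports with no infarct on MRI."""
    return tn / (fn + tn)


if __name__ == "__main__":
    # Counts from the Results: 73 of 225 persons with an infarct on MRI
    # reported a stroke, and 152 did not.
    print(f"sensitivity = {sensitivity(tp=73, fn=152):.1%}")
```

Sensitivity and specificity are each computed within one column of the 2 × 2 table (MRI-positive or MRI-negative), whereas the predictive values are computed within one row (reported or not reported), which is why the latter depend on the prevalence of infarcts in the sample.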
Finally, the relationship between correct self-report of stroke and demographic and clinical characteristics potentially influencing accurate self-report of stroke, such as age, ethnic group, cognitive function, memory function, language function, educational level, or cardiovascular risk factors, was assessed using logistic regression models. All analyses were first performed as crude analyses and subsequently stratified by median age (80.1 years), ethnic group, and educational level. P < .05 was considered statistically significant. All data analysis was performed using SPSS statistical software, version 15.0 (SPSS Inc, Chicago, Illinois).
Of 769 participants with available structural MRI data, 52 (6.8%) met diagnostic criteria for dementia at the clinical evaluation closest to the neuroimaging study. These individuals were excluded from the current analyses, leaving 717 subjects in the final analytic sample. The characteristics of this sample are shown in Table 1. In the total MRI sample, there were 484 women (67.5%), the mean (SD) age was 80.1 (5.5) years, and 22.2% had a history of diabetes mellitus, 11.7% a history of myocardial infarction, and 66.5% a history of hypertension. Eighty-five persons (11.9%) reported having had a stroke, and 86 (12.0%) were current smokers. On the MRI, a stroke was observed for 225 persons (31.4%), and, among these, 68 persons (30.2%) had a large infarct and 186 persons (82.7%) had a small infarct; 29 (4.0%) had both large and small infarcts. Individuals with a stroke on MRI but who failed to report a history of stroke were more likely to be women and older and were less likely to be African American than persons reporting stroke correctly.
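Because some participants had both large and small infarcts, the 2 size categories overlap, and their percentages sum to more than 100%. An inclusion-exclusion check on the reported counts recovers the 225 persons with a stroke on MRI:

```latex
\[
\underbrace{68}_{\text{large}} + \underbrace{186}_{\text{small}} - \underbrace{29}_{\text{both}} = 225
\]
```

The 30.2% and 82.7% figures are proportions of these 225 persons, whereas the 4.0% reported for both infarct sizes appears to be a proportion of the full sample of 717.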
Sensitivity and specificity of stroke self-report for a diagnosis of stroke on MRI were low for stroke in any vascular territory: sensitivity, 32.4%; specificity, 78.9% (Table 2). The corresponding positive and negative predictive values were 41.5% and 71.6%. Among the 225 persons who had a brain infarct on MRI, 73 (32.4%) reported correctly having had a stroke, whereas 152 (67.6%) underreported stroke. When the analyses were stratified by stroke size, sensitivity of stroke self-report was better for large than for small strokes (51.5% vs 28.5%) and for cortical vs subcortical infarcts (40.0% vs 33.1%). There were no significant differences in sensitivity or specificity between right and left hemispheric infarcts. In analyses stratified by ethnic group, there was a higher sensitivity of stroke self-report among African Americans than whites or Hispanics that was close to statistical significance; the exception was cortical strokes, for which white participants' reports were slightly more accurate (Table 2; P for differences in sensitivity of stroke self-report for all vascular territories between ethnic groups, .08). In analyses stratified by the median age, sensitivity and specificity of self-report were significantly better in the younger than the older age group (P for differences in sensitivity of stroke self-report for all vascular territories between ages, .02). Exclusion of questions that specifically included TIA (questions 1-3) increased specificity slightly but did not affect sensitivity (specificity for all vascular territories, 82.5%). Restriction of stroke self-report to a positive answer to question 7 or 8 but negative answers to questions 1 through 6 led to a decrease in sensitivity (for all vascular territories, 14.1%) and an increase in specificity (for all vascular territories, 95.6%).
Restriction of self-report to a positive response to questions 1 through 6, along with negative responses to questions 7 and 8, yielded similar results (Table 2), suggesting that we captured true stroke rather than TIA when using all 8 questions.
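The alternative definitions of a positive self-report compared in these analyses can be written as simple rules over the 8 survey answers. The sketch below is a hypothetical encoding, not the study's code: it assumes the answers are stored as a sequence of 8 booleans in question order, and it assumes that excluding questions 1 through 3 means counting a positive answer to any of questions 4 through 8.

```python
from typing import Sequence


def _check(answers: Sequence[bool]) -> None:
    """Guard against malformed input; the survey has 8 questions."""
    if len(answers) != 8:
        raise ValueError("expected answers to exactly 8 questions")


def any_question(answers: Sequence[bool]) -> bool:
    """Primary definition: yes to at least 1 of the 8 questions."""
    _check(answers)
    return any(answers)


def excluding_tia_questions(answers: Sequence[bool]) -> bool:
    """Assumed reading: yes to at least 1 of questions 4 through 8."""
    _check(answers)
    return any(answers[3:])


def q7_or_q8_only(answers: Sequence[bool]) -> bool:
    """Yes to question 7 or 8 and no to questions 1 through 6."""
    _check(answers)
    return any(answers[6:8]) and not any(answers[0:6])


def q1_to_q6_only(answers: Sequence[bool]) -> bool:
    """Yes to at least 1 of questions 1 through 6, no to both 7 and 8."""
    _check(answers)
    return any(answers[0:6]) and not any(answers[6:8])


if __name__ == "__main__":
    # Hypothetical participant answering yes to question 2 only.
    answers = [False, True, False, False, False, False, False, False]
    print(any_question(answers), excluding_tia_questions(answers),
          q7_or_q8_only(answers), q1_to_q6_only(answers))
```

The last 2 rules are mutually exclusive by construction, so a participant answering yes to both question 2 and question 7 is positive under the primary definition but under neither restricted definition.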
In analyses relating demographic and clinical characteristics to accuracy of stroke self-report, memory, cognitive, and language function were inversely related to underreporting of stroke (Table 3). Presence of hypertension or myocardial infarction increased the frequency of false-negative reporting. These relationships did not change in models adjusting for age, sex, educational level, and ethnic group. There was no association between sex, educational level, or smoking and accuracy of self-report of stroke.
In this study, sensitivity of stroke self-report for a diagnosis of stroke on MRI was 32.4% for the total sample, and specificity was 78.9%. The corresponding positive and negative predictive values were 41.5% and 71.6%, respectively. When the analyses were stratified by stroke size or location, sensitivity of self-report was better for large than for small strokes, better for cortical than subcortical infarcts, and highest for strokes in the middle cerebral artery territory. In analyses stratified by age, sensitivity and specificity of self-report were significantly higher in younger than older persons. In analyses stratified by ethnic group, sensitivity was slightly higher among African Americans than whites or Hispanics. Poorer memory, cognitive, or language function and the presence of hypertension or myocardial infarction were associated with an increased frequency of false-negative reports. Exclusion of self-report questions 1 through 3, which included TIA, led to a slight increase in specificity. Restriction of stroke self-report to a positive answer to question 7 or 8 but negative answers to questions 1 through 6 led to a decrease in sensitivity and an increase in specificity.
Few studies have assessed the validity of stroke self-report; all were performed among white participants, and most could assess only positive predictive value, not sensitivity.1,4-9,18-20 Most studies reported sensitivity estimates or positive predictive values that were higher than ours. In the Tromsø Study,20 self-reported history of stroke had a positive predictive value of 79% and a sensitivity of approximately 80%. A population-based study from Rotterdam,4 the Italian Longitudinal Study on Aging,19 the American National Health Survey,3 and an American prospective study of nurses1 found positive predictive values of self-reported stroke of approximately 66%. In the Copenhagen stroke study,8 the positive predictive value of self-reported stroke via questionnaire was 50%, lower than in the other studies but still higher than the 41.5% observed in our study.
A potential explanation for the differences in false-positive or false-negative rates among studies is the difference in how stroke was validated. Most of the previous studies did not use brain imaging but rather neurological examination or review of medical records to confirm stroke.1,3,5-8 Although stroke is mainly a clinical diagnosis, brain imaging data facilitate the validation of stroke, particularly among patients whose symptoms are ambiguous or subtle. Some studies used cerebral computed tomography, which is less sensitive to subtle cerebrovascular lesions than MRI.21,22 One of the studies that included computed tomographic assessment used imaging solely to confirm stroke among persons who responded affirmatively when asked about their history of stroke but had an unclear diagnosis on medical records or neurological examination. The study did not reconfirm negative responses on stroke self-report.4 The use of imaging methods that are not able to detect subtle lesions and the omission of scans for persons without a history of stroke can lead to a potential misclassification of patients with subtle symptoms or silent strokes. This, in turn, can lead to an increase in the false-negative rate and to higher estimates of positive predictive value and sensitivity compared with our study. The failure to separate TIA from stroke in studies lacking (sensitive) brain imaging is a common contributor to the false-positive rate for self-reported stroke.
Another explanation for the inconsistencies between studies is the difference in factors that can influence accurate report of stroke in the study populations. The mean age of our population was approximately 15 to 20 years higher than that of most previous studies assessing self-report of stroke.3-5,7,8,20 It is likely that unreported strokes in survey populations occur mainly among elderly participants,6 who are also more likely to have strokes. Approximately 75% of strokes occur in persons older than 65 years.23 Our findings of an inverse association of cognitive ability, memory function, and language function with underreporting of stroke suggest that impaired cognition contributes to a higher false-negative rate of self-reported stroke in the elderly compared with younger persons. Among the elderly, recall problems when asked about prior events would lead to lower estimates of sensitivity and positive predictive value. However, it is likely that some of the strokes that were not reported were silent strokes and, thus, are truly negative. In their review, Vermeer et al24 summarized the results of 105 original articles that provided data on frequency, risk factors, or consequences of silent brain infarcts detected by MRI in various adult populations. According to these studies, silent strokes have an age-dependent frequency of 8% to 28% in the general adult population and are 5 times more frequent than clinical strokes. In the large, population-based Cardiovascular Health Study, 89% of lacunar strokes were silent but associated with neurological symptoms.25 In the Rotterdam Study, 20% of the population had silent infarcts, whereas 2.4% had symptomatic infarcts and 1.5% had both.26
In our study, sensitivity was slightly higher among African Americans than whites or Hispanics. A possible explanation for this observation is that prevalence and incidence of vascular risk factors and cerebrovascular disease are higher among African Americans.27 It is likely that individuals who have had previous contact with health services or physicians during which vascular risk factors for stroke were discussed or who have more contact with persons who had a stroke are more aware of the relevant signs and symptoms.
Our study has important strengths. High-resolution quantitative MRI was available for all participants, independent of affirmative or negative self-report. This allowed us to determine not only positive predictive values but also the sensitivity, specificity, and negative predictive values of self-report. The use of a well-characterized multiethnic cohort allowed us to determine the validity of self-report among a diverse group of older adults whose susceptibility to vascular disease is widely variable. Finally, the neuropsychological test battery used allowed us to assess several cognitive domains and take into account the impact of these functions on accuracy of self-report.
An important consideration in the interpretation of our results is that the study was based on survivors who were able to undergo MRI and that it was conducted in an urban elderly population with a high prevalence of risk factors for mortality and vascular disease. Therefore, our results, including positive and negative predictive values, may not be generalizable to cohorts with younger individuals or a lower morbidity burden. Also, despite using highly sensitive MRI, it is possible that we missed subtle vascular lesions, and that true sensitivity and positive predictive values are slightly lower than reported.
Our results indicate that sensitivity and specificity of stroke self-report are low when using MRI scans as validation. They further suggest that accuracy of self-report is influenced by ethnicity, age, cognitive function, and diagnosis of vascular disease. In stroke research, sensitive neuroimaging techniques rather than stroke self-report should be used to determine stroke history.
Correspondence: Christiane Reitz, MD, PhD, Gertrude H. Sergievsky Center and Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Columbia University, 630 W 168th St, New York, NY 10032 (email@example.com).
Accepted for Publication: August 21, 2008.
Published Online: May 11, 2009 (doi:10.1001/archneurol.2009.83).
Author Contributions: Drs Reitz, Schupf, Luchsinger, Andrews, Tang, DeCarli, and Mayeux had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Study concept and design: Reitz, Schupf, Luchsinger, Manly, Brickman, Tang, DeCarli, Brown, and Mayeux. Acquisition of data: Reitz, Schupf, Luchsinger, Manly, DeCarli, Brown, and Mayeux. Analysis and interpretation of data: Reitz, Manly, Schupf, Luchsinger, Brickman, DeCarli, Brown, and Mayeux. Drafting of the manuscript: Reitz, Schupf, Luchsinger, Brickman, DeCarli, and Mayeux. Critical revision of the manuscript for important intellectual content: Reitz, Schupf, Luchsinger, Brickman, Manly, DeCarli, Brown, and Mayeux. Statistical analysis: Reitz, Brickman, Schupf, and Tang. Obtained funding: Mayeux and Brown. Administrative, technical, and material support: Schupf, Manly, Andrews, Mayeux, and DeCarli. Study supervision: Mayeux.
Financial Disclosure: None reported.
Funding/Support: This study was supported by grants AG007232 and AG029949 from the National Institutes of Health.