Receiver operating characteristic curves for the Telephone Interview for Cognitive Status (TICS) and Dementia Questionnaire (DQ) for diagnosis of dementia vs nondemented (normal cognition and mild cognitive impairment [MCI] combined) (A); cognitive impairment (dementia and MCI combined) vs normal cognition (B); MCI vs normal cognition, with demented participants eliminated from the analysis (C); and dementia vs MCI, with participants with normal cognition removed from the analysis (D).
Receiver operating characteristic curves for the optimal prediction model for dementia vs nondemented (normal cognition and mild cognitive impairment [MCI] combined) (A); cognitive impairment (dementia and MCI combined) vs normal cognition (B); MCI vs normal cognition, with demented participants eliminated from the analysis (C); and dementia vs MCI, with participants with normal cognition removed from the analysis (D). Pretest probabilities for each model are predicted values from binary logistic regression models using age, sex, race/ethnicity, years of education, and the prior assessment delayed word-list recall score from the Selective Reminding Test as predictors of the 4 target diagnostic states. DQ indicates Dementia Questionnaire; TICS, Telephone Interview for Cognitive Status.
Manly JJ, Schupf N, Stern Y, Brickman AM, Tang M, Mayeux R. Telephone-Based Identification of Mild Cognitive Impairment and Dementia in a Multicultural Cohort. Arch Neurol. 2011;68(5):607–614. doi:10.1001/archneurol.2011.88
Telephone-based interviews can be used for screening and to obtain key study outcomes when participants in longitudinal studies die or cannot be seen in person, but must be validated among ethnically and educationally diverse people.
To determine the accuracy of a telephone interview in classifying (1) demented from nondemented participants, (2) cognitively impaired participants from cognitively normal participants, and (3) participants with mild cognitive impairment (MCI) from those with normal cognition or (4) MCI from dementia among an ethnically and educationally diverse community-based sample.
The sample consisted of 377 (30.5% non-Hispanic white, 34.7% non-Hispanic black, and 33.7% Caribbean Hispanic) older adults. The validation standard was diagnosis of dementia and MCI based on in-person evaluation. The Telephone Interview for Cognitive Status (TICS) and the Dementia Questionnaire (DQ) were administered within the same assessment wave.
The sample included 256 people (67.9%) with normal cognition, 68 (18.0%) with MCI, and 53 (14.1%) with dementia. Validity of the TICS was comparable among non-Hispanic whites, non-Hispanic blacks, and Hispanics. Among non-Hispanic whites, the DQ had better discrimination of those with dementia from those without dementia and from those with MCI than among other racial/ethnic groups. Telephone measures discriminated best when used to differentiate demented from nondemented participants (88% sensitivity and 87% specificity for the TICS; 66% sensitivity and 89% specificity for DQ) and when used to differentiate cognitively normal participants from those with cognitive impairment (ie, MCI and dementia combined; 73% sensitivity and 77% specificity for the TICS; 49% sensitivity and 82% specificity for DQ). When demographics and prior memory test performance were used to calculate pretest probability, consideration of the telephone measures significantly improved diagnostic validity.
The TICS has high diagnostic validity for identification of dementia among ethnically diverse older adults, especially when supported by the DQ and prior visit data. However, telephone interview data were unable to reliably distinguish MCI from normal cognition.
Telephone-based assessment of cognitive status and functional decline is an alternative to in-person assessment in longitudinal studies of cognitive function and dementia of older adults.1-9 A telephone interview has become the primary modality of cognitive data collection in several epidemiologic studies10-13 and is now frequently used as a screen for clinical trials requiring participants with cognitive impairment.14-18
A recent study at Mayo Clinic19 found that although the modified Telephone Interview for Cognitive Status (TICS) had 83.3% sensitivity and 81.6% specificity for separating demented from nondemented participants, and 83.3% sensitivity and 78.3% specificity for separating cognitively impaired participants from cognitively normal participants, the measure could not reliably distinguish participants with mild cognitive impairment (MCI) from those with normal cognition or MCI from dementia. One limitation of this study was that the participants were almost exclusively white and well educated. Therefore, a central goal of the current study was to determine the accuracy of a telephone interview in classifying these groups among an ethnically and educationally diverse community-based sample.
The primary reason for adding the telephone interview to the assessment battery in our study was to be able to derive key diagnostic classifications, ie, normal cognition, MCI, or dementia, from the telephone-based data even when participants are unable or unwilling to be seen in person at a follow-up visit. However, this would require the instruments to validly distinguish MCI from normal cognition and MCI from dementia. This has been a challenge in prior studies; although several research groups have documented high specificity for MCI using telephone-based measures,14,17,19 only 1 study18 showed high sensitivity for distinguishing MCI from normal cognition. Because participants in our study were seen in person at a prior assessment wave, we sought to determine whether the distinction of MCI from other classifications would improve if prior visit data were used along with data from the telephone interview.
The Columbia University institutional review board approved this project. All individuals discussed the study with trained research staff and provided written informed consent.
The current sample comprised 377 English- and Spanish-speaking participants in a longitudinal study of aging, cognitive function, and dementia among Medicare-eligible older adults residing in neighborhoods in northern Manhattan, New York. The current sample was drawn from a cohort resulting from 2 recruitment efforts, the first in 1992 (n = 2125) and the other in 1999 (n = 2183). The sampling strategies and recruitment outcomes of these 2 cohorts are detailed in prior publications.20,21 Reevaluations occur during follow-up waves that are spaced approximately 18 to 30 months apart.
Ethnic group was determined by self-report using the format of the 2000 US Census. Participants were first asked to report their race (ie, American Indian/Alaska native, Asian, native Hawaiian or other Pacific Islander, black or African American, or white), then, in a second question, were asked whether they were Hispanic.
Evaluations were conducted in either English or Spanish, on the basis of the participant's opinion of which language would yield the best performance. Examiners were balanced bilinguals, who spoke both English and Spanish daily with friends, family, and colleagues.
The validation study for the telephone interview was initiated during the 2005–2007 assessment wave of the cohort. The telephone interview was conducted by trained interviewers during the same assessment wave but independently from the in-person visit. On average, calls occurred 7.3 months after the in-person visit, with an SD of 10.9 months. One participant had only the TICS because the call was interrupted and the participant could not be recontacted. Of the participants for whom the Dementia Questionnaire (DQ) interview was conducted, 8 did not have the TICS because they were not well enough to come to the telephone (n = 4), the participant died soon after the in-person visit (n = 2), or the call was interrupted (n = 2).
The TICS was administered and scored in accordance with published procedures.4 The TICS is modeled after the Mini-Mental State Examination, producing scores ranging from 0 to 41. High test-retest reliability1,6,7 has been demonstrated in several studies. The published Spanish-language adaptation of the TICS was used among Spanish-speaking participants.2 Total score was used in the analyses.
The DQ is a semi-structured interview that includes yes-or-no questions assessing cognitive complaints in the domains of memory, confusion, and spatial orientation (8 questions) and language/verbal expression (3 items), as well as questions assessing problems with daily function (6 items). This questionnaire has established reliability and validity with high sensitivity and specificity for the detection of dementia and Alzheimer disease.3 Information about cognitive complaints and functional abilities could be provided by either the participant or an informant, as long as they were knowledgeable about the functional status and medical history of the participant. The 17 questions already mentioned were summed to create a score representing total burden of cognitive complaints and functional problems.
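The summary score described above can be sketched in a few lines. This is only an illustration, assuming each "yes" answer counts as 1 point and each "no" as 0; the domain names, the `DOMAIN_SIZES` mapping, and the `dq_summary_score` function are hypothetical and are not part of the published instrument.

```python
# Hypothetical item layout taken from the item counts in the text:
# 8 memory/confusion/orientation items, 3 language items, 6 daily-function items.
DOMAIN_SIZES = {"memory_orientation": 8, "language": 3, "daily_function": 6}


def dq_summary_score(responses):
    """Sum the 'yes' (True) answers across the 17 items.

    `responses` maps each domain name to a list of booleans, one per item.
    Scoring yes = 1 and no = 0 is an assumption made for this sketch.
    """
    total = 0
    for domain, n_items in DOMAIN_SIZES.items():
        answers = responses[domain]
        if len(answers) != n_items:
            raise ValueError(
                f"{domain}: expected {n_items} answers, got {len(answers)}"
            )
        total += sum(bool(a) for a in answers)
    return total


# Made-up answers for one participant (not study data).
example = {
    "memory_orientation": [True, True, False, False, True, False, False, False],
    "language": [False, True, False],
    "daily_function": [False, False, True, False, False, False],
}
print(dq_summary_score(example))
```

Higher totals indicate a greater burden of reported cognitive complaints and functional problems, which matches the direction of the DQ correlations reported in the results.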
Medical history was recorded and neurologic and physical examinations were performed at the initial visit and each follow-up. A medical burden score was calculated as a sum of multiple nonpsychiatric medical conditions; it included hypertension, diabetes mellitus, heart disease, stroke, arthritis, chronic obstructive pulmonary disease or other pulmonary conditions, thyroid disease, liver disease, renal insufficiency, peptic ulcer disease, peripheral vascular disease, cancer, Parkinson disease, multiple sclerosis, and essential tremor. Current depressive symptoms were assessed using the Center for Epidemiologic Studies Depression Scale.22 The Disability and Functional Limitations Scale23,24 was used to assess instrumental activities of daily living via self and informant report, as well as perceived difficulty with memory.
Neuropsychological measures included the Buschke Selective Reminding Test (SRT),25 matching and delayed recognition conditions of a multiple-choice version of the Benton Visual Retention Test,26 the Rosen Drawing Test,27 a 15-item Boston Naming Test,28 the Controlled Oral Word Association Test,29 the Category Fluency Test,30 the Color Trails Test,31 and the Similarities subtest from the Wechsler Adult Intelligence Scale–Revised.32
After each clinical assessment, a group of physicians and neuropsychologists reviewed the functional, medical, neurologic, psychiatric, and neuropsychological data (but were blinded to TICS and DQ data) and reached a consensus regarding the presence or absence of dementia using criteria from the Diagnostic and Statistical Manual of Mental Disorders (Third Edition Revised).33 For follow-up evaluations, this group was shielded from the prior consensus diagnoses. If dementia was diagnosed, the etiology was determined using published research criteria for probable and possible Alzheimer disease,34 vascular dementia,35 Lewy body dementia,36 and other dementias. Mild cognitive impairment was not diagnosed in the consensus conference but was retrospectively applied on the basis of the neuropsychological, functional, and memory complaint measures previously described using standard criteria37 among participants not diagnosed with dementia at the consensus conference.20
Characteristics of the 3 diagnostic groups were compared using χ2 tests and analysis of variance, and correlations between measures and demographic variables were calculated. Receiver operating characteristic (ROC) curves were drawn for each of the telephone measures administered (TICS total score and DQ summary score), using 4 planned comparisons of interest: (1) nondemented vs demented, (2) cognitively normal vs cognitively impaired (ie, MCI and dementia), (3) normal vs MCI, and (4) MCI vs dementia. Areas under the curve were calculated and compared for all participants and separately for non-Hispanic whites, non-Hispanic blacks, and Hispanics. The cutoff yielding the best sensitivity and specificity for the overall sample was determined, and the sensitivity and specificity, negative and positive predictive values, and likelihood ratios were calculated38 for each telephone instrument for prediction of each of these diagnostic criteria in the entire sample and within the 3 primary racial/ethnic groups.
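The accuracy measures named above follow directly from a 2 x 2 table of test result by reference diagnosis. The sketch below shows one way to compute them and to select a cutoff. It assumes that "best sensitivity and specificity" means maximizing Youden's J (sensitivity + specificity - 1), which the text does not state, and that lower scores indicate impairment, as with the TICS. The function names and the scores are invented for illustration and are not study data.

```python
import math


def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table (needs at least 1 case and 1 non-case)."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp) if (tp + fp) else math.nan,
        "npv": tn / (tn + fn) if (tn + fn) else math.nan,
        "lr_positive": sens / (1 - spec) if spec < 1 else math.inf,
        "lr_negative": (1 - sens) / spec if spec > 0 else math.inf,
    }


def best_cutoff(scores, is_case):
    """Return (cutoff, J, metrics) for the cutoff with the largest Youden's J.

    A score at or below the cutoff counts as a positive screen.
    Youden's J is an assumed criterion, not the one confirmed by the source.
    """
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, c in zip(scores, is_case) if c and s <= cutoff)
        fn = sum(1 for s, c in zip(scores, is_case) if c and s > cutoff)
        fp = sum(1 for s, c in zip(scores, is_case) if not c and s <= cutoff)
        tn = sum(1 for s, c in zip(scores, is_case) if not c and s > cutoff)
        m = diagnostic_metrics(tp, fp, fn, tn)
        j = m["sensitivity"] + m["specificity"] - 1
        if best is None or j > best[1]:
            best = (cutoff, j, m)
    return best


# Invented scores: 5 cases followed by 7 non-cases.
scores = [12, 15, 18, 20, 26, 22, 25, 28, 30, 31, 33, 35]
is_case = [True] * 5 + [False] * 7
cutoff, j, metrics = best_cutoff(scores, is_case)
print(cutoff, metrics["sensitivity"], metrics["specificity"])
```

Predictive values depend on the prevalence of the target condition in the sample, whereas likelihood ratios do not, which is one reason both are commonly reported together.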
Binary logistic regression models were constructed to estimate the pretest probabilities of each of the 4 comparisons previously mentioned, with the in-person diagnostic classification as the dependent variable and the following independent variables: age, sex, race/ethnicity, years of education, and the delayed word-list recall score from the SRT taken from the prior assessment. Posttest probabilities for each participant were then estimated for 2 scenarios: (1) when both TICS and DQ were available, by adding the total scores from both measures to the model; and (2) when the participant was dead or too ill to come to the telephone and the TICS was not available, by adding to the model the total score from the DQ only. Predicted values from the models were then used to generate ROC curves. Areas under these curves were compared, and the differences were calculated with 95% confidence intervals.39
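To make the comparison of pretest and posttest models concrete, the following sketch computes the area under the ROC curve for two sets of predicted probabilities and takes their difference. It uses the rank-based definition of the AUC (the probability that a randomly chosen case receives a higher predicted value than a randomly chosen non-case). The probabilities are invented, the `auc` helper is hypothetical, and the confidence interval method cited in the text is not reproduced here.

```python
def auc(scores, labels):
    """Rank-based AUC: P(case score > non-case score), ties counted as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented predicted probabilities for 3 cases and 5 non-cases.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
pretest = [0.60, 0.30, 0.45, 0.35, 0.20, 0.10, 0.50, 0.15]   # demographics + prior recall
posttest = [0.85, 0.55, 0.70, 0.30, 0.10, 0.05, 0.40, 0.08]  # plus telephone scores

auc_pre = auc(pretest, labels)
auc_post = auc(posttest, labels)
print(auc_pre, auc_post, auc_post - auc_pre)
```

In practice the predicted probabilities would come from fitted logistic regression models, and because both curves are estimated on the same participants, the comparison should use a method for correlated ROC curves.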
Demographic characteristics and scores on key study measures of participants with normal cognition, MCI, and dementia are described in Table 1. Most (87.3%) Hispanic participants in this study were immigrants from the Caribbean, including the Dominican Republic (59.5%), Cuba (17.5%), and Puerto Rico (10.3%). The MCI group comprised 68 participants of whom 44 were participants with MCI with memory impairment (64.7%) and 24 with MCI without memory impairment (35.3%). Of the 53 demented participants, most were diagnosed with probable (n = 33) or possible (n = 16) Alzheimer disease, but the sample also included 2 people with Parkinson disease dementia, a participant with vascular dementia, and 1 with a diagnosis of Lewy body disease. The TICS total score was significantly correlated with age (r = −0.37; P < .001), years of education (r = 0.51; P < .001), prior visit SRT total recall score (r = 0.56; P < .001), prior visit SRT delayed recall score (r = 0.48; P < .001), and depressive symptoms (r = −0.28; P < .001). The mean (SD) score on the TICS for men (28.1 [0.9]) was slightly higher than that for women (25.6 [0.6]) (F1,369 = 6.1, P = .01). Differences between mean (SD) scores of non-Hispanic whites (30.4 [7.5]), non-Hispanic blacks (27.4 [7.0]), and Hispanics (21.8 [10.2]) were all significant (omnibus F1,365 = 32.3, P < .001; all pairwise comparisons P < .05). The DQ summary score was significantly related to age (r = 0.17; P < .001), years of education (r = −0.3; P < .001), prior visit SRT total recall score (r = −0.32; P < .001), prior visit SRT delayed recall score (r = −0.26; P < .001), depressive symptoms (r = 0.39; P < .001), and medical burden score (r = 0.21; P < .001). There were no significant differences in mean (SD) DQ summary score between men (3.6 [0.4]) and women (4.1 [0.2]) (F1,376 = 1.1, P = .29). 
Although the mean DQ summary score (SD) of whites (3.1 [2.9]) and blacks (3.2 [2.9]) did not differ from each other, Hispanics (5.3 [4.5]) reported more problems on the DQ than each of the other 2 groups (F1,372 = 15.8, P < .001). The TICS and DQ scores were significantly correlated with each other (r = −0.59; P < .001).
Figure 1A shows the ROC curve for separation of demented vs nondemented (ie, either normal cognition or MCI) participants. The area under the curve (AUC), diagnostic characteristics, and optimal cutoffs derived from the ROC analyses for the TICS (Table 2) and the DQ (Table 3) are shown for each of the comparisons among all participants and then separately for non-Hispanic whites, non-Hispanic blacks, and Hispanics. The ability of the TICS to discriminate between demented and nondemented participants was comparable across racial/ethnic groups, but the DQ's discrimination was higher among non-Hispanic whites than among racial/ethnic minorities. Figure 1B depicts the ROC curves when MCI participants were combined with the demented participants to form a cognitively impaired group and then compared with participants with normal cognition. The AUCs for the TICS and the DQ were comparable across ethnic groups for this comparison (Tables 2 and 3).
We sought to determine the diagnostic accuracy of the telephone-based measures when making more subtle distinctions between participants with normal cognition and MCI and between MCI and dementia. Figure 1C depicts the ROC curves when demented participants were eliminated from the analysis and participants with MCI and normal cognition were compared. The AUC for both measures for this comparison was relatively low (0.71 for the TICS and 0.58 for the DQ), but discriminability was similar across ethnic groups (Tables 2 and 3). We then determined the ability of the instruments to distinguish people with MCI from those with dementia, when participants with normal cognition were omitted from the analysis (Figure 1D). For this comparison, the AUC was 0.91 for the TICS, and discriminability was comparable across ethnic groups. The AUC was 0.81 for the DQ in the whole sample, but for this comparison, the DQ had better discrimination among non-Hispanic whites than among ethnic minorities. Examination of the odds ratios in Tables 2 and 3 reveals that both the TICS and DQ performed best in distinguishing nondemented (normal cognition and MCI combined) from demented participants, and in distinguishing people with MCI from people with dementia.
Figure 2 depicts ROC curves for pretest and posttest probabilities as calculated in the logistic regression models. As shown in Table 4, adding information gathered from the TICS and DQ to the pretest model significantly improved the diagnostic performance for all key clinical outcomes in the study. For example, the addition of the TICS and DQ to the pretest prediction of dementia vs no dementia improved the AUC by 6.5%. The best diagnostic performance was in distinguishing nondemented from demented participants (AUC, 0.96) and MCI from demented participants (AUC, 0.95) using demographic information, prior SRT delayed memory score, and both TICS and DQ. Accurate identification of MCI among nondemented participants was poor overall, even when both TICS and DQ were available (AUC, 0.75). Predicted classification as normal cognition, MCI, and dementia, using the optimal cutoffs for the predicted values from the models separating demented from nondemented participants and normal cognition from cognitive impairment, was compared with the observed diagnoses. The cutoffs correctly identified 66.5% of participants with normal cognition, 55.4% of those with MCI, and 92.7% of those with dementia.
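The 3-way classification described above can be illustrated with a simple decision rule. The text does not specify how the two model cutoffs were combined, so the sketch assumes that the dementia model is applied first and the cognitive impairment model second. The cutoff values, function names, and participant records are all hypothetical.

```python
def classify(p_dementia, p_impaired, cut_dementia=0.5, cut_impaired=0.5):
    """Assign one of 3 states from 2 predicted probabilities.

    Assumed rule: dementia if the dementia model exceeds its cutoff;
    otherwise MCI if the impairment model exceeds its cutoff; otherwise normal.
    """
    if p_dementia >= cut_dementia:
        return "dementia"
    if p_impaired >= cut_impaired:
        return "MCI"
    return "normal"


def percent_correct_by_group(predicted, observed):
    """Percentage of each observed group that received the matching label."""
    result = {}
    for group in set(observed):
        idx = [i for i, o in enumerate(observed) if o == group]
        hits = sum(1 for i in idx if predicted[i] == group)
        result[group] = 100.0 * hits / len(idx)
    return result


# Invented records: (p_dementia, p_impaired, observed diagnosis).
records = [
    (0.90, 0.95, "dementia"),
    (0.20, 0.70, "MCI"),
    (0.10, 0.30, "MCI"),
    (0.05, 0.20, "normal"),
    (0.10, 0.60, "normal"),
    (0.02, 0.10, "normal"),
]
predicted = [classify(pd, pi) for pd, pi, _ in records]
observed = [obs for _, _, obs in records]
print(percent_correct_by_group(predicted, observed))
```

Under this rule, MCI is the group most exposed to misclassification because it is defined by exclusion from both of the other categories, which is consistent with the lower percentage of correctly identified MCI cases reported above.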
Models using only the DQ summary score showed improved classification over pretest probabilities when the goal was to distinguish demented from nondemented people, and cognitively impaired from cognitively normal people. However, addition of the DQ summary score did not improve diagnostic accuracy over pretest probabilities when the goal was to identify MCI among nondemented participants or to identify dementia among cognitively impaired participants (Table 4).
The sensitivity and specificity of the TICS and DQ were variable and depended on the diagnostic groups serving as the standard for comparison. There were no consistent racial/ethnic differences in the ability of the TICS to discriminate diagnostic classifications. However, in distinguishing demented people from nondemented (normal cognition and MCI combined), and people with dementia from people with MCI, the DQ performed better among non-Hispanic whites than among non-Hispanic blacks and Hispanics.
Used alone, the TICS had high sensitivity for distinguishing demented from nondemented participants (normal cognition and MCI combined), and excellent specificity when distinguishing people with dementia from people with MCI. The DQ had lower sensitivity and higher specificity than the TICS for all comparisons but was most valid when distinguishing demented people from those with MCI. The superior specificity of the DQ relative to the TICS was expected, given the purposes for which the instruments were originally developed: the DQ was designed to pick up on changes in memory and function that are specific to dementia and are not seen in normal aging or MCI.
Comparing likelihood ratios with those of the recent Mayo Clinic study by Knopman et al,19 our use of the TICS overall and within each racial/ethnic group yielded superior performance to the Modified TICS when distinguishing demented from nondemented participants (MCI and normal cognition combined), and MCI from dementia. Identification of cognitive impairment (MCI and dementia) from normal cognition was comparable with the Mayo Modified TICS among non-Hispanic whites, non-Hispanic blacks, and Hispanics. For the separation of MCI from normal cognition, the Mayo study showed better accuracy than our study did among whites and blacks, although accuracy among Hispanics in our study was similar to that of the Mayo study. Both studies had lower diagnostic validity for identification of MCI vs normal cognition than was reported by Cook et al18 in a study of mostly white, community-dwelling nondemented older adults.
Our standard diagnostic algorithms for dementia and MCI require an in-person visit; therefore, diagnoses could not be derived for participants not seen because of death, moving out of the area, or refusal. The current study revealed that if a participant and/or informant can be reached by telephone, presence of dementia can be estimated with moderate to high validity among non-Hispanic whites, non-Hispanic blacks, and Hispanics. Our analyses indicate that the diagnostic utility of telephone instruments will increase or decrease in response to variations in the prevalence of cognitive impairment and dementia in the population, and thus it will vary in cohorts that differ by age, race, ethnic group, educational level, and other demographic variables. Indeed, we found that adding age, sex, years of education, race/ethnicity, and prior memory scores to the data supplied by the TICS and DQ significantly improved the diagnostic accuracy of the telephone interview data. Even when the TICS was not used in the model, the nondemented/demented classification was highly accurate as predicted by demographics, DQ data, and prior visit data.
It was hoped that the availability of the DQ, which taps into cognitive complaints and functional status, and prior visit memory test performance, when added to the direct cognitive assessment provided by the TICS, would improve the identification of MCI among nondemented participants. However, none of the models tested were able to reliably differentiate MCI from normal cognition—this was true across all racial/ethnic groups. Addition of a delayed word list recall to the TICS may marginally improve identification of MCI, but prior research suggests that even with this added component, the classification rate may remain too low to advocate for the use of the Modified TICS as a stand-alone measure for identification of MCI.19
Correspondence: Jennifer J. Manly, PhD, Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Columbia University Medical Center, 630 W 168th St, P&S Box 16, New York, NY 10032 (firstname.lastname@example.org).
Accepted for Publication: September 1, 2010.
Author Contributions: All authors had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Study concept and design: Manly, Schupf, Stern, Brickman, and Mayeux. Acquisition of data: Manly and Schupf. Analysis and interpretation of data: Manly, Tang, and Mayeux. Drafting of the manuscript: Manly, Brickman, and Mayeux. Critical revision of the manuscript for important intellectual content: Manly, Schupf, Stern, and Tang. Statistical analysis: Manly, Schupf, and Tang. Obtained funding: Manly and Mayeux. Administrative, technical, and material support: Manly, Stern, and Mayeux.
Financial Disclosure: Dr Schupf serves as a consultant to Al Janssen.
Funding/Support: This study was supported by grants P01-AG07232 (Dr Mayeux) and R01-AG16206 (Dr Manly) from the National Institute on Aging.