Progression of functional change over 3 years among individuals who met the criteria for questionable Alzheimer disease at baseline. One patient died prior to the first follow-up assessment and was not included in the analyses.
Daly E, Zaitchik D, Copeland M, Schmahmann J, Gunther J, Albert M. Predicting Conversion to Alzheimer Disease Using Standardized Clinical Information. Arch Neurol. 2000;57(5):675-680. doi:10.1001/archneur.57.5.675
Copyright 2000 American Medical Association. All Rights Reserved. Applicable FARS/DFARS Restrictions Apply to Government Use.
Objective
To identify aspects of a standardized clinical assessment that can predict which individuals within the category of "questionable" Alzheimer disease (AD) have a high likelihood of converting to AD over time.
Design
Detailed semistructured interviews were performed at baseline and annually for 3 years.
Setting
University-based gerontology research program.
Patients
The patient population consisted of 165 individuals 65 years and older: 42 of the participants had a Clinical Dementia Rating (CDR) of normal cognition (CDR rating, 0.0) and 123 had a rating of questionable AD (CDR rating, 0.5). After 3 years of follow-up, 23 of the 123 subjects with questionable AD were diagnosed with probable AD.
Main Outcome Measures
The interview was used to generate a summary measure based on the sum of 6 CDR categories, known as the Total Box Score. The responses to 32 selected questions from the interview also were examined.
Results
Likelihood of progression to AD during the follow-up period was strongly related to the Total Box Score. For example, more than 50% of individuals with a Total Box Score of 2.0 or higher at baseline developed AD during the follow-up interval, whereas about 10% of individuals with a Total Box Score of 1.0 or lower developed AD during this same period. Selected questions from the standardized clinical interview also were highly predictive of subsequent conversion to AD among the study population. Eight selected questions from the clinical interview at baseline, combined with the CDR Total Box Score, identified 88.6% of such individuals accurately (questionable group, 82/91; converter group, 19/23).
Conclusions
A standardized clinical assessment can be used to identify the subgroup of individuals within the category of questionable AD who have a high likelihood of converting to AD over time. Subjects who met the criteria for questionable AD had a variety of trajectories during a 3-year follow-up, suggesting that diverse factors may influence the functional changes observed in this population.
PATIENTS WITH memory complaints are often arrayed along a continuum from normal function through varying degrees of impairment. A crucial question for these patients and the clinicians evaluating them is to what extent these complaints are the harbinger of Alzheimer disease (AD). Two scales are commonly used for staging AD¹,² and both include ratings for individuals who are considered questionable (ie, have progressive difficulty with cognitive function but do not meet clinical research criteria for AD). Several research groups have initiated longitudinal studies in which individuals meeting criteria for this questionable stage have been recruited and followed up to determine preclinical predictors of AD. However, the longitudinal outcome of persons rated as questionable for AD remains unclear.
One of the most striking findings to emerge from these studies is that the proportion of individuals who develop AD over time varies considerably among studies. For example, after an average follow-up of 2 years, the rates of conversion to AD reported in the literature vary from 80%³ to 66%⁴ to 36%⁵ to 24%.⁶,⁷ This variability could arise for several reasons: (1) the group of individuals with evidence of recent and progressive difficulties in memory may be heterogeneous in nature; (2) the characteristics of the populations under study may be based on different selection criteria; and (3) the criteria for conversion to AD may be applied in differing ways. To interpret the meaning of predictors of conversion to AD or drug treatments that attempt to alter the rate of conversion, it is important to understand this variability.
We have recruited and followed up 2 groups of subjects from the general community. At baseline, 42 subjects had a Clinical Dementia Rating (CDR)¹,⁸ suggesting normal cognition (CDR rating, 0.0) and 123 met the criteria for questionable AD (CDR rating, 0.5). Information from the semistructured interview was useful in predicting the 3-year trajectory of change. Moreover, the status of these individuals after 3 years of annual follow-up emphasizes the heterogeneity that exists among subjects who are rated as questionable.
The subjects in the study consisted of 165 of 1095 elderly individuals from the general population. Participants were recruited through the print media (rather than from a clinic or other medical referral source) and underwent a multistage screening procedure. To be included in the study, participants needed to be 65 years or older; to be free of significant underlying medical, neurologic, or psychiatric illness; to have a CDR rating no higher than 0.5; and to be willing to participate in the study procedures. All subjects provided informed consent prior to the initiation of the study.
At baseline, the subjects were divided into 2 groups based on their functional status. One group consisted of 42 subjects with normal cognition (CDR rating, 0.0) and the other group consisted of 123 subjects with questionable AD (CDR rating, 0.5). They had a mean age of 71.3 and 72.2 years, respectively. The educational level of the 2 groups was equivalent (14.4 and 14.9 years, respectively), as was the mean Mini–Mental State Examination⁹ score (29.3 and 29.1, respectively). The sex distribution within both groups also was similar: approximately 60% female and 40% male.
After 3 years of follow-up, 9 subjects had died, one of whom died prior to the first follow-up assessment and is not included in these analyses. For the remaining subjects, the annual follow-up rate was 99%.
The subjects were categorized into 5 groups based on their 3-year trajectory of functional change:
Normal group: These subjects had normal cognition at baseline (CDR rating, 0.0) and continued to be categorized as normal at follow-up (n=32). This group represented 76% of the normal subjects. (Ten of the 42 subjects with a CDR rating of 0.0 at baseline were categorized as questionable after 3 years of follow-up.)
Questionable group: These subjects met the criteria for questionable AD at baseline (CDR rating, 0.5) and still had a CDR rating of 0.5 after 3 years of follow-up (n=91). This group represented 74% of the subjects with questionable AD.
Converter group: These subjects met the CDR criteria for questionable AD at baseline but progressed to the point where they had a CDR rating of 1.0 within 3 years of follow-up and met the NINCDS/ADRDA (National Institute of Neurological and Communicative Disorders and Stroke/Alzheimer's Disease and Related Disorders Association) criteria for probable AD¹⁰ (n=23). (The annual medical, neurologic, psychiatric, and laboratory evaluation was augmented as needed to assure that the subjects met these criteria.) This group represented 19% of the subjects with questionable AD. The subjects who converted to AD at follow-up were slightly older than those who did not (73.0 and 71.9 years, respectively). The proportion of men also was slightly greater among those who were diagnosed with AD at follow-up compared with those who remained questionable (52% [n=12] vs 48% [n=11]). Neither of these group differences was statistically significant (P=.62 and P=.32, respectively). Two subjects who converted to AD died, and an autopsy was performed on one of them, which confirmed a diagnosis of definite AD.¹¹
Non-AD group: These subjects were categorized as questionable at baseline and had dementia at follow-up but did not meet the clinical research criteria for probable AD (n=3). These 3 individuals had strokes.
Fluctuater group: These subjects were categorized as questionable AD at baseline but subsequently met the criteria for normal cognition (n=6) or AD (n=10). Other studies⁴,⁵ that have followed up subjects with questionable AD have likewise found a small number of individuals who appear to fluctuate in their functional status.
A semistructured interview was used to evaluate the subjects at baseline and at each annual evaluation. The interview was based on the Initial Subject Protocol.¹ It includes a brief neurologic, psychiatric, and neuropsychologic examination, in addition to a semistructured set of questions regarding functional status. Each interview, which was administered annually by a skilled clinician (eg, psychiatrist, neuropsychologist, physician assistant), was videotaped and took approximately 1½ to 2 hours to complete.
During the initial phase of the study, it became clear that the interview needed to be adapted for use with a population with very mild impairments. This was because the original interview was devised to rate subjects who spanned a very broad range of cognitive function, from no impairment to severe impairment.¹² As the subjects in the present study were either normal or very mildly impaired, many of the existing questions needed to be geared to the possibility that no impairments were present, or if present, very mild. In addition, to improve the reliability of the ratings among the interviewers, it was necessary to delineate how specific responses to questions were to be coded according to the CDR rating system.
The primary questions in the interview were divided into 32 functional groups based on their content (eg, missing appointments, repeating questions or stories, trouble handling a checkbook). A large number of descriptors and specific questions about changes in behavior then were developed, with a focus on those that are common in very mildly impaired individuals. Finally, criteria were established so that CDR ratings could be assigned within the 32 functional areas. The questions that were analyzed in the present study represent the primary question from each of these 32 functional domains.
A consensus conference was held among the members of the research team whenever there was evidence of the following: (1) the subject's ratings crossed CDR categories (eg, the overall CDR rating went from 0.0 to 0.5 or from 0.5 to 1.0), (2) the functional difficulty of the subject decreased, or (3) the functional difficulty of the subject increased substantially. The goal of the consensus conference was to review the symptoms within each CDR category and to assure that the collateral sources of information were optimal and that the final rating adhered as closely as possible to the CDR criteria.
Once the interview was completed and rated, the subjects in the study were administered other study procedures, which included a neuropsychological battery, a magnetic resonance imaging scan, and a single photon emission computed tomographic scan.¹³ The ratings of the interview were completed with the interviewers blinded to the results of other study procedures. These procedures were repeated for a subset of the participants in subsequent years. Only the interview was performed annually. The neuropsychological battery at baseline, which will be briefly described herein, consisted of 17 tests.
The mean neuropsychological test scores of the normal, questionable, and converter groups are presented herein to permit comparison with studies that use neuropsychological tests as one of the selection criteria for the category of questionable AD or mild cognitive impairment.¹⁴ Such studies generally select individuals with a memory performance that is 1.5 SDs from the mean of normal controls. Therefore, we examined the mean performance of the normal, questionable, and converter groups in our study at baseline to determine if they differed by more than 1.5 SDs from the mean of the normal group. Of the 17 tests, only 3 distinguished between the normal and the converter groups at baseline, using the 1.5-SD cutoff. The total learning score on the California Verbal Learning Test¹⁵ and the total score on the Self-Ordering Test¹⁶ differed by 1.5 SDs between the groups. Time to completion on part B of the Trail Making Test¹⁷ differed by more than 2 SDs between the normal and the converter groups. When the normal and the questionable groups were compared with one another, only the total learning score on the California Verbal Learning Test differed by 1 SD between the groups at baseline. No test score differed by more than 1.5 SDs between the normal and the questionable groups at baseline. The test scores that differed between the groups are presented in Table 1.
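The 1.5-SD comparison described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names and the example numbers are hypothetical and are not taken from Table 1, and treating a difference of exactly 1.5 SDs as meeting the cutoff is an assumption.

```python
def sd_units_from_normal(group_mean, normal_mean, normal_sd):
    """Return how many normal-group SDs separate a group mean from the normal mean."""
    if normal_sd <= 0:
        raise ValueError("normal_sd must be positive")
    return abs(group_mean - normal_mean) / normal_sd


def exceeds_cutoff(group_mean, normal_mean, normal_sd, cutoff=1.5):
    """True when the group mean lies at least `cutoff` SDs from the normal mean.

    Using >= (rather than >) is an assumption made for this sketch.
    """
    return sd_units_from_normal(group_mean, normal_mean, normal_sd) >= cutoff


# Hypothetical test scores (illustrative only, not study data).
normal_mean, normal_sd = 50.0, 8.0
converter_mean = 36.0
print(sd_units_from_normal(converter_mean, normal_mean, normal_sd))  # 1.75
print(exceeds_cutoff(converter_mean, normal_mean, normal_sd))        # True
```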
The reliability of the ratings from the revised semistructured interview was evaluated. Videotaped interviews of 10 subjects were rated by 3 independent interviewers. One third of the subjects whose interviews were evaluated were normal controls (CDR rating, 0.0) and two thirds were subjects with questionable AD (CDR rating, 0.5). The mean intercorrelation coefficient of the overall CDR rating was high (r²=0.99; P<.001).
The reliability of the ratings within each of the 6 CDR subcategories also was examined. The Intraclass Correlation Coefficient-2 (ICC-2) of the 6 individual categories was as follows: Memory, 0.99 (P<.001); Orientation, 0.86 (P<.001); Judgment and Problem Solving, 0.95 (P<.001); Community Affairs, 0.76 (P<.05); and Home and Hobbies, 0.95 (P<.001). Only 2 subjects received a CDR rating of greater than 0.0 (ie, impaired function) for Personal Care; thus, there was no variance among the Personal Care ratings, and the ICC-2 was not calculated. The mean reliability of the ratings was high (r²=0.90).
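For readers unfamiliar with intraclass correlations, the sketch below shows one common form, the two-way random-effects, single-rater coefficient often written ICC(2,1). Whether this is the exact variant the authors meant by "ICC-2" is an assumption, and the example ratings are hypothetical. The function refuses to return a value when the ratings have no variance, which mirrors the reason no coefficient was calculated for Personal Care.

```python
def icc_2_1(ratings):
    """Two-way random-effects, single-rater ICC, i.e. ICC(2,1).

    `ratings` is a list of rows, one per subject, each holding one score per rater.
    """
    n = len(ratings)      # number of subjects
    k = len(ratings[0])   # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a rectangular table with >= 2 subjects and >= 2 raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects (rows), raters (columns), residual.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    denom = ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    if denom == 0:
        raise ValueError("no variance in ratings; ICC is undefined")
    return (ms_rows - ms_err) / denom


# Hypothetical box ratings: 4 subjects (rows) scored by 3 raters (columns).
example = [
    [0.0, 0.0, 0.0],
    [0.5, 0.5, 0.5],
    [0.5, 0.5, 0.0],
    [1.0, 1.0, 1.0],
]
print(round(icc_2_1(example), 3))
```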
The ratings across the 6 CDR categories then were added together to create a summary measure, known as the Total Box Score. (This variable has been called the Sum of Boxes by the developers of the CDR scale.) The pattern of change in Total Box Score during the 3-year follow-up then was examined among the individuals who were questionable at baseline (Figure 1).
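As a small worked example of this summary measure, the sketch below adds the 6 category ratings for one subject. The category names come from the text; the dictionary layout and the example ratings are hypothetical.

```python
CDR_CATEGORIES = (
    "Memory",
    "Orientation",
    "Judgment and Problem Solving",
    "Community Affairs",
    "Home and Hobbies",
    "Personal Care",
)


def total_box_score(box_ratings):
    """Sum the ratings of the 6 CDR categories into a Total Box Score."""
    missing = [c for c in CDR_CATEGORIES if c not in box_ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    return sum(box_ratings[c] for c in CDR_CATEGORIES)


# Hypothetical subject with mild difficulty in 4 of the 6 categories.
subject = {
    "Memory": 0.5,
    "Orientation": 0.0,
    "Judgment and Problem Solving": 0.5,
    "Community Affairs": 0.5,
    "Home and Hobbies": 0.5,
    "Personal Care": 0.0,
}
print(total_box_score(subject))  # 2.0
```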
Based on the Total Box Score, the participants had a variety of trajectories on follow-up. Fifteen percent (n=19) of the subjects improved, as indicated by a decline in Total Box Score. These included the 6 subjects in the group of fluctuaters whose overall CDR rating went from 0.5 to 0.0. An additional 13 subjects across the range of the category of questionable also showed small declines in Total Box Score over time. Twenty-nine percent of the subjects (n=36) had the same Total Box Score at the beginning and end of the follow-up. These included individuals who spanned a broad range of possible Total Box Scores (0.5-2.5). Fifty-five percent of the subjects (n=68) showed progression of functional impairment as reflected by an increase in Total Box Score during the follow-up, including those who converted to AD.
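The three trajectories can be expressed as a simple comparison of baseline and final Total Box Score, and the percentages follow directly from the reported counts. The sketch below is illustrative only; the function names are hypothetical, while the counts (19, 36, and 68 of 123) are those given in the text.

```python
def classify_trajectory(baseline_tbs, followup_tbs):
    """Label the change in Total Box Score between baseline and final follow-up."""
    if followup_tbs < baseline_tbs:
        return "improved"
    if followup_tbs > baseline_tbs:
        return "progressed"
    return "unchanged"


def percent_of_total(counts):
    """Convert group counts to whole-number percentages of the overall total."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


# Counts reported in the text for the 123 subjects who were questionable at baseline.
reported = {"improved": 19, "unchanged": 36, "progressed": 68}
print(percent_of_total(reported))     # {'improved': 15, 'unchanged': 29, 'progressed': 55}
print(classify_trajectory(1.5, 2.0))  # progressed
```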
The proportion of individuals who converted to AD at follow-up, based on their Total Box Score at baseline, then was calculated by dividing the number of subjects who converted to AD at each Total Box Score level at baseline by the total number of individuals at that same score level at baseline. The Total Box Score at baseline was 0.5 to 3.0 in the questionable group (n=123); the higher the score, the greater the overall impairment in the individual. As shown below, the proportion "likely to convert" was highest among those individuals with a Total Box Score of 3.0 or higher at baseline (67%). However, among those individuals with a Total Box Score of 0.5 or 1.0, about 10% were likely to develop probable AD within the follow-up interval.
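The calculation described in this paragraph (converters at a given baseline score divided by all subjects at that score) can be sketched as below. The records are made up for illustration and do not reproduce the study's per-score counts; the function name is hypothetical.

```python
from collections import defaultdict


def conversion_rate_by_score(subjects):
    """Return {baseline Total Box Score: proportion who converted to AD}.

    `subjects` is an iterable of (baseline_total_box_score, converted) pairs.
    """
    totals = defaultdict(int)
    converters = defaultdict(int)
    for score, converted in subjects:
        totals[score] += 1
        if converted:
            converters[score] += 1
    return {score: converters[score] / totals[score] for score in sorted(totals)}


# Made-up records: (baseline Total Box Score, converted to AD at follow-up?).
records = [
    (0.5, False), (0.5, False),
    (1.0, False), (1.0, True),
    (2.0, True), (2.0, False),
    (3.0, True), (3.0, True),
]
print(conversion_rate_by_score(records))  # {0.5: 0.0, 1.0: 0.5, 2.0: 0.5, 3.0: 1.0}
```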
The likelihood of conversion to AD during the 3 years differed significantly between subjects on either side of a baseline Total Box Score cutoff of 2.0 (χ²=42.8; P<.001).
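A comparison of this kind is typically a chi-square test on a 2 x 2 table (score group by conversion status). The source does not report the cell counts, so the table below is hypothetical and will not reproduce the published statistic. The sketch uses the Pearson statistic without continuity correction, which is also an assumption about the method.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a nonzero total")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the chi-square tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts (not study data):
#                      converted   did not convert
# high baseline score      10             5
# low baseline score        5            30
stat, p = chi_square_2x2(10, 5, 5, 30)
print(round(stat, 2), p < 0.001)
```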
The ratings for specific questions within the semistructured interview at baseline also were examined to determine which questions at baseline were most predictive of status at follow-up. The rating for the primary question within each of the 32 functional groupings of questions was analyzed by analysis of variance to determine whether the response to each of the 32 questions differentiated among the normal, the questionable, and the converter groups. If an overall difference emerged that was significant at P≤.001 (based on analysis of variance), this was followed by post hoc t tests, using the Scheffé test for correction for multiple comparisons.¹⁸ The non-AD and fluctuater groups were omitted from this analysis, since they represented a small number of subjects whose status, with respect to the question of interest, was unclear. Of the 32 questions examined, only 8 questions differed significantly between the normal, questionable, and converter groups. These 8 questions are delineated below.
Judgment and Problem Solving:
Does the subject have increased difficulty handling problems (eg, an increased reliance on others to help solve problems or make plans)?
Is there a change in the pattern of driving not secondary to visual difficulty (eg, increased cautiousness, trouble making decisions)?
Is the subject's judgment as good as before or is there a change?
Is the subject having increased difficulty managing finances (eg, maintaining a checkbook, making complicated financial decisions, paying bills)?
Does the subject have more difficulty handling emergencies (eg, makes unsafe decisions, needs increased cueing)?
Home and Hobbies:
Is the subject having increased difficulty performing household tasks (eg, cooking, learning how to use new appliances)?
Has there been any change in the subject's ability to perform hobbies (eg, decreased participation in complex hobbies, increased difficulty following rules of games, reading less or needing to reread more)?
Personal Care:
Does the subject now need prompting to shave or shower?
Five of the questions that significantly discriminated the 3 groups from one another pertained to the category of Judgment and Problem Solving, 2 to Home and Hobbies, and 1 to Personal Care. None of the questions that differentiated the 3 groups from one another pertained to the category of Memory, Orientation, or Community Affairs. For example, questions from the category of Memory significantly differentiated the normal from the questionable group (P<.001) and the normal from the converter group (P<.001) but not the questionable from the converter group (P=.08).
A discriminant function analysis was performed to determine whether a combination of the 8 discriminating questions (described above) could predict with significant accuracy which subjects with questionable AD would progress to develop AD. For this analysis, the score on each of the 8 questions that significantly differentiated the groups was entered into the equation. The overall analysis was highly statistically significant (χ²=69.9; P<.001) but the accuracy with which individual participants were categorized among the 3 groups was modest (ie, 58.3%). An examination of the classification results indicated that most of the errors pertained to differentiating the normal and the questionable groups.
A second discriminant function then was performed comparing only the questionable and the converter groups. This comparison had an accuracy of 79.8% (questionable group, 74/91; converter group, 17/23) (χ²=38.1; P<.001). The questions that contributed most to this discrimination pertained to the categories of Judgment and Problem Solving (eg, difficulty handling problems) and Home and Hobbies (eg, difficulty cooking, learning how to use new appliances). The CDR Total Box Score at baseline then was added to this analysis. This addition increased the accuracy of identification to 88.6% (questionable group, 82/91; converter group, 19/23) and represented a statistically significant improvement in accuracy (F=6.38; P<.001). The Total Box Score at baseline contributed more heavily than any of the other variables to the overall accuracy of the discrimination (F=43.1; P<.001).
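The reported accuracies follow from the classification counts given in parentheses. The short sketch below reproduces that arithmetic; the function name and the per-group breakdown it returns are illustrative additions, while the counts are those stated in the text.

```python
def classification_summary(correct_questionable, n_questionable,
                           correct_converter, n_converter):
    """Overall and per-group proportions correctly classified."""
    total = n_questionable + n_converter
    return {
        "accuracy": (correct_questionable + correct_converter) / total,
        "questionable_correct": correct_questionable / n_questionable,
        "converter_correct": correct_converter / n_converter,
    }


# Counts from the text: 8 questions alone, then 8 questions plus Total Box Score.
questions_only = classification_summary(74, 91, 17, 23)
with_box_score = classification_summary(82, 91, 19, 23)

print(round(100 * questions_only["accuracy"], 1))  # 79.8
print(round(100 * with_box_score["accuracy"], 1))  # 88.6
```

In the combined model, the same counts imply that 19 of the 23 converters and 82 of the 91 subjects who remained questionable were classified correctly.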
In this study, we recruited individuals from the general population with recent and progressive evidence of memory complaints who met the criteria for questionable AD. After a 3-year follow-up, 15% improved, 29% remained the same, and 55% had progressive difficulty with memory, including 19% who converted to AD. We have demonstrated that it is possible to identify, with a considerable degree of accuracy, the subset of individuals among those who are categorized as questionable who will convert to AD during 3 years of follow-up. The CDR Total Box Score at baseline, combined with 8 selected questions from the clinical interview at baseline, identified 88.6% of such individuals accurately in the present study. Moreover, this accuracy was based on a semistructured evaluation that can be applied by a skilled clinician in any clinical setting.
Furthermore, we demonstrated that the likelihood of subjects with questionable AD converting to AD during a 3-year follow-up was strongly related to the level of functional difficulty at baseline, as evaluated by the CDR Total Box Score. This variability in outcome, based on the level of functional difficulty at baseline, in all likelihood accounts for the different rates of conversion to AD that have been reported in the literature. One possible source for differences in baseline functional level across studies may be differences in the procedures for subject selection. In our study, subjects were recruited via the media; referrals from a clinic or other medical source were minimal, and subjects were screened to remove individuals with diseases that could be contributing to memory decline. In contrast, some previously published studies recruited subjects who were examined in a memory disorders clinic and had cognitive problems but did not meet the criteria for dementia³ or subjects who were referred by a health professional because of symptomatic memory problems that affected their daily functioning.⁴,⁵ Some studies³,⁵ also added a cognitive testing criterion based on comparison with selected norms. It seems likely that these selection procedures yielded a large number of individuals in the upper end of the range of the questionable group and thus yielded a higher number of individuals who converted to AD during the follow-up interval than was observed in the present study.
Differences in the way in which the criteria for questionable AD or probable AD are applied may likewise lead to differences in the populations under study. The NINCDS/ADRDA criteria for probable AD require a significant decline in social and occupational function, a decline in memory and at least one other cognitive domain, and an absence of significant medical illness that could cause cognitive decline. In the present study, results of a standardized physical, neurologic, and psychiatric evaluation, in combination with laboratory findings, were used to rule out significant disease that could cause cognitive decline at baseline. Function in daily life (eg, Home and Hobbies, Community Affairs) and in the various cognitive domains (Memory, Orientation, Judgment and Problem Solving, and Language) was assessed by a semistructured interview. The subjects who met the NINCDS/ADRDA criteria for mild AD in the present study generally had a CDR rating of 1.0 (indicating mild impairment) in the areas related to social and occupational function and in areas related to memory and judgment and problem solving, consistent with a diagnosis of probable AD. Although the categorization of the subjects at baseline did not incorporate the neuropsychological test results from the same period, the cognitive test results indicate that the converter group differed by 1.5 SDs from the normal group at baseline on 3 of the 17 tests, only one of which was in the area of memory, suggesting comparable levels of impairment between the converter group in the present study and the questionable group in studies that used neuropsychological test results as one of the criteria for selection (M.A., unpublished data, July 1999). Thus, it appears likely that the subjects who converted to AD in the present study are comparable with the subjects with questionable AD or mild cognitive impairment in studies that have selected such individuals in a different manner.
Those who were placed in the questionable group at baseline and follow-up also are of some interest. These subjects did not differ by 1.5 SDs from the normal group on any of the 17 neuropsychological tests in the battery. This suggests that they had lower levels of cognitive impairment than is typical of subjects with questionable AD in similar investigations, which is consistent with the lower rate of conversion seen in the present study.
During subsequent years of follow-up, we anticipate that more subjects will progress to the point where they meet the criteria for probable AD. Therefore, it is unknown whether eventually all of the subjects who met the criteria for questionable AD at the beginning of the study will convert to AD over time. However, the present data suggest that this will not be the case. For example, a substantial number of individuals either had no change or slightly improved. This finding suggests that some individuals who were categorized as questionable had changes in memory that were not indicative of disease and thus could be considered within the normal range of memory difficulty.
This possibility also is suggested by the types of questions that differentiate subjects with questionable AD who progress to meet the criteria for AD over time from those who do not. For example, none of the questions from the Memory category were useful in differentiating the questionable from the converter groups, suggesting that the types of problems covered by these questions (eg, forgetting appointments, increased use of lists or calendars, increased difficulty with names) may be common among individuals who are going to develop AD and among those who are not going to develop progressive cognitive decline.
However, other types of questions relating to memory difficulty do appear to foreshadow subsequent progressive decline. They pertain to skills that require memory for moment-to-moment events and planning and integration of information. For example, the 2 discriminating questions from the category of Home and Hobbies reflected difficulty with household tasks or hobbies that were previously learned and tasks that require accurate memory for specific steps and moment-to-moment tracking and integration of tasks, all of which need to be completed within a specific time (eg, cooking).
The results of the present study have both practical and theoretical significance. At the practical level, these findings provide guidelines for using a semistructured clinical assessment to identify the subgroup of individuals within the category of questionable AD who have a high likelihood of converting to AD over time. This should help guide recommendations to such individuals concerning currently available medications and the design of intervention trials, which will need an enriched population of those who will convert to AD within a few years. At the theoretical level, these findings emphasize the increasing importance of understanding the boundary between normal aging and the earliest stage of AD.
Accepted for publication June 16, 1999.
This work was supported by grant P01-AG04953 from the National Institute on Aging, Bethesda, Md.
The authors thank Mary Hyde, PhD, for assistance with computer programming and Ken Jones, PhD, John Stein Professor at Brandeis University, for assistance with the data analysis.
Reprints: Marilyn S. Albert, PhD, Massachusetts General Hospital, East (149-9124), 149 13th St, Charlestown, MA 02129.