Association Between High School Personality Phenotype and Dementia 54 Years Later in Results From a National US Sample

This national US cohort study examines whether personality during adolescence—a time when preclinical dementia pathology is unlikely to be present—confers risk of dementia in later life and tests whether associations could be accounted for by health factors in adolescence or differed across socioeconomic status.

This supplementary material has been provided by the authors to give readers additional information about their work.

1A. Alzheimer's Disease and Related Disorders (ADRD) Algorithm
The following ICD-9 codes are used by the Centers for Medicare and Medicaid Services (CMS) to identify ADRD cases: 331

1B. Matching Procedures.
In 2016, American Institutes for Research submitted 199,994 unique Project Talent cases for matching with Medicare records and claims data. CMS uses Social Security Number or a combination of last name, date of birth, and sex to match cases to Medicare records. Social Security Numbers were collected during Project Talent follow-up data collections and were available for 137,396 unique cases. Given the limited number of variables used in the matching algorithm, multiple aliases were submitted for unique cases to maximize the match rate.
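The two-tier matching logic described above (Social Security Number first, then last name, date of birth, and sex, with several aliases tried per case) can be sketched as follows. This is only an illustration under stated assumptions: the record layout, the field names (`ssn`, `last_name`, `last_names`, `dob`, `sex`), and the functions `build_indexes` and `match_case` are hypothetical, and the actual CMS matching algorithm is not described in this supplement beyond the keys it uses.

```python
from typing import Dict, List, Optional, Tuple

def build_indexes(medicare_records: List[dict]) -> Tuple[Dict, Dict]:
    """Index hypothetical Medicare records by SSN and by (last name, DOB, sex)."""
    by_ssn, by_demo = {}, {}
    for rec in medicare_records:
        if rec.get("ssn"):
            by_ssn[rec["ssn"]] = rec
        key = (rec["last_name"].strip().upper(), rec["dob"], rec["sex"])
        by_demo[key] = rec
    return by_ssn, by_demo

def match_case(case: dict, by_ssn: Dict, by_demo: Dict) -> Optional[dict]:
    """Try SSN first; otherwise try every submitted last-name alias."""
    ssn = case.get("ssn")
    if ssn and ssn in by_ssn:
        return by_ssn[ssn]
    for alias in case["last_names"]:  # e.g., maiden and married names
        key = (alias.strip().upper(), case["dob"], case["sex"])
        if key in by_demo:
            return by_demo[key]
    return None

# Entirely made-up records, for illustration only.
medicare = [
    {"ssn": "000-00-0001", "last_name": "Smith", "dob": "1944-05-01", "sex": "F"},
    {"ssn": "000-00-0002", "last_name": "Jones", "dob": "1945-02-11", "sex": "M"},
]
by_ssn, by_demo = build_indexes(medicare)

case_with_ssn = {"ssn": "000-00-0002", "last_names": ["Jones"],
                 "dob": "1945-02-11", "sex": "M"}
case_with_alias = {"ssn": None, "last_names": ["Brown", "Smith"],
                   "dob": "1944-05-01", "sex": "F"}
case_unmatched = {"ssn": None, "last_names": ["Doe"],
                  "dob": "1943-01-01", "sex": "M"}

matched_ssn = match_case(case_with_ssn, by_ssn, by_demo)
matched_alias = match_case(case_with_alias, by_ssn, by_demo)
matched_none = match_case(case_unmatched, by_ssn, by_demo)
```

Because the demographic key uses only three fields, false matches are possible, which is consistent with the manual review of returned matches described in section 1C.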

1C. Matching Results.
CMS returned matches for 145,183 individuals. All matches were reviewed for accuracy by comparing name, address, and demographics in Project Talent records to those provided in the matched Medicare record and 99% were accepted as true matches. American Institutes for Research rejected 1,485 matched cases due to incongruent information across the Project Talent data files and the Medicare records.
Returned data include Medicare Parts A, B, and C data. Parts A and B involve fee-for-service data for which reporting to CMS is mandatory. Part C data are reported only voluntarily and hence are often missing. For this reason, the Research Data Assistance Center recommends that Part C data not be used in analyses [1]. Of the 143,698 accepted matched cases, 82,232 had Part A/B data for all of 2012 and 2013 and complete Project Talent Base Year data for the variables of interest, and thus were included in the present analysis. In comparing those with Parts A and B vs. Part C data, no differences in personality factors exceeding 0.1 standard deviation (SD) were noted. Part C individuals were roughly 3 percentage points more likely to be female (53.2% vs. 49.9%).

1D. Differences Between Medicare Analytic Sample and Others.
We next examined whether individuals in the analytic sample (Medicare Subsample) differed from those excluded from the sample (Non-Medicare) on demographic and personality factors. Any differences could be due to a wide range of factors, including not being in the sampling frame, death before the beginning of the 3-year index period prior to 2011, or not having Medicare Part A / B data.

2C: Personality Factor Analysis
As a whole, the marginal distributions of the PTPI scales were symmetric (median skew .08, range -.23 to .9) and somewhat leptokurtic (median kurtosis 2.27, range 1.85 to 3.68), and their joint distribution was approximated by a multivariate normal density (correlations are shown in eTable 3). The correlation matrix was subjected to an exploratory factor analysis to determine whether one or more broader factors accounted for the intercorrelations. The first and second unrotated eigenvalues were 4.04 and 0.36, respectively, indicating a dominant first factor. This factor appeared to correspond to the so-called "general personality factor" (GPF), reflecting a collection of related adaptive personality characteristics forming a general dimension. eTable 3 below shows the loadings of the PTPI scales on this factor. Factor scores were estimated via Bartlett's method and used as a regressor with the same set of covariates as in the other models. eTable 4 shows the full regression model of its associations with dementia. Coefficients reflect the association of a 1 SD change in the GPF, and the model includes the interaction with SES. The plot of the association between the GPF and dementia risk at -1 SD, mean levels, and +1 SD of SES is shown in eFigure 1.
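The two steps described above, checking for a dominant first eigenvalue and computing Bartlett factor scores for a single factor, can be illustrated with a minimal sketch. Everything numeric here is an assumption for illustration: the four-variable correlation matrix is built from made-up loadings rather than from the PTPI data, and the loadings are taken from the first principal component as a simplification, which may differ from the extraction method the authors used.

```python
import numpy as np

# Hypothetical one-factor correlation matrix (NOT the PTPI correlations).
true_loadings = np.array([0.8, 0.7, 0.6, 0.5])
R = np.outer(true_loadings, true_loadings)
np.fill_diagonal(R, 1.0)

# Eigen-decomposition; eigh returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Loadings on the first factor (sign chosen so that loadings are positive).
v1 = eigvecs[:, 0]
if v1.sum() < 0:
    v1 = -v1
loadings = np.sqrt(eigvals[0]) * v1
uniquenesses = 1.0 - loadings ** 2

# Bartlett weights for one factor: (l' Psi^-1 l)^-1 l' Psi^-1
weights = (loadings / uniquenesses) / np.sum(loadings ** 2 / uniquenesses)

# Factor scores for standardized observations (one made-up row here).
z = np.array([[1.0, 0.5, 0.0, -0.5]])
scores = z @ weights

print("eigenvalues:", np.round(eigvals, 3))
print("Bartlett weights:", np.round(weights, 3))
```

Bartlett weights downweight scales with larger unique variance, and by construction they satisfy `weights @ loadings == 1`, which is what makes the resulting scores conditionally unbiased for the factor.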

2D: Associations with CMS Alzheimer's Disease Outcome
In addition to the primary outcome, based on the CMS algorithm for Alzheimer's Disease and Related Disorders, we also examined as a secondary outcome the CMS algorithm for Alzheimer's Disease only. This involved receiving an ICD-9 diagnosis of 331.0 on a claim over the three-year index period. A total of 865 cases met this criterion. Results from regression models are presented below in eTable 5:

3A: Selection Models
The Medicare subsample may differ in race and SES from the 1960 baseline probability sample due to early mortality, differential use of Part A or B data (i.e., Part C only), or many other factors that might or might not affect the observed associations. To examine this, we compared relative risk regression models (i.e., Poisson distribution with robust variance estimate [2]) in the Medicare sample with Heckman selection models in which the selection equation is a probit model and the main regression takes the same relative risk specification. Covariate control is identical to the main models, and the selection equation for each model contains SES, race, and the personality trait appearing in the main equation.
Results for the personality traits are presented in the following table, in the form of log relative risks, standard errors, and 95% confidence intervals in the two different types of models. The estimates change minimally. Thus, differences between the analytic sample and the broader baseline sample, which primarily involve SES and race, do not appear to lead to substantially different results when considered in selection models.
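For readers unfamiliar with the relative risk specification referred to above, the following is a minimal sketch of a Poisson log-linear model fitted to a binary outcome with a robust (sandwich) variance estimate. It covers only the relative risk regression, not the Heckman selection step. The function `poisson_rr_robust` and the 2-by-2 example data are hypothetical and are not taken from the study.

```python
import numpy as np

def poisson_rr_robust(X, y, n_iter=50, tol=1e-10):
    """Poisson log-link fit by Newton's method with an HC0 sandwich variance.

    X must contain an intercept column first; y is a 0/1 outcome.
    Returns (coefficients, robust standard errors).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())              # start at the marginal risk
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        score = X.T @ (y - mu)
        info = X.T @ (X * mu[:, None])
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    mu = np.exp(X @ beta)
    bread = np.linalg.inv(X.T @ (X * mu[:, None]))
    meat = X.T @ (X * ((y - mu) ** 2)[:, None])
    cov = bread @ meat @ bread               # robust covariance matrix
    return beta, np.sqrt(np.diag(cov))

# Made-up data: 20/100 events among exposed, 10/100 among unexposed.
exposed = np.array([1] * 100 + [0] * 100)
events = np.array([1] * 20 + [0] * 80 + [1] * 10 + [0] * 90)
X = np.column_stack([np.ones_like(exposed), exposed])

beta, robust_se = poisson_rr_robust(X, events)
rr = float(np.exp(beta[1]))
print("relative risk:", round(rr, 3), "robust SE of log RR:", round(float(robust_se[1]), 4))
```

In this saturated two-group case the exponentiated slope equals the ratio of the two observed risks, and the sandwich standard error coincides with the familiar delta-method standard error of a log risk ratio, which is smaller than the naive Poisson standard error because a binary outcome has less variance than a Poisson count.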

3B: Diagnostic Sensitivity Simulation
We conducted Monte Carlo sensitivity analyses for misclassification [3] to probe the potential impact of changes in the sensitivity of the CMS diagnostic algorithm for ADRD, following a recent paper with a similar study design [4]. These analyses simulated relative risk regression models with the same sample size and event rate observed in the data, and used a relative risk estimate of .9 for an exposure with a standard normal distribution, corresponding to the covariate-adjusted relative risk for calm and maturity at +1 SD SES.
Non-differential misclassification would be expected to produce little change in point estimates, but to increase standard errors. For this simulation, we drew sensitivity estimates from a uniform distribution varying ±5% around the published sensitivity estimate of .86. As in prior work [4], specificity was held fixed to prevent epidemiologically implausible changes in the rare event rate (for example, a 3% dementia prevalence at age 70 exploding to 30% with 10% variations in specificity). A total of 1000 simulations were run for this condition, and eFigure 2 shows the results.
As can be seen from eFigure 2, the impact of non-differential misclassification is primarily to increase the standard error of the estimate. For the purposes of inference, at the lowest examined sensitivity (.81), the highest standard error of .022 is observed, yielding a z-statistic of -.1054/.022 = -4.79 and a p-value below .001.
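The inference arithmetic above can be reproduced with a few lines of code. The sketch below also includes a rough rule of thumb, stated here as an assumption rather than as the authors' method: with specificity held fixed at 1 and a rare outcome, non-differential sensitivity scales the number of detected cases by the sensitivity, so the standard error of a log relative risk grows by approximately one over the square root of the sensitivity. The function names are hypothetical.

```python
import math

def z_and_p(log_rr, se):
    """Wald z-statistic and two-sided normal p-value for a log relative risk."""
    z = log_rr / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

def se_inflation(sensitivity):
    """Approximate SE inflation relative to perfect sensitivity (rare outcome,
    perfect specificity assumed): cases shrink by `sensitivity`, SE ~ 1/sqrt(cases)."""
    return 1.0 / math.sqrt(sensitivity)

# Values taken from the text: relative risk of .9 and a standard error of .022.
z, p = z_and_p(math.log(0.9), 0.022)
print("z =", round(z, 2), " p < .001:", p < 0.001)

# Relative SE inflation across the sensitivity range examined (.81 to .91).
for s in (0.81, 0.86, 0.91):
    print("sensitivity", s, "-> SE inflation", round(se_inflation(s), 3))
```

Under this approximation the point estimate is unaffected, because a common sensitivity multiplies the risk at every exposure level by the same constant and therefore cancels out of the relative risk.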
Next, we examined the potential impact of differential misclassification according to levels of calm or maturity in high school. There is no definitive way to know what form this sort of differential sensitivity would take, since several scenarios can be envisioned. For instance, students who were less calm in high school may be more likely to experience health anxiety or other problems causing them to come to the attention of the medical system, where a dementia might be diagnosed. The opposite could occur if people who experienced health anxiety were more likely to avoid the medical system as a form of coping. Similarly, more mature high school students might exercise greater responsibility in pursuing regular health care later in life, where a dementia could be detected. Alternatively, a high level of responsibility might lead them into caregiving roles for others, causing them to neglect their own health care needs in favor of those of their charges.
To encompass this range of possibilities, we examined the impact of a sensitivity drawn from a uniform distribution lower by up to 5% in those in the bottom half of a trait (i.e., sensitivities of .81-.86) while simultaneously higher by the same amount in those in the top half of the trait (i.e., sensitivities of .86-.91). Thus, up to a 10% range in sensitivity was again assessed, but this time it varied systematically according to the trait.
Results indicated that if sensitivity is assumed to be systematically higher in those scoring high on a protective trait such as calm and maturity, the protective association is overestimated. This is illustrated in the top portion of eFigure 3, which shows the log relative risk attenuating as sensitivity is assumed to be increasingly greater among those high vs. low in a protective trait. At the maximum discrepancy of 10% (for instance, a sensitivity of .81 among those low in calm and .91 among those high in calm), the protective association is diminished to a relative risk of .94.
The bottom portion of eFigure 3 shows the impact on the relative risk if the opposite type of misclassification is assumed to occur, that is, if sensitivity is higher among those lower in calm or maturity. In this case, the observed relative risk of .9 understates the true protective association. At a maximum difference of 10% (for instance, a sensitivity of .91 among those low in calm and .81 among those high in calm), the true protective association is actually greater, with a relative risk of approximately .87. In both scenarios the standard error is not systematically affected. Thus, the simulation results for differential sensitivity by levels of high school personality reveal that, depending on the assumptions, the observed relative risks of roughly .9 for calm and maturity at higher SES may be either underestimates or overestimates. The direction depends on whether one believes persons who were calm and mature as adolescents in 1960 are more or less likely to seek care leading to ADRD diagnoses in the index period.

eFigure 3. Impact of Differential Misclassification on Log Relative Risks
eFigure 3 caption: Results of 1000 simulations of the impact of sensitivity varying systematically by level of personality trait on the log relative risk corresponding to that seen for calm and maturity at +1 SD SES, and on the standard error of that log relative risk. The top portion shows results from the condition in which sensitivity is higher among those high in the trait, while the bottom portion shows the results from the condition in which sensitivity is higher among those low in the trait.
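The direction of these shifts can be checked without simulation by a large-sample calculation, sketched below under explicit assumptions that go beyond the text: specificity is perfect, the outcome is rare (so the detected risk is sensitivity times true risk), the exposure is standard normal, and sensitivity takes one value in the bottom half of the trait and another in the top half. Under those assumptions, the slope of a Poisson log-linear fit converges to E[x·μ(x)] / E[μ(x)], where μ(x) is the detected risk at exposure x. The function `large_sample_rr` is hypothetical and is not the authors' Monte Carlo procedure.

```python
import math

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def large_sample_rr(true_rr, se_low_half, se_high_half, grid=20001, lim=8.0):
    """Limiting relative risk per SD from a Poisson log-linear fit when
    sensitivity differs between the bottom and top halves of the trait."""
    beta = math.log(true_rr)
    step = 2.0 * lim / (grid - 1)
    num = den = 0.0
    for i in range(grid):
        x = -lim + i * step
        sens = se_high_half if x > 0 else se_low_half
        mu = sens * math.exp(beta * x)   # baseline risk cancels in the ratio
        w = normal_pdf(x) * step         # Riemann weight for a N(0, 1) exposure
        num += w * x * mu
        den += w * mu
    return math.exp(num / den)

rr_equal = large_sample_rr(0.9, 0.86, 0.86)        # non-differential
rr_high_detect = large_sample_rr(0.9, 0.81, 0.91)  # higher sensitivity in top half
rr_low_detect = large_sample_rr(0.9, 0.91, 0.81)   # higher sensitivity in bottom half

print(round(rr_equal, 3), round(rr_high_detect, 3), round(rr_low_detect, 3))
```

With equal sensitivities the calculation returns the true relative risk, and with the two differential patterns it moves the fitted relative risk toward the null or away from it, respectively, by amounts of the same order as those described for eFigure 3.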

3C: High School Personality Traits and Service Use in the Index Period
To further study this question, we examined whether any 1960 traits were associated with the total number of fee-for-service outpatient visits across 2012-2013 (the years for which claims data are available for this sample). These included outpatient visits to general practitioners, geriatricians, and family medicine physicians, as well as to specialists liable to render such diagnoses (psychiatrists and neurologists). Because the hypothetical process involves utilizing services and receiving a dementia diagnosis as a result, this analysis focused on those who had not yet received a diagnosis. After a diagnosis, service use would likely be elevated as a result of the diagnosis itself rather than of high school personality traits. Number of visits was modeled as a negative binomial outcome, to account for clustering within high utilizers, and the model employed the same set of covariates as the primary models for dementia. eTable 4 presents the results.