Blood Leukocyte Counts in Alzheimer Disease

Key Points
Question: What is the relevance of types of blood leukocytes in Alzheimer disease (AD)?
Findings: In this cohort study of 101 582 individuals in the Danish general population, low baseline blood monocyte counts were associated with increased AD risk. In a mendelian randomization framework using the most powerful genomic data sets, genetically determined low monocyte counts were also associated with increased AD risk, an association independent of other types of blood leukocytes.
Meaning: The findings of this study suggest that the observational and genetic association observed between low monocyte counts and increased risk of AD highlights a possible role of the innate immune system in AD pathogenesis.

specific proportional hazards models were fitted within each imputed dataset and subsequently pooled according to Rubin's rules to estimate hazard ratios (HRs) and corresponding 95% confidence intervals (CIs). The proportional hazards assumption was assessed visually with log(-log[survival]) versus log(analysis time) plots and tested using Schoenfeld residuals. No major violations were observed. We fitted two multivariable-adjusted regression models. In all models, age was used as the time scale (referred to as age adjustment in the following text): subjects entered the analysis at their baseline age (left truncation) and exited at their age at event, censoring, or death. Model 1 was adjusted for age and sex; model 2 was additionally adjusted for education, APOE genotype, BMI, smoking status, alcohol consumption, physical activity, hypertension, and type 2 diabetes mellitus. Results were similar to those reported when the analysis was restricted to individuals with complete data (N = 98,318).
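The pooling step can be illustrated with a minimal Python sketch of Rubin's rules applied to log hazard ratios. The function names and input values are hypothetical and are not taken from the study, and the confidence interval uses a normal approximation rather than the t-distribution with Rubin's degrees of freedom, which is a simplifying assumption.

```python
import math

def pool_rubin(estimates, variances):
    """Pool per-imputation estimates (e.g. log hazard ratios) with Rubin's rules.

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    q_bar = sum(estimates) / m                                # pooled point estimate
    within = sum(variances) / m                               # within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)  # between-imputation variance
    total = within + (1 + 1 / m) * between                    # total variance
    return q_bar, total

def hazard_ratio_ci(log_hr, total_var, z=1.96):
    """Back-transform a pooled log HR to an HR with a normal-approximation 95% CI."""
    se = math.sqrt(total_var)
    return math.exp(log_hr), math.exp(log_hr - z * se), math.exp(log_hr + z * se)

# Illustrative values only (not study data): three imputed datasets.
log_hr, var = pool_rubin([0.10, 0.20, 0.30], [0.01, 0.01, 0.01])
hr, lower, upper = hazard_ratio_ci(log_hr, var)
```

The between-imputation term is what widens the interval relative to a single complete-data analysis: the more the estimates disagree across imputed datasets, the larger the total variance.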
Leukocyte counts were categorized into 5 groups by percentiles, i.e., <5th, 5th-25th, 25th-75th (reference group), 75th-95th, and >95th, to facilitate exploration of the association of extreme leukocyte counts with risk of AD. The corresponding absolute leukocyte count for each cutoff point is shown in eTable 3 in the supplement. Moreover, on a linear scale, we generated restricted cubic splines (based on model 2) to illustrate the possible nonlinear dose-response relationship of different leukocyte counts with AD. Three knots located at the 5th, 50th, and 95th percentiles were selected, in line with the categorical groups, to provide an adequate fit of the model while avoiding overfitting.
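The percentile grouping described above can be sketched in Python with NumPy as follows. The function name is hypothetical, and the handling of values that fall exactly on a cutoff (assigned to the higher group here) is an assumption, since the text does not state it.

```python
import numpy as np

def percentile_groups(counts, cutoffs=(5, 25, 75, 95)):
    """Assign each leukocyte count to one of 5 percentile groups.

    Group labels: 0 = <5th, 1 = 5th-25th, 2 = 25th-75th (reference),
    3 = 75th-95th, 4 = >95th. Values equal to a cutoff fall in the
    higher group (an assumption of this sketch).
    """
    counts = np.asarray(counts, dtype=float)
    edges = np.percentile(counts, cutoffs)   # absolute counts at each cutoff
    groups = np.digitize(counts, edges)      # index of the interval each value falls in
    return groups, edges

# Illustrative input only: 100 evenly spaced values.
groups, edges = percentile_groups(np.arange(1, 101))
```

The returned `edges` correspond to the absolute cutoff values of the kind the supplement tabulates, and group 2 would serve as the reference category in the regression models.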

Mendelian randomization
Mendelian randomization (MR) uses genetic variants associated with the exposure of interest to investigate the causal effect of risk factors on outcomes. Genetic variants are passed from parents to offspring at conception, are randomly assorted and segregated at meiosis, and are used as proxies for exposure levels. As a result, individuals are divided into two comparable groups: those who carry the effect allele (e.g., an allele associated with an increased level of exposure) are assigned to the group with higher levels of the exposure of interest, whereas those who carry the alternative allele are assigned to the group with lower levels. Because genetic variants are randomly allocated, the comparison is not expected to be affected by confounding or reverse causation, and the difference in outcomes between genetically defined groups can be attributed to the exposure. Consequently, the MR design is considered a "natural experiment" that mimics a randomized clinical trial, in which individuals are randomized to carry a genetic variant. To obtain reliable estimates of the effect of an exposure on an outcome from MR studies, genetic variants should meet three principal requirements, namely that they 1) are associated with the exposure; 2) are not related to any observed or unobserved confounding factors; and 3) are associated with the outcome exclusively through their effect on the exposure.

Genetic instruments for types of leukocyte cell counts
Leukocyte counts from the Blood Cell Consortium (BCX) were generated by impedance-based electronic cell counters; for lymphocytes, monocytes, neutrophils, basophils, and eosinophils, cell counts were the relative counts derived from the total leukocyte count multiplied by the proportion for each cell type. Genetic association analyses were performed using a linear mixed-effects model with an additive genetic model to account for relatedness in each cohort. Covariates included in the regression models were age, age squared, sex, principal components, and study-specific factors such as study center. Single-nucleotide polymorphisms (SNPs) at the genome-wide significance level (P < 5 × 10⁻⁸) were initially selected as potential genetic instrumental variables. To minimize possible pleiotropic bias, we excluded SNPs that were associated with more than one type of leukocyte count. In addition, linkage disequilibrium (LD) between SNPs for the same exposure was assessed in the European 1000 Genomes Project reference panel. When LD was present (r² > 0.001), the variant with the smallest P value was retained. The total variation explained by the instrumental variables was calculated based on the retrieved summary statistics for each leukocyte count. We considered an F-statistic, calculated as (beta/se)², above 10 to indicate a sufficiently strong instrument for reliable statistical analyses.
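Part of the instrument selection above can be sketched in Python as follows: the genome-wide significance filter, the removal of SNPs shared between leukocyte types, and the F-statistic check. The record layout (`rsid`, `beta`, `se`, `pval`) and the function names are hypothetical, and LD clumping is omitted because it requires a reference panel.

```python
from collections import Counter

def f_statistic(beta, se):
    """Per-SNP instrument strength, calculated as (beta/se)^2."""
    return (beta / se) ** 2

def select_instruments(snps, p_threshold=5e-8, min_f=10.0):
    """Keep SNPs that are genome-wide significant and have F above min_f.

    snps: list of dicts with keys 'rsid', 'beta', 'se', 'pval'
    (a hypothetical layout for GWAS summary statistics).
    """
    return [
        s for s in snps
        if s["pval"] < p_threshold and f_statistic(s["beta"], s["se"]) > min_f
    ]

def drop_shared_snps(hits_by_trait):
    """Remove SNPs that are associated with more than one leukocyte type.

    hits_by_trait: dict mapping a trait name to a set of rsids.
    """
    counts = Counter(rsid for hits in hits_by_trait.values() for rsid in set(hits))
    return {
        trait: {rsid for rsid in hits if counts[rsid] == 1}
        for trait, hits in hits_by_trait.items()
    }

# Illustrative records only (not real SNPs).
example = [
    {"rsid": "rsA", "beta": 0.05, "se": 0.005, "pval": 1e-20},
    {"rsid": "rsB", "beta": 0.02, "se": 0.010, "pval": 0.045},
]
kept = select_instruments(example)
```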

Non-linear MR
We calculated a genetic risk score (GRS) for each participant, weighted by the associations of the genetic instrumental variables with the different types of leukocyte counts identified in the previous step. For nonlinear MR analyses, we first calculated the instrument-free exposure, i.e., the residual variation from regressing each continuous leukocyte count on its GRS. Subsequently, we stratified on the residual exposure to avoid overadjustment bias and collider bias [1]. We divided the residual exposure into 20 equal groups in steps of 5 percentiles (except for basophils and eosinophils, for which some percentiles had the same value), so that comparisons could be made between individuals in the population who would have cell counts in the same stratum if they had the same genotype. GRS-AD associations were calculated within each stratum using logistic regression adjusted for age, sex, genotyping batch, and the first 10 principal components. GRS-cell count associations were estimated using linear regression models in the whole population, under the assumption that this association is constant across strata. Within each stratum of the instrument-free exposure, the local average causal effect (LACE) of each type of leukocyte count on AD was estimated by dividing the GRS-AD association by the GRS-cell count association. We used the nlmr R package for the non-linear MR analyses (https://github.com/jrs95/nlmr) [2]. We adopted the piecewise linear method, in which a linear regression is fitted within each stratum of the instrument-free exposure. Confidence intervals were estimated through bootstrapping.

© 2022 Luo J et al. JAMA Network Open.
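The stratification logic of the non-linear MR analysis can be sketched in Python with NumPy as follows. This is not the nlmr package and does not reproduce its interface: the function names are hypothetical, the per-stratum GRS-AD associations are taken as given rather than fitted by covariate-adjusted logistic regression, and bootstrapped confidence intervals are omitted.

```python
import numpy as np

def residual_exposure(exposure, grs):
    """Instrument-free exposure: residuals from regressing the exposure on the GRS.

    Also returns the GRS-exposure slope estimated in the whole sample.
    """
    exposure = np.asarray(exposure, dtype=float)
    grs = np.asarray(grs, dtype=float)
    slope, intercept = np.polyfit(grs, exposure, 1)   # simple least-squares line
    return exposure - (intercept + slope * grs), slope

def assign_strata(residuals, n_strata=20):
    """Split the residual exposure into equal-sized quantile strata (0 .. n_strata-1)."""
    probs = np.linspace(0, 1, n_strata + 1)[1:-1]      # 5th, 10th, ..., 95th for 20 strata
    cuts = np.quantile(residuals, probs)
    return np.digitize(residuals, cuts)

def lace(beta_grs_outcome_by_stratum, beta_grs_exposure):
    """Local average causal effect per stratum: GRS-outcome / GRS-exposure association."""
    return np.asarray(beta_grs_outcome_by_stratum, dtype=float) / beta_grs_exposure

# Simulated data for illustration only.
rng = np.random.default_rng(0)
grs = rng.normal(size=2000)
exposure = 0.5 * grs + rng.normal(size=2000)
resid, slope = residual_exposure(exposure, grs)
strata = assign_strata(resid)
```

In a full analysis, the GRS-AD association would be fitted separately within each value of `strata` and then passed to `lace` together with the whole-sample `slope`.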

Data sources for Alzheimer disease

The European Alzheimer's & Dementia Biobank (EADB)
EADB brings together a range of European GWAS consortia, and summary estimates were based on 39,106 clinically diagnosed AD cases, 46,828 proxy-AD cases, and 401,577 controls [3]. Diagnostic procedures varied from study to study, but cases were generally diagnosed according to standard criteria, including the Diagnostic and Statistical Manual of Mental Disorders (DSM III-R, IV) and the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer's Disease and Related Disorders Association (NINCDS-ADRDA) criteria, or based on the ICD-10 research criteria in registrations or health records. A more detailed description of the diagnostic criteria of each participating study is available in the supplementary note of the GWAS [3]. Proxy-AD cases were present only in the UK Biobank, where proxy-AD designation was based on questionnaire data asking whether parents had AD. Genetic associations were adjusted for age, sex, principal components, and the number of APOE ε4 and ε2 alleles. Across the included cohorts, the mean age at AD onset in cases ranged from 63.2 to 83.7 years, and the mean age at examination of controls ranged from 44.9 to 82.8 years.

The International Genomics of Alzheimer's Project (IGAP)
To rule out the possible influence of proxy-AD cases, we additionally used data from IGAP. IGAP meta-analyzed GWAS on individuals of European ancestry comprising 21,982 cases and 41,944 cognitively normal controls from four consortia [4]. AD cases in individual studies were either confirmed through autopsy or based on a clinical diagnosis from health records in the participating studies. Genetic associations were adjusted for age (defined as age at onset for cases and age at last examination for controls), sex, and principal components. In the four cohorts, the mean age at onset of AD in cases ranged from 71.1 to 82.6 years, and the mean age at examination of controls ranged from 51.0 to 78.9 years.

Univariable Mendelian randomization analyses
We harmonized instrument-exposure and instrument-outcome data to ensure that effect estimates were aligned on the same effect allele, and palindromic genetic variants were removed. For the primary analysis, we used the inverse-variance weighted (IVW) method to combine SNP-specific estimates calculated as Wald ratios. IVW assumes no directional pleiotropic effect of the instrumental variables and constrains the intercept to zero. In addition, we performed several sensitivity analyses, including the weighted median estimator, MR-Egger regression, and Mendelian Randomization Pleiotropy RESidual Sum and Outlier (MR-PRESSO) [5][6][7]. The weighted median estimator can provide a reliable estimate if at least half of the weight comes from valid instrumental variables. In MR-Egger, the regression slope represents the causal effect of an exposure on an outcome, and the freely estimated intercept provides an estimate of the average pleiotropic effect across all genetic variants; an intercept that deviates from zero indicates directional pleiotropy. MR-PRESSO was applied to detect and correct for horizontal pleiotropy through removal of outlying instrumental variables, subsequently giving an outlier-corrected causal estimate [7].

In this study, we used both observational (upper panel) and genetic (lower panel) studies to evaluate the associations between different types of leukocyte counts and risk of Alzheimer disease (AD). In observational studies, confounding factors that influence both the exposure and the outcome may bias the true association between them; additionally, the observed association might be due to reverse causation, which ordinarily refers to the situation in which the outcome precedes the exposure instead of the other way around. Therefore, the associations identified in observational studies cannot by themselves establish causality. Genetic studies were performed using Mendelian randomization (MR) approaches.
The difference between MR and conventional observational studies is the use of genetic variants. Briefly, genetic variants associated with types of leukocyte counts at the genome-wide significance level in genome-wide association studies were used as instrumental variables for those leukocyte counts. These genetic variants should not be associated with any confounding factors and should be associated with AD only through the leukocyte counts. Provided these assumptions hold, MR estimates are not affected by confounding or reverse causation. The potential causal effect of each type of leukocyte count on AD can then be estimated by dividing the gene-AD association by the gene-cell count association.
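The ratio of gene-AD to gene-cell count associations, and its combination across SNPs by the IVW method, can be sketched in Python as follows. This is a minimal fixed-effect version that uses first-order standard errors for the Wald ratios; the function names and input values are hypothetical, and the sketch assumes the summary statistics have already been harmonized to the same effect allele.

```python
import math

def wald_ratio(beta_outcome, se_outcome, beta_exposure):
    """SNP-specific causal estimate: gene-outcome / gene-exposure association.

    The standard error uses the first-order approximation se_outcome / |beta_exposure|,
    which ignores uncertainty in the gene-exposure association.
    """
    ratio = beta_outcome / beta_exposure
    se = abs(se_outcome / beta_exposure)
    return ratio, se

def ivw(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted mean of the Wald ratios."""
    numerator = 0.0
    denominator = 0.0
    for bx, by, sy in zip(beta_exposure, beta_outcome, se_outcome):
        ratio, se = wald_ratio(by, sy, bx)
        weight = 1.0 / se ** 2          # inverse-variance weight
        numerator += weight * ratio
        denominator += weight
    estimate = numerator / denominator
    return estimate, math.sqrt(1.0 / denominator)

# Illustrative summary statistics only (not study data).
estimate, se = ivw(
    beta_exposure=[0.1, 0.2],
    beta_outcome=[0.02, 0.04],
    se_outcome=[0.01, 0.01],
)
odds_ratio = math.exp(estimate)  # interpretable as an OR when beta_outcome is a log odds ratio
```

SNPs with stronger gene-exposure associations receive more weight, because their Wald ratios are estimated more precisely.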