Box plot of the distribution of absolute CSF-Aβ42 levels across the 3 genotypes observed at rs541458 in PICALM in the combined case-control data set from Germany. Horizontal lines represent median values, boxes are 25% to 75% ranges, and whiskers extend to 1 × the interquartile range; values outside this range are depicted as circles. CSF-Aβ42 levels decrease with increasing numbers of T alleles (which are associated with risk of Alzheimer disease). Using log-normalized Aβ42 levels, we found that this effect is significant at P = .002 after adjusting for age, sex, apolipoprotein E (APOE) ε4 dose, and diagnostic group. CSF indicates cerebrospinal fluid. A similar, albeit more pronounced and more significant, effect is also observed for increasing numbers of the ε4 allele at the APOE locus (eFigure 4).
Schjeide BM, Schnack C, Lambert J, Lill CM, Kirchheiner J, Tumani H, Otto M, Tanzi RE, Lehrach H, Amouyel P, von Arnim CAF, Bertram L. The Role of Clusterin, Complement Receptor 1, and Phosphatidylinositol Binding Clathrin Assembly Protein in Alzheimer Disease Risk and Cerebrospinal Fluid Biomarker Levels. Arch Gen Psychiatry. 2011;68(2):207-213. doi:10.1001/archgenpsychiatry.2010.196
Two recent and simultaneously published genome-wide association studies independently implicated clusterin (CLU), complement receptor 1 (CR1), and phosphatidylinositol binding clathrin assembly protein (PICALM) as putative novel Alzheimer disease (AD) risk loci. Despite their strong statistical support, all 3 signals emerged from heterogeneous case-control populations and lack replication in different settings.
To determine whether genetic variants in CLU, CR1, and PICALM confer risk for AD in independent data sets (n = 4254) and to test the impact of these markers on cerebrospinal fluid (CSF)–Aβ42 and total-tau protein levels (n = 425).
Genetic association study using family-based and case-control designs.
Ambulatory or hospitalized care.
Family samples originate from mostly multiplex pedigrees recruited at different centers in the United States (1245 families, 2654 individuals with AD, and 1175 unaffected relatives). Unrelated case-control subjects originate from 1 clinical center in Germany (214 individuals with AD and 211 controls). All subjects were of European descent.
Main Outcome Measures
The association between 5 genetic variants in CLU, CR1, and PICALM and risk for AD, and the correlation between these 5 genetic variants and CSF-Aβ42 and tau levels.
All 3 investigated loci showed significant associations with risk for AD (1-tailed P values ranging from <.001 to .02), with consistent effect sizes and direction. For each locus, the overall evidence of association was substantially strengthened on meta-analysis of all available data (2-tailed P values ranging from 1.1 × 10⁻¹⁶ to 4.1 × 10⁻⁷). Of all markers tested, only rs541458 in PICALM was shown to have an effect on CSF protein levels, suggesting that the AD risk allele is associated with decreased CSF Aβ42 levels (2-tailed P = .002).
This study provides compelling independent evidence that genetic variants in CLU, CR1, and PICALM are genetically associated with risk for AD. Furthermore, the CSF biomarker analyses provide a first insight into the potentially predominant pathogenetic mechanism(s) underlying the association between AD risk and PICALM.
A large proportion of susceptibility to non-Mendelian (sporadic or late-onset) forms of Alzheimer disease (AD) is likely determined by the contribution of common genetic risk factors that exert their effects with low penetrance.1 Besides a well-established and highly significant association between risk for AD and genetic variants in the apolipoprotein E (APOE) gene on chromosome 19q, the exact number and nature of the remaining susceptibility loci remain elusive.2 More than 600 candidate loci have been tested in mostly hypothesis-driven genetic association studies in nearly 1500 publications, but only a few of these showed significant effects in systematic meta-analyses3 (for up-to-date results, see the AlzGene database at http://www.alzgene.org, maintained by our group for the Alzheimer Research Forum). High-throughput genotyping technologies now enable researchers to perform genome-wide analyses in an unbiased and largely hypothesis-free fashion, using sets of densely spaced single-nucleotide polymorphisms (SNPs). Within the past 3 years, 13 such genome-wide association studies (GWASs) have been published in the field of AD, highlighting more than 30 novel potential susceptibility loci with essentially no overlap in results across studies with the exception of APOE.4 The 2 most recent GWASs implicated SNPs in clusterin (CLU), complement receptor 1 (CR1), and phosphatidylinositol binding clathrin assembly protein (PICALM) as novel putative AD risk loci.5,6 Although the association with CLU independently reached genome-wide significance in both GWASs, the genetic markers in CR1 and PICALM were each initially identified in only 1 of the 2 studies but were subsequently replicated at subgenome-wide significance in the other. In systematic meta-analyses of all genetic association data available on AD, these associations are among the highest ranking findings of the entire field.
Notwithstanding their relatively strong statistical support, all 3 signals emerged from heterogeneous multicenter case-control studies, which still lack replication by independent groups, especially in samples ascertained from families with AD. The purpose of our study was to assess the role of CLU, CR1, and PICALM in AD risk in a collection of more than 4250 white subjects originating from multiplex families with AD ascertained in the United States and in unrelated AD cases and controls recruited in Germany. In addition, all 3 loci were tested for their potential link to cerebrospinal fluid (CSF) levels of 2 established AD biomarkers in the German data set (n = 425). Finally, all results were combined with the previously published evidence via random-effects meta-analysis and were classified on the basis of their epidemiological credibility.
All samples were collected with informed written consent and appropriate ethical approval at the collection sites. Demographic details of the various data sets can be found in eTable 1. The family-based samples were ascertained in the United States and originate from 4 different projects aimed at the study of genetic factors in AD: the Consortium on Alzheimer's Genetics sample,7 the National Institute of Mental Health sample,8 and the National Institute on Aging sample and the National Cell Repository for Alzheimer's Disease sample (both at http://www.ncrad.org). With the exception of the Consortium on Alzheimer's Genetics sample, the majority of pedigrees analyzed herein were nuclear families ascertained on the basis of multiple affected individuals, generally lacking parental genotypes, because parents were usually deceased at the time of proband recruitment. In addition to containing at least 1 affected relative pair, many pedigrees also had DNA available from additional affected or unaffected individuals (mostly siblings). The diagnosis of definite, probable, or possible AD was made according to criteria from the National Institute of Neurological and Communicative Diseases and Stroke and the Alzheimer's Disease and Related Disorders Association for affected individuals in all 4 samples.9 Only families of self-reported white ancestry in which no affected individual showed an onset at age younger than 50 years were included.
The case-control data set was obtained from individuals recruited between 1999 and 2008 at the Memory Clinic of the Neurology University Hospital in Ulm, Germany. Alzheimer disease was diagnosed according to the National Institute of Neurological and Communicative Diseases and Stroke–Alzheimer's Disease and Related Disorders Association criteria9 and the DSM-IV. Only patients fulfilling diagnostic criteria for probable AD were included. Unrelated control subjects were recruited at the same site and did not display any cognitive or neurological deficits following thorough clinical (including magnetic resonance imaging or computed tomography) and neuropsychological examination. Although the presence of population admixture within this data set could not be assessed owing to a lack of sufficient genotype information, it appears unlikely to be an issue given that both cases and controls were collected in a relatively homogenous area from southern Germany.
All genetic analyses were performed at the Max Planck Institute for Molecular Genetics in Berlin, Germany, and were performed blind to the clinical status of the subjects. Genotyping of SNPs was performed in 384-well format using TaqMan chemistry (Applied Biosystems, Carlsbad, California) according to manufacturer's instructions. SNP rs541458 in PICALM was genotyped using a Singleplex TaqMan genotyping assay (Applied Biosystems, Carlsbad, California), whereas the other 4 SNPs (rs11136000, rs2279590, and rs9331888 in CLU and rs6656401 in CR1) were genotyped in parallel on the OpenArray (Applied Biosystems) multiplex genotyping system. Overall, genotyping efficiency was greater than 98%, with an error rate of less than 0.2% (based on more than 1500 genotypes generated across HapMap samples run in multiples on each genotyping plate). APOE genotypes in the German case-control cohort were determined by direct sequencing on an ABI3730XL genetic analyzer (Applied Biosystems) or from blood samples via isoelectric focusing. None of the markers violated the Hardy-Weinberg equilibrium at P ≤ .05 in the control samples.
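The Hardy-Weinberg check mentioned above can be illustrated with a minimal sketch. It assumes a simple 1-df chi-square goodness-of-fit test on genotype counts; the test actually run in PLINK may differ (for example, an exact test), and the function name and the counts in the usage note are hypothetical, not study data.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg equilibrium from genotype counts.

    Assumption: a 1-df goodness-of-fit test is used here for illustration;
    the genotype counts passed in are hypothetical.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    if p == 0 or q == 0:
        raise ValueError("monomorphic marker: HWE test is undefined")
    # Expected genotype counts under Hardy-Weinberg proportions
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

For example, `hwe_chi_square(25, 50, 25)` describes a sample in exact Hardy-Weinberg proportions and gives a chi-square of 0, whereas a sample with no heterozygotes, such as `hwe_chi_square(50, 0, 50)`, would be flagged at the P ≤ .05 threshold used in the text.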
Cerebrospinal fluid biomarker data were available only for the German case-control sample. The CSF sample collection and preanalytic processing were performed at the Neurology University Hospital in Ulm, Germany, using a standardized protocol as previously described.10 In brief, CSF samples were placed into polypropylene tubes following lumbar puncture, centrifuged immediately after collection, and stored within 2 hours at −80°C in Eppendorf tubes until analysis. CSF-Aβ42 and total tau protein levels were determined using commercially available sandwich enzyme-linked immunosorbent assays (Innotest β-amyloid[1-42] and hTau Ag kits; Innogenetics, Gent, Belgium) following the manufacturer's instructions as previously described.11,12 Monitoring the diagnostic accuracy of the CSF tests was done according to international guidelines.13
Genetic association analyses using affection status as a dichotomous trait were performed assuming additive transmission models using FBAT version 2.0.3 (Harvard School of Public Health, Cambridge, Massachusetts; http://www.biostat.harvard.edu/~fbat/default.html) for the family-based sample (with an equal-weight offset correction [ie, identical weights of opposite sign are assigned to affected and unaffected individuals resulting in a statistic that contrasts transmissions to affected vs unaffected individuals], applying the empirical variance estimation function to account for the presence of multiple affected individuals per nuclear family)14 and PLINK version 1.07 (http://pngu.mgh.harvard.edu/~purcell/plink/) for the case-control sample. To increase power, genotypes for all 4 family-based data sets were pooled before analysis. Odds ratios (ORs) and confidence intervals (CIs) in the case-control sample were calculated using PLINK, whereas ORs and CIs in the family-based sample were determined by conditional logistic regression stratified by family using SAS version 9.2 (SAS Institute, Cary, North Carolina) as described previously elsewhere.15 Combined evidence of association across the family-based and case-control data sets was determined using random effects via the rmeta package version 2.16 in R version 2.10.0 (R Foundation for Statistical Computing, Vienna, Austria). Violations in the Hardy-Weinberg equilibrium in unaffected individuals (determined for all German controls and separately for a collection of unaffected individuals from the US family samples [1 per family, where available]) were determined using PLINK. 
Because the hypothesis of this part of our study was to specifically probe for the previously reported allele-specific effects resulting in increased (rs6656401 and rs9331888) or decreased (rs11136000, rs2279590, and rs541458) ORs in carriers of the minor vs major alleles at these sites, statistical significance is expressed as 1-tailed P values and 90% CIs for all the outlined analyses.
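The pairing of 1-tailed P values with 90% CIs can be made concrete with a short sketch. It assumes a normal (Wald) approximation on the log odds ratio scale; the function name, the standard error, and the example odds ratio in the usage note are hypothetical and only illustrate the logic.

```python
import math
from statistics import NormalDist


def one_tailed_summary(odds_ratio, se_log_or, expect_increase):
    """One-tailed P value and 90% CI for an odds ratio.

    Assumption: Wald approximation on the log-OR scale; inputs are
    illustrative, not values taken from the study.
    """
    nd = NormalDist()
    log_or = math.log(odds_ratio)
    z = log_or / se_log_or
    # Test only in the direction specified before looking at the data
    p_one = 1 - nd.cdf(z) if expect_increase else nd.cdf(z)
    # A two-sided 90% interval leaves 5% in each tail, matching a
    # one-sided test at alpha = .05
    z90 = nd.inv_cdf(0.95)
    ci = (math.exp(log_or - z90 * se_log_or),
          math.exp(log_or + z90 * se_log_or))
    return p_one, ci
```

With a hypothetical standard error of 0.1, `one_tailed_summary(1.33, 0.1, expect_increase=True)` returns a 1-tailed P value below .05 and a 90% CI that excludes 1; the two statements are equivalent, which is why the intervals and the P values are reported together.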
Genetic association analyses using CSF biomarker levels as quantitative traits were calculated using PLINK via linear regression, including age, sex, and APOE ε4 dose (coded as 0, 1, 2) as covariates. To approximate a normal distribution, both CSF-Aβ42 and total-tau concentrations were log-transformed (base 10) before analysis. The quantitative trait analyses were then performed for all individuals combined (including affection status as a covariate) as well as for AD cases and healthy controls separately. Meta-analyses combining the genotype data generated in our study with data from previously published association studies on the same SNPs5,6,16-18 were based on random-effect models19 using crude ORs and standard errors calculated for each data set. For the assessment of the epidemiologic credibility of these meta-analyses, we also performed sensitivity analyses after exclusion of data sets in which control subjects had genotypes that violated the Hardy-Weinberg equilibrium and after exclusion of the initial study. Between-study heterogeneity was quantified using the I² metric. Evidence for reporting bias was assessed using a modified regression test.20 Statistical significance for all meta-analyses was calculated as 2-tailed P values and 95% CIs. Power analysis for the case-control sample was performed using the Genetic Power Calculator (http://pngu.mgh.harvard.edu/~purcell/gpc/); power analysis for the family-based samples was estimated using PBAT version 3.6 (http://www.biostat.harvard.edu/~clange/default.htm). Both analyses assumed a disease prevalence of 0.1, a 1-sided α level of .05, and the allelic ORs reported for each SNP in the AlzGene database. Power calculations of the meta-analyses combining our data with previously generated data were based on a 2-sided genome-wide α level of 5 × 10⁻⁸.
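The pooling of crude ORs and standard errors described above can be sketched as follows. This is a minimal illustration that assumes the DerSimonian-Laird estimator for the between-study variance (the text specifies only random-effect models); the function name and the inputs in the usage note are hypothetical, not study data.

```python
import math
from statistics import NormalDist


def random_effects_meta(log_ors, ses):
    """Pool per-study log odds ratios under a random-effects model.

    Assumption: DerSimonian-Laird estimate of tau^2; the source does
    not name the estimator that was used.
    """
    k = len(log_ors)
    w = [1.0 / s ** 2 for s in ses]  # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    df = k - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # I^2: share of total variation attributed to heterogeneity, in percent
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1.0 / (s ** 2 + tau2) for s in ses]
    pooled = sum(wi * y for wi, y in zip(w_re, log_ors)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    nd = NormalDist()
    z95 = nd.inv_cdf(0.975)
    p_two = 2 * (1 - nd.cdf(abs(pooled / se)))
    return {
        "or": math.exp(pooled),
        "ci95": (math.exp(pooled - z95 * se), math.exp(pooled + z95 * se)),
        "p": p_two,
        "tau2": tau2,
        "i2": i2,
    }
```

Calling `random_effects_meta([math.log(0.87)] * 3, [0.05] * 3)` with three identical hypothetical studies returns a pooled OR of 0.87 with I² of 0%, while studies whose estimates disagree increase both tau² and I² and widen the interval.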
All meta-analysis results were graded on the basis of the Human Genome Epidemiology Network interim criteria for the assessment of cumulative evidence of genetic associations, referred to as the Venice criteria.21,22 These criteria take into account the amount of evidence (ie, sample size as measured by the number of minor alleles across both cases and controls), the consistency of replication (ie, heterogeneity across studies measured by the I² statistic), and protection from bias (ie, via the modified regression test and sensitivity analyses, excluding the initial study and samples showing Hardy-Weinberg equilibrium violations at P ≤ .05 in controls). For more details about how these criteria are applied, see Ioannidis et al,21 Khoury et al,22 Allen et al,23 and http://www.alzgene.org/methods.asp.
Association analyses of the family-based samples revealed nominal association (based on 1-tailed P values) with 3 of the 5 tested SNPs (one each in CLU, CR1, and PICALM; Table 1); a fourth SNP in CLU showed a statistical trend in favor of association (rs11136000; 1-tailed P = .06). Effect sizes and effect directions were consistent with those reported in the original GWASs for each of these associations (Table 1, eTable 2, and eFigures 1, 2, and 3). In the case-control sample, nominally significant associations (based on 1-tailed P values) were observed for 2 of the 5 tested SNPs (one each in CR1 and PICALM; Table 1). Here, too, the underlying effects were very consistent with both the family-based findings and the previous GWAS results.5,6 The only gene not showing evidence for association in the case-control sample was CLU (best SNP rs9331888; 1-tailed P ~ .12), which also showed the weakest support for association in the family-based samples. However, assuming an allelic OR of approximately 1.15, the power to detect a nominal, 1-tailed P value of .05 in the case-control sample alone was only approximately 30% (compared with ~60% in the family sample and ~70% in the combined sample), suggesting that the lack of significance for CLU in the case-control data set could be due to a lack of power. Combining the results from the family-based and case-control data sets of our study via random-effects meta-analysis increased the statistical support for all 3 loci (Table 1), with CR1 showing the most pronounced and most significant effects (OR, 1.33; 1-tailed P < .001) and CLU showing the smallest change in AD risk (OR, 0.88; 1-tailed P ~ .04). Finally, meta-analyses of our data and the data from the previous studies yielded highly significant associations for all 3 loci.
In these analyses, the rs11136000 variant in CLU showed by far the strongest support for association (OR, 0.86; 2-tailed P = 1.1 × 10⁻¹⁶), followed by PICALM (OR, 0.87; 2-tailed P = 2.3 × 10⁻¹¹) and CR1 (OR, 1.21; 2-tailed P = 4.1 × 10⁻⁷). Application of the Human Genome Epidemiology Network criteria for the cumulative assessment of genetic association studies assigned “strong” epidemiologic credibility to the CLU and PICALM findings (1 SNP each) and “moderate” credibility to the CR1 finding. The latter can be attributed to heterogeneity across the GWAS follow-up data by Lambert et al,6 in particular the stage 2 case-control sample from Italy, which shows a slightly opposite direction of effect when compared with the rest of the samples from all of the studies, resulting in an I² of 36% (eFigure 2). Removal of that 1 outlying sample considerably strengthened the overall evidence in favor of CR1 (OR, 1.25; 2-tailed P = 4.1 × 10⁻¹⁴; strong epidemiological credibility with an I² of 0%).
Next, we investigated whether any of the 5 investigated SNPs in CLU, CR1, and PICALM showed evidence for association with levels of CSF-Aβ42 and total tau, 2 well-established biomarkers for AD.24 All analyses were performed in the combined case-control sample, as well as in AD cases and controls separately. As can be seen in Table 2, the PICALM T allele (associated with AD risk) showed a significant association with decreased CSF-Aβ42 levels (2-tailed P = .002; Figure 1). The decrease in CSF-Aβ42 levels was dependent on dose, with the strongest effect in homozygous carriers of the T allele with an approximately 20% decrease in absolute Aβ42 concentration when compared with homozygous carriers of the C allele. Although the direction of this association is consistent with that on the same trait conferred by the APOE ε4 allele, it is less pronounced and much less significant than the APOE effect (~50% decrease in ε4/4 vs ε3/3; 2-tailed P = 7.2 × 10⁻¹⁶; Table 2 and eFigure 4). Furthermore, as has been described in previous studies (eg, Kester et al25), the APOE effect on CSF-Aβ42 levels was most evident in unaffected individuals (2-tailed P = 2.4 × 10⁻¹¹), whereas the association for PICALM was strongest in the AD case group (2-tailed P = .01). For CSF-Aβ42 levels, the only other marker that approached statistical significance was rs6656401 in CR1 (2-tailed P = .08, in controls only). Owing to the weak statistical support and heterogeneity of the effect across genotypes (Table 2), this likely represents a chance finding. Finally, apart from APOE (2-tailed P ~ .003), no association was observed between any of the tested GWAS SNPs and levels of CSF total tau (Table 2).
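To show how a dose-dependent effect on a log-transformed trait relates to a percentage change in absolute concentration, the sketch below fits an additive model of log10 biomarker level on allele count and converts the slope back to the original scale. It is a simplified illustration that omits the covariates used in the study (age, sex, APOE ε4 dose, and diagnostic group); the function names and the data in the usage note are hypothetical.

```python
import math


def additive_slope(doses, levels):
    """Least-squares slope of log10(level) on allele dose (0, 1, or 2).

    Assumption: no covariates are included, unlike the adjusted
    models reported in the text.
    """
    y = [math.log10(v) for v in levels]
    n = len(doses)
    mean_x = sum(doses) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in doses)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(doses, y))
    return sxy / sxx


def percent_change(beta_log10, allele_difference=2):
    """Percentage change in absolute level across a given allele difference."""
    return (10 ** (beta_log10 * allele_difference) - 1) * 100
```

Under this model, a slope of about −0.048 log10 units per allele corresponds to a decrease of roughly 20% between the two homozygous groups, for example `percent_change(math.log10(0.8) / 2)`.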
Even in the GWAS era, independent replication remains the primary means of distinguishing true-positive vs false-positive genetic association findings.26,27 Herein, we provide compelling evidence that all 3 of the recently proposed novel AD candidate loci, CLU, CR1, and PICALM, show association with risk for AD in a study of more than 4250 subjects from the United States and Germany. For the lead signals in all 3 genes, the effect direction and effect sizes estimated were remarkably consistent with what was originally reported. Intriguingly, 90% of our sample was from mostly multiplex AD families and analyzed using family-based methods, an approach that is generally believed to be less prone to bias due to undetected population admixture.28 Thus, the convergence of independent case-control and family-based findings lends further support to the notion that CLU, CR1, and PICALM, indeed, represent genuine AD susceptibility factors. In addition to the risk effects, we report novel evidence suggesting a link between CSF-Aβ42 levels and allele dose at the PICALM rs541458 SNP, which, to the best of our knowledge, was not described in any previous study. If validated in independent cohorts, this finding could provide a first clue regarding the predominant pathogenetic mechanism underlying the association between AD and PICALM.
Although functional data are still lacking to elucidate the precise mechanism by which SNPs in or near PICALM could impact levels of Aβ in the brain and CSF of risk-allele carriers, it is tempting to speculate that dysfunction of the PICALM protein could be connected to amyloid precursor protein processing via endocytic pathways.29 This hypothesis was already outlined by Harold et al5 in their original GWAS and is based on the notion that the PICALM protein is involved in clathrin-mediated endocytosis,30 the inhibition of which can lead to a reduction in amyloid precursor protein internalization and Aβ production.31 Therefore, it is conceivable that sequence variants in or close to the PICALM gene could impact this process, either directly or via changes in synaptic activity. However, these hypotheses are largely speculative and still need to be addressed in specific molecular genetic and biochemical experiments.
Although our study significantly strengthens the overall evidence implying that all 3 recently proposed GWAS signals represent genuine AD susceptibility loci, a number of questions remain unanswered. First, the genetic markers tested in our study (and in the primary GWAS) are very likely not the functional genetic variants. In fact, none of the 5 variants maps compellingly close to, or is in significant linkage disequilibrium with, any obviously functional variant in these regions.5,6 Thus, despite the nearly unequivocal evidence suggesting that the genomic regions near the investigated SNPs likely contain functionally active variant(s) relevant in AD pathogenesis, more work is needed to further pinpoint the location and characterize the role of the pathophysiologically active elements. Second, although these genes appear to exert their effects across the majority of the white populations investigated to date, other ancestral backgrounds need to be studied with high priority to arrive at a better understanding of the role that these genes play in other populations. Third, although our CSF biomarker analyses suggest that the AD association with PICALM is functionally more likely related to amyloid precursor protein–Aβ metabolism than to tau dysfunction, it cannot be excluded that this association is actually the result of another, correlated effect or that it represents a false-positive finding altogether (although the association between PICALM and CSF-Aβ42 levels remains nominally significant even after Bonferroni correction for multiple testing [20 tests] with a 2-tailed P ~ .047).
Likewise, our failure to observe significant changes in CSF-Aβ42 levels with variants in the other 2 genes and the lack of significant effects on CSF total-tau levels of all 3 genes either imply that these effects do not exist (at least not with the markers studied) or that our sample lacked power to detect them. However, in our case-control cohort, assuming a 2-sided α level of 5% showed that we had more than 80% power to detect additive genetic effects that explain down to 2% of the total variance of Aβ42 and total-tau CSF levels. Therefore, unless the underlying genetic effects leading to changes in the levels of these biomarkers are even smaller, lack of power is likely not an issue. Finally, despite the overall strong statistical and epidemiological support for the associations between AD risk and CLU, CR1, and PICALM, the relevance of these findings on a population-wide level remains to be determined. Similar to what is found in recent GWASs for other complex diseases,26,27 the observed risk effects are relatively small; that is, they confer changes in disease risk between approximately 15% and 20% per allele. As a result, the associations described by us and others will likely have no major role serving as a diagnostic or predictive tool for AD in a clinical setting, unless other (eg, rarer) variants of higher penetrance are linked to the observed effects. Nevertheless, depending on their precise functional role in AD pathogenesis, the highlighted loci might still be essential in advancing our understanding of the biochemical processes leading to AD and in developing appropriate and effective means to target these processes therapeutically.
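The power statement for the CSF analyses can be checked approximately with the sketch below. It assumes a normal approximation to a 1-df additive test and that all 425 case-control subjects contribute biomarker data; it is not the calculation used by the authors, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def qtl_power(n, variance_explained, alpha=0.05):
    """Approximate two-sided power of a 1-df additive quantitative-trait test.

    Assumption: large-sample normal approximation with noncentrality
    n * R^2 / (1 - R^2); not the tool used in the study.
    """
    nd = NormalDist()
    ncp = n * variance_explained / (1 - variance_explained)
    shift = math.sqrt(ncp)          # expected value of the test's z score
    z_crit = nd.inv_cdf(1 - alpha / 2)
    # Probability that the z score falls beyond either critical value
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)
```

Under these assumptions, `qtl_power(425, 0.02)` returns a value somewhat above 0.80, in line with the statement that effects explaining 2% of the trait variance were detectable with more than 80% power.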
In summary, we provide compelling evidence that genetic variants in CLU, CR1, and PICALM are genetically associated with risk for AD. The independent convergence of case-control GWASs and family-based follow-up data substantially strengthens the notion that these genes (likely in concert with numerous other loci) exert genuine disease-modifying effects. The results from our CSF biomarker analyses suggest that the predominant pathogenetic mechanism(s) underlying the association between AD risk and PICALM warrants examination in future functional genetic studies.
Correspondence: Lars Bertram, MD, Neuropsychiatric Genetics Group, Department Vertebrate Genomics, Max Planck Institute for Molecular Genetics, Ihnestrasse 63, Rm 204.1, 14195 Berlin, Germany (email@example.com).
Submitted for Publication: May 4, 2010; final revision received August 9, 2010; accepted October 5, 2010.
Author Contributions: Ms Schjeide and Dr Bertram had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Financial Disclosure: None reported.
Funding/Support: The project was funded by grants from the Institut Pasteur Lille, the Fondation Plan Alzheimer, the Cure Alzheimer's Fund (Dr Bertram), and the German Federal Ministry of Research and Education (Dr Bertram). Dr von Arnim was supported by the WIN-Kolleg of the Heidelberg Academy of Sciences and Humanities. Dr Otto was supported by Anteprion, cNeupro, NeuoTAS, and Landesstiftung Baden-Württemberg.
Role of the Sponsor: The funding sources had no role in the design and conduct of the study; in the collection, analysis, and interpretation of the data; or in the preparation, review, or approval of the manuscript.
Additional Information: Samples from the National Cell Repository for Alzheimer's Disease, which receives government support under a cooperative agreement grant (U24 AG21886) awarded by the National Institute on Aging, were used in this study. Since the submission of our manuscript, a number of studies have been published that, like our study, independently confirm the association between AD risk and genetic polymorphisms in CLU, CR1, and PICALM. Please see http://www.alzgene.org for up-to-date results.
Additional Contributions: We thank all patients and other participants across the various samples for their contribution to this study. We also thank Mrs Sabina Gualazzini for her help with data collection and management of the Ulm samples. We thank the contributors, including the Alzheimer's Disease Centers for their collection of samples that were used in this study, as well as patients and their families, whose help and participation made this work possible.
This article was corrected for typographical errors on February 7, 2011.