AD indicates Alzheimer disease; APOE, apolipoprotein E; CHARGE, Cohorts for Heart and Aging Research in Genomic Epidemiology; Fundació ACE, Fundació Alzheimer Centre Educacional; GWAS, genome-wide association study; SNP, single-nucleotide polymorphism.
Each data marker represents the statistical significance (P value) of a single-nucleotide polymorphism (SNP), plotted on the −log10 scale against its chromosomal position (NCBI Build 36). The blue diamonds show stage 1 P values for the sentinel (top) SNP at each locus, whereas the gray and black diamonds show the P values for the same SNP following stage 2 and stage 3 meta-analyses, respectively. The blue line shows the fine-scale recombination rate, that is, the average frequency with which recombination occurs at that site. Genes located in the region are shown as green arrows labeled with HUGO Gene Nomenclature Committee gene symbols. The length of each green arrow represents the extent of the gene, and the arrowhead indicates the direction of transcription.
Seshadri S, Fitzpatrick AL, Ikram MA, DeStefano AL, Gudnason V, Boada M, Bis JC, Smith AV, Carrasquillo MM, Lambert JC, Harold D, Schrijvers EMC, Ramirez-Lorca R, Debette S, Longstreth WT, Janssens ACJW, Pankratz VS, Dartigues JF, Hollingworth P, Aspelund T, Hernandez I, Beiser A, Kuller LH, Koudstaal PJ, Dickson DW, Tzourio C, Abraham R, Antunez C, Du Y, Rotter JI, Aulchenko YS, Harris TB, Petersen RC, Berr C, Owen MJ, Lopez-Arrieta J, Vardarajan BN, Becker JT, Rivadeneira F, Nalls MA, Graff-Radford NR, Campion D, Auerbach S, Rice K, Hofman A, Jonsson PV, Schmidt H, Lathrop M, Mosley TH, Au R, Psaty BM, Uitterlinden AG, Farrer LA, Lumley T, Ruiz A, Williams J, Amouyel P, Younkin SG, Wolf PA, Launer LJ, Lopez OL, van Duijn CM, Breteler MMB; for the CHARGE, GERAD1, and EADI1 Consortia. Genome-wide Analysis of Genetic Loci Associated With Alzheimer Disease. JAMA. 2010;303(18):1832-1840. doi:10.1001/jama.2010.574
Author Affiliations: Departments of Neurology (Drs Seshadri, DeStefano, Beiser, Du, Auerbach, Au, Farrer, and Wolf) and Medicine (Genetics Program) (Mr Vardarajan and Dr Farrer), Boston University School of Medicine, and Departments of Biostatistics (Drs DeStefano, Beiser, Du, and Farrer) and Epidemiology (Dr Farrer), Boston University School of Public Health, Boston, Massachusetts; National Heart, Lung, and Blood Institute's Framingham Heart Study, Framingham, Massachusetts (Drs Seshadri, DeStefano, Debette, Beiser, Auerbach, Au, and Wolf); Departments of Epidemiology (Drs Fitzpatrick, Longstreth, and Psaty), Global Health (Dr Fitzpatrick), Medicine (Drs Bis, Longstreth, and Psaty), Neurology (Dr Longstreth), Biostatistics (Drs Rice and Lumley), and Health Services (Dr Psaty), University of Washington, and Center for Health Studies, Group Health (Drs Fitzpatrick, Longstreth, and Psaty), Seattle; Departments of Epidemiology (Drs Ikram, Schrijvers, Janssens, Aulchenko, Hofman, van Duijn, and Breteler), Neurology (Dr Koudstaal), Internal Medicine (Drs Rivadeneira and Uitterlinden), and Clinical Chemistry (Dr Uitterlinden), Erasmus MC University Medical Center, Rotterdam, the Netherlands; Netherlands Consortium for Healthy Aging, Leiden (Drs Ikram, Schrijvers, Janssens, Aulchenko, Rivadeneira, Hofman, Uitterlinden, van Duijn, and Breteler); Icelandic Heart Association, Kopavogur (Drs Gudnason, Smith, and Jonsson); University of Iceland (Drs Gudnason, Aspelund, and Jonsson) and Landspitali University Hospital (Dr Jonsson), Reykjavik; Memory Clinic of Fundació ACE Institut Català de Neurociències Aplicades (Drs Boada and Hernandez) and Department of Neurology (Dr Boada), Hospital G. 
Universitari Vall d’Hebron, Barcelona, Spain; Department of Neuroscience, Mayo Clinic College of Medicine, Jacksonville, Florida (Drs Carrasquillo, Dickson, Graff-Radford, and Younkin); Institut Pasteur de Lille and Université de Lille Nord de France (Dr Lambert) and Inserm U744 and Centre Hospitalier Régional Universitaire de Lille (Dr Amouyel), Lille; Medical Research Council Centre for Neuropsychiatric Genetics and Genomics, Department of Psychological Medicine and Neurology, School of Medicine, Cardiff University, Cardiff, Wales (Drs Harold, Hollingworth, Abraham, Owen, and Williams); Department of Structural Genomics, Neocodex, Sevilla, Spain (Drs Ramirez-Lorca and Ruiz); Division of Biomedical Statistics and Informatics (Dr Pankratz), Mayo Clinic and Mayo Foundation, Department of Neurology (Drs Petersen and Graff-Radford), and Mayo Alzheimer Disease Research Center (Dr Petersen), Mayo Clinic College of Medicine, Rochester, Minnesota; Inserm U897, Victor Segalen University, Bordeaux, France (Dr Dartigues); Departments of Epidemiology (Dr Kuller), Neurology and Psychiatry (Drs Becker and Lopez), and Psychology (Dr Becker), and Alzheimer's Disease Research Center (Drs Becker and Lopez), University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania; Inserm U708 and Université Pierre et Marie Curie Paris 6 (Dr Tzourio), Centre National de Génotypage, Institut Genomique, Commissariat à l’énergie Atomique, Evry (Dr Lathrop), and Fondation Jean Dausset–Centre d’Etudes du Polymorphisme Humain (Dr Lathrop), Paris, France; Dementia Unit, University Hospital Virgen de la Arrixaca, and Alzheimer Foundation, Murcia, Spain (Dr Antunez); Medical Genetics Institute, Cedars-Sinai Medical Center, Los Angeles, California (Dr Rotter); Laboratory of Epidemiology, Demography, and Biometry (Drs Harris and Launer) and Laboratory of Neurogenetics (Dr Nalls), Intramural Research Program, National Institute on Aging, Washington, DC; Inserm U888, Hôpital La Colombière, 
Montpellier, France (Dr Berr); Memory Unit, University Hospital La Paz-Cantoblanco, Madrid, Spain (Dr Lopez-Arrieta); Inserm U614, Faculté de Médecine-Pharmacie de Rouen, Rouen, France (Dr Campion); Institute of Molecular Biology and Biochemistry and University Clinic of Neurology, Department of Neurogeriatrics, Medical University Graz, Graz, Austria (Dr Schmidt); and Department of Medicine-Geriatrics, University of Mississippi Medical Center, Jackson (Dr Mosley).
Context Genome-wide association studies (GWAS) have recently identified CLU, PICALM, and CR1 as novel genes for late-onset Alzheimer disease (AD).
Objectives To identify and strengthen additional loci associated with AD and confirm these in an independent sample and to examine the contribution of recently identified genes to AD risk prediction in a 3-stage analysis of new and previously published GWAS on more than 35 000 persons (8371 AD cases).
Design, Setting, and Participants In stage 1, we identified strong genetic associations (P < 10−3) in a sample of 3006 AD cases and 14 642 controls by combining new data from the population-based Cohorts for Heart and Aging Research in Genomic Epidemiology consortium (1367 AD cases [973 incident]) with previously reported results from the Translational Genomics Research Institute and the Mayo AD GWAS. We identified 2708 single-nucleotide polymorphisms (SNPs) with P<10−3. In stage 2, we pooled results for these SNPs with the European AD Initiative (2032 cases and 5328 controls) to identify 38 SNPs (10 loci) with P<10−5. In stage 3, we combined data for these 10 loci with data from the Genetic and Environmental Risk in AD consortium (3333 cases and 6995 controls) to identify 4 SNPs with P<1.7×10−8. These 4 SNPs were replicated in an independent Spanish sample (1140 AD cases and 1209 controls). Genome-wide association analyses were completed in 2007-2008 and the meta-analyses and replication in 2009.
Main Outcome Measure Presence of Alzheimer disease.
Results Two loci were identified to have genome-wide significance for the first time: rs744373 near BIN1 (odds ratio [OR], 1.13; 95% confidence interval [CI], 1.06-1.21 per copy of the minor allele; P = 1.59×10−11) and rs597668 near EXOC3L2/BLOC1S3/MARK4 (OR, 1.18; 95% CI, 1.07-1.29; P = 6.45×10−9). Associations of these 2 loci plus the previously identified loci CLU and PICALM with AD were confirmed in the Spanish sample (P < .05). However, although CLU and PICALM were confirmed to be associated with AD in this independent sample, they did not improve the ability of a model that included age, sex, and APOE to predict incident AD (improvement in area under the receiver operating characteristic curve from 0.847 to 0.849 in the Rotterdam Study and 0.702 to 0.705 in the Cardiovascular Health Study).
Conclusions Two genetic loci for AD were found for the first time to reach genome-wide statistical significance. These findings were replicated in an independent population. Two recently reported associations were also confirmed. These loci did not improve AD risk prediction. While not clinically useful, they may implicate biological pathways useful for future research.
One of every 5 persons aged 65 years is predicted to develop Alzheimer disease (AD) in their lifetime, and genetic variants may play an important part in the development of the disease.1 The apparent substantial heritability of late-onset AD2 is inadequately explained by genetic variation within the well-replicated genes (apolipoprotein E [APOE; RefSeq NG_007084], presenilin-1 [PSEN1; RefSeq NG_007386], presenilin-2 [PSEN2; RefSeq NG_007381], and amyloid beta precursor protein [APP; RefSeq NM_000484]).3 Initial genome-wide association studies (GWAS) identified putative new candidate genes (GRB2-associated binding protein [GAB2; RefSeq NG_016171], protocadherin 11 X-linked [PCDH11X; RefSeq NG_016251], lecithin retinol acyltransferase [LRAT; RefSeq NG_009110], and transient receptor potential cation channel, subfamily C, member 4–associated protein [TRPC4AP; RefSeq NM_015638])4-6 and regions of interest (eg, on chromosomes 14q, 10q, and 12q),7-10 but no locus outside of the APOE region consistently reached genome-wide significance.4,11,12 These disappointing results are most likely explained by the modest sample size and, hence, limited statistical power of early studies to detect genes with small effects. Recently, 2 large GWAS, the United Kingdom–led Genetic and Environmental Risk in Alzheimer Disease 1 consortium (GERAD1)13 and the European Alzheimer Disease Initiative stage 1 (EADI1),14 reported 3 new genome-wide significant loci for AD: within the CLU gene (GenBank AY341244) encoding clusterin (also called apolipoprotein J), near the PICALM gene (GenBank BC073961) encoding phosphatidylinositol–binding clathrin assembly protein, and within the CR1 (RefSeq NG_007481) gene encoding complement component (3b/4b) receptor 1.13,14
We performed a 3-stage analysis of GWAS data to identify additional loci associated with late-onset AD. Moreover, we sought to replicate genome-wide significant loci, from both the current analysis and previous reports, in an independent case-control population. Finally, we used 2 large, prospective, population-based studies to assess the improvement in incident AD risk prediction conferred by the recently described loci.
We used a 3-stage sequential analysis to identify novel loci associated with late-onset AD (Figure 1). Our initial investigation (stage 1) was a meta-analysis combining new genome-wide association data from white participants in the large, population-based Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium15 with GWAS data from the Translational Genomics Research Institute (TGEN) public release database4 and the Mayo AD GWAS.5 The sample characteristics of the participants contributing to this discovery stage are summarized in Table 1. In stage 2, we combined results for our most suggestive findings (single-nucleotide polymorphisms [SNPs] with P<10−3) with corresponding results in the EADI1 consortium.14 In stage 3, we combined results for the most promising hits in stage 2 (selecting top SNPs from all loci that reached P<10−5) with data from the nonoverlapping studies within the GERAD1 consortium (excluding the Mayo AD GWAS, the only overlapping study).13 All participants (or their authorized proxies) in the contributing studies gave written informed consent including for genetic analyses. Local institutional review boards approved study protocols. Details of study sample selection for the contributing studies are described in section 2 of the eAppendix (section 1 lists abbreviations used in this article) and in eFigure 1, parts A through D.
In each study, dementia was defined using the Diagnostic and Statistical Manual of Mental Disorders, Third Edition Revised or Fourth Edition (DSM-IV) criteria.16 Among persons with dementia, all studies used the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer Disease and Related Disorders Association (NINCDS-ADRDA) criteria to define AD and included persons with definite (diagnosis of AD pathologically confirmed at autopsy), probable, or possible AD.17
The individual studies in stage 1 were genotyped on different platforms, as shown in Table 1. The EADI1 used the Illumina Quad 6.0 (Illumina Inc, San Diego, California), and GERAD1 was genotyped on various Illumina chips. In each of the CHARGE cohorts and in the TGEN database, we used the genotype data to impute to the 2.5 million nonmonomorphic autosomal SNPs described in HapMap (Utah residents with Northern and Western European ancestry from the Centre d’Etude du Polymorphisme Humain collection [CEU population]). Imputations are needed to meta-analyze genome-wide association data across studies that have used different genotyping platforms because the platforms differ in the SNPs genotyped. Imputation methods and quality control filters in each sample are described in section 3 of the eAppendix.
All analyses were restricted to white persons, racial identity being self-defined by the participants (see section 2 of the eAppendix). We included white Hispanics and adjusted for population structure. Since only 1 of the CHARGE studies, the Cardiovascular Health Study (CHS), had a small number of African American participants (n = 574 with genotyping), this racial subgroup was too small for independent analysis. Linkage disequilibrium patterns are very different in individuals of African heritage, which leads to greater uncertainty in imputation, as well as the possibility of false-positive associations if data from 2 racial groups are combined when disease risk differs by race (referred to as population stratification); hence, African American participants in the CHS were excluded from these analyses. Each study fit an additive genetic model (a 1-degree-of-freedom trend test) relating genotype dosage (0, 1, or 2 copies of the minor allele) to study trait. In the CHARGE cohorts, prevalent cases were compared with controls free of dementia at the DNA draw date. Participants were excluded if they declined consent or genotyping failed. For analysis of prevalent events in the CHARGE cohorts and for the case-control data from the TGEN and Mayo cohorts, we used logistic regression models. For the analysis of incident events in the CHARGE cohorts, participants who were free of dementia entered the analysis at the time of the DNA sample collection and were followed up until development of incident AD; participants were censored at death, at the time of their last follow-up examination or health status update when they were known to be free of clinical dementia, and when they developed dementia from a cause other than AD. We used Cox proportional hazards models to calculate hazard ratios with corresponding 95% confidence intervals (CIs) after ensuring that assumptions of proportionality of hazards were met. 
In the CHS, Framingham Heart Study (FHS), and Rotterdam Study, controls contributed one set of person-years to the prevalent analysis and a second, nonoverlapping set of person-years to the incident analyses. Under the Martingale property of Cox models, the 2 analyses are independent, and their independence was confirmed in simulation studies. Primary analyses were adjusted for age and sex and any evidence of population stratification. Details of the screening for latent population substructure in each discovery sample are available in section 4 of the eAppendix. In addition, the CHS also adjusted for study site and the FHS accounted for familial relationships (by using a Cox model with robust variance estimator clustering on pedigree to account for family relationships) and for whether the DNA had been whole-genome amplified.
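The additive model described above treats each copy of the minor allele as multiplying the odds (or hazard) of AD by the same factor. As a minimal sketch, a per-allele log-odds coefficient and its standard error can be converted into an odds ratio with a Wald 95% CI as shown here. The function names and the example standard error are illustrative assumptions and are not taken from the study's analysis code.

```python
import math

def per_allele_or(beta, se, z=1.96):
    """Return (OR, lower, upper) for a per-allele log-odds estimate (Wald CI)."""
    return (math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se))

def dosage_odds_ratio(beta, dosage):
    """Odds ratio relative to 0 copies under an additive (log-linear) model."""
    return math.exp(beta * dosage)

# Hypothetical inputs: a per-allele OR of 1.13 with an assumed SE of 0.03.
beta = math.log(1.13)
or_1, lo, hi = per_allele_or(beta, 0.03)
two_copy_or = dosage_odds_ratio(beta, 2)  # equals 1.13 squared under additivity
```

Under this model, carrying 2 copies of the allele corresponds to the square of the per-allele odds ratio, which is why a single coefficient (1 degree of freedom) summarizes the genotype effect.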
Our stage 1 meta-analysis combined results from 9 discrete sources: incident AD in the CHS, FHS, and Rotterdam Study; prevalent AD in the Age, Gene/Environment Susceptibility–Reykjavik Study, CHS, FHS, and Rotterdam Study; and the TGEN and Mayo AD GWAS case-control studies. We used inverse-variance weighting (also known as a fixed-effects analysis) for meta-analysis, applying genomic control to each study of stage 1. This approach assigns greater weight to more precise (study-specific) estimators; thus, greater weight is given to studies in which a given SNP was genotyped or more effectively imputed and to studies with larger sample sizes. Details of the meta-analyses are available in section 5 of the eAppendix. We retained only SNP-phenotype associations that were based on results from at least 2 of the 9 discovery samples and for which the minor allele frequency was at least 2%. For stages 2 and 3, we again used inverse-variance meta-analysis but without genomic control adjustment. We decided a priori on a genome-wide significance threshold of P < 1.7×10−8, which gives, for a 3-stage sequential analysis, the same control of false-positives as a single study's use of P < 5×10−8.18 The 3 stages of meta-analyses were completed in May to August 2009.
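The inverse-variance weighting and genomic control steps described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the consortium's pipeline: the function names are hypothetical, the genomic control factor is estimated from the median chi-square statistic across SNPs within one study, and standard errors are inflated only when that factor exceeds 1.

```python
import math
import statistics

MEDIAN_CHI2_1DF = 0.4549364  # median of a chi-square distribution with 1 df

def genomic_control_lambda(betas, ses):
    """Inflation factor from per-SNP estimates within a single study."""
    chi2 = [(b / s) ** 2 for b, s in zip(betas, ses)]
    return statistics.median(chi2) / MEDIAN_CHI2_1DF

def apply_genomic_control(ses, lam):
    """Inflate standard errors by sqrt(lambda); no change if lambda <= 1."""
    factor = math.sqrt(max(lam, 1.0))
    return [s * factor for s in ses]

def inverse_variance_meta(betas, ses):
    """Fixed-effects pooling of one SNP's estimates across studies."""
    weights = [1.0 / (s * s) for s in ses]  # more precise studies weigh more
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P value
    return beta, se, p

# Hypothetical example: two studies reporting log odds ratios for one SNP.
pooled_beta, pooled_se, pooled_p = inverse_variance_meta([0.10, 0.20], [0.05, 0.10])
```

In this example the first study has one-quarter of the variance of the second, so it receives 4 times the weight and the pooled estimate (0.12) lies closer to its value.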
Significant hits from stage 3 of the discovery phase were replicated in an independent Spanish case-control sample (the Fundació Alzheimer Centre Educacional [ACE]) of 1140 patients with AD (mean age, 78.8 [SD, 7.9] years; 69.9% women) compared with 1209 general population controls (mean age, 49.9 [SD, 9.2] years; 52.8% women).19,20 All AD patients fulfilled DSM-IV criteria for dementia and NINCDS-ADRDA criteria for possible and probable AD.16,17 Both cases and controls were white. Details of the sample are provided in section 6 of the eAppendix. Genotyping was undertaken using real-time polymerase chain reaction coupled with fluorescence resonance energy transfer. Effect sizes for single markers were calculated by unconditional logistic regression analysis using SPSS software, version 13.0 (SPSS Inc, Chicago, Illinois). Replication was completed in October 2009.
In secondary analyses, we also examined results for previously reported loci.5,13,14 For these loci, which included the recently reported loci by the EADI1 and GERAD1 consortia, we restricted our analysis to the previously unpublished CHARGE data. We did not assess the association with PCDH11X because we focused only on autosomal SNPs in these analyses. We did examine associations with the top 15 candidate genes listed in the Alzgene database (http://www.alzgene.org)21 as of August 12, 2009, including the APOE/TOMM40/APOC1 locus and 12 genes outside that locus. Details of SNPs selected and results for these SNPs are provided in section 7 of the eAppendix and in eTable 1.
We sought to estimate the effect of recently identified loci on 10-year risk prediction in the general population using the data for prospectively ascertained incident AD in the 2 largest community-based cohort studies available to us (Rotterdam Study and CHS). In these analyses, we only included SNPs from the 2 loci that were shown to have genome-wide significance in previous publications and that we replicated nominally within CHARGE: PICALM and CLU (P<.05). Moreover, the analysis was restricted to incident AD to avoid survival bias and was restricted to population-based samples because case-control studies may overestimate the effects of the genes if cases and controls were not randomly selected from the populations in which AD risk prediction is to be applied.22 The improvement in risk prediction was investigated by comparing 3 sequentially incremental AD risk prediction models that first incorporated age and sex alone, then added data on risk allele status at the APOE locus and, finally, data on risk allele status at the CLU and PICALM loci. We did not assess the utility of novel loci uncovered in this article (using CHARGE as part of the discovery sample) to avoid the risk of overestimating effects by using the same sample for gene discovery and risk prediction.22 Prediction models were constructed using Cox proportional hazards methods using the R package survcomp. APOE ε4 status was included as a discrete variable (0, 1, or 2 alleles) and the other 2 genetic loci as dosages; all gene effects were examined using additive models. The accuracy of risk prediction for each model was assessed as discriminative accuracy, measured by the area under the receiver operating characteristic curve (AUC). The AUC theoretically ranges from 0.50 (as predictive as tossing a coin) to 1.00 (perfect prediction).
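To make the AUC concrete, the sketch below computes it for a binary outcome as the proportion of case-control pairs in which the case receives the higher predicted risk, with ties counted as one-half. This is a simplification for illustration: the study's models were time-to-event Cox models, and the function name and example scores here are hypothetical.

```python
def auc(case_scores, control_scores):
    """Probability that a randomly chosen case outranks a randomly chosen control."""
    concordant = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                concordant += 1.0
            elif c == k:
                concordant += 0.5  # ties count as half
    return concordant / (len(case_scores) * len(control_scores))

# Hypothetical predicted risks for 2 cases and 2 controls.
example_auc = auc([0.8, 0.3], [0.5, 0.1])  # 3 of 4 pairs are concordant
```

A model that ranks every case above every control yields 1.00, and a model whose scores carry no information yields about 0.50, matching the range described above.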
The stage 1 meta-analysis had 8935 dementia-free individuals (mean age, 72 [SD, 7] years), of whom 973 developed incident AD over an average follow-up time of 8 [SD, 3] years, and 2033 prevalent cases of AD who were compared with 14 642 dementia-free controls. Of these, 1367 AD cases (973 incident) were from the CHARGE cohorts. In this discovery analysis based on the CHARGE cohorts, TGEN, and Mayo GWAS, there was no evidence of spurious inflation of P values or significant population stratification (see eFigure 2 for the quantile-quantile plot comparing the observed and expected P value distributions). eFigure 3 illustrates the primary findings from the stage 1 meta-analysis in a Manhattan plot showing genome-wide P values for all interrogated SNPs across the 22 autosomal chromosomes. After stage 1, 2708 SNPs had a P<10−3 and were studied in stage 2. In stage 2, pooling these results with data from EADI1 (2032 cases and 5328 controls), 38 SNPs in 10 loci had a P<10−5. Finally, in stage 3, the most significant SNPs from these 10 loci were meta-analyzed with the nonoverlapping studies from GERAD1 (3333 cases and 6995 controls). The findings of the stages 1, 2, and 3 analyses at these 10 loci are presented in Table 2. Additional details are provided in eTable 2, which shows chromosomal location, adjacent genes, sample- and stage-specific estimates of relative risks, 95% CIs, and P values for each of the 38 SNPs selected in stage 2 analyses. Figure 2 and Figure 3 show regional association plots for the 2 SNPs not previously reported to have reached genome-wide significance, rs744373 and rs597668 on chromosomes 2 and 19, respectively. In these figures, we show the linkage disequilibrium (with the index SNP) and stage 1, 2, and 3 association results for the index SNP and stage 1 results for all SNPs within 200 kilobase (kb) on either side of the index SNP at that locus, as well as gene locations and recombination rates in the region. 
Regional association plots for the other loci listed in Table 2 are presented in eFigure 4, eFigure 5, eFigure 6, eFigure 7, and eFigure 8.
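As a consistency check on the sample sizes reported for the 3 stages, the counts can be tallied as below. The sketch assumes that the cases and controls listed for each stage do not overlap; the variable names are illustrative.

```python
# (cases, controls) as reported for each stage of the discovery analysis
stages = {
    "stage 1 (CHARGE, TGEN, Mayo)": (3006, 14642),
    "stage 2 (EADI1)": (2032, 5328),
    "stage 3 (GERAD1, nonoverlapping)": (3333, 6995),
}

total_cases = sum(cases for cases, _ in stages.values())
total_controls = sum(controls for _, controls in stages.values())
total_persons = total_cases + total_controls

# Stage 1 cases are the sum of incident and prevalent cases.
stage1_cases = 973 + 2033
```

The totals agree with the 8371 AD cases and the more than 35 000 persons cited for the overall analysis.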
In stage 1, 11 SNPs in the APOE/TOMM40/APOC1 region reached our preset threshold for genome-wide significance (eTable 2 and eFigure 3). In stage 2, 2 additional loci, rs11136000 in CLU and a locus (rs11771145) at chromosome 7 in the 5′ upstream promoter/regulatory region of ephrin receptor A1 (EPHA1; GenBank AH007960) reached genome-wide significance. However, the latter became nonsignificant after adding GERAD1 data in stage 3, though the effect seen in GERAD1 was in the same direction in that the same allele was associated with an increased (but nonsignificant) risk of AD. In stage 3, genome-wide significant evidence for association with AD was reached at the APOE (rs2075650; P = 1.04×10−295), CLU (rs11136000; P = 1.62×10−16), and PICALM (rs3851179; P = 3.16×10−12) loci, as well as for 2 novel loci on chromosomes 2 (rs744373; P = 1.59×10−11) and 19 (rs597668; P = 6.45×10−9). Table 2 shows the odds ratios (ORs) associated with the minor allele for each of these SNPs. Locus rs744373 is within 30 kb of the gene bridging integrator 1 (BIN1; RefSeq NG_012042) (Figure 2), while rs597668 is within 60 kb of 6 genes including exocyst complex component 3–like 2 (EXOC3L2; RefSeq NM_138568), biogenesis of lysosomal organelles complex 1, subunit 3 (BLOC1S3; RefSeq NG_008372), and microtubule-associated protein/microtubule affinity-regulating kinase 4 (MARK4; GenBank BC071948) (Figure 3).
We replicated the 4 associations that reached our preset genome-wide significance threshold (1.7×10−8) in an independent sample of cases and controls (Table 3). Effect sizes in the replication cohort were similar to those observed in the discovery sample; each of these associations reached P<.05.
Because rs597668 is on chromosome 19, fairly close to the APOE locus, we undertook conditional analyses to examine whether its association with AD was independent of APOE ε4. We conducted 2 analyses with AD (among persons with directly genotyped APOE ε4 status) in the CHARGE, TGEN, and Mayo samples, adjusting (1) for our strongest association in the APOE/TOMM40/APOC1 locus (rs2075650) and (2) for the actual APOE ε4 SNP, rs429358. In each case, we found that the association was attenuated but a marginal signal remained when adjusting for APOE ε4 (OR, 1.18; 95% CI, 1.08-1.24; P = 3.9×10−4 without adjustment; OR, 1.17; 95% CI, 1.07-1.23; P = 8.7×10−4 for analysis 1; and OR, 1.10; 95% CI, 1.00-1.16; P = .05 for analysis 2). We also examined the effect of adjusting for age, sex, and presence of at least 1 APOE ε4 allele (using a dominant genetic inheritance model) in the Spanish replication sample. Again, the results were attenuated (OR, 1.24; 95% CI, 1.02-1.51; P = .03). These findings are consistent with the moderate to low level of linkage disequilibrium observed between rs597668 and SNPs within the APOE and TOMM40 region (r2<0.01 according to HapMap CEU data) (Figure 3).
In our secondary analyses examining replication of published findings in the previously unreported CHARGE data, 6 intronic or 3′ untranslated region SNPs in the APOE/TOMM40/APOC1 region (rs6857, rs2075650, rs4420638, rs157582, rs6859, and rs10119) reached a genome-wide significance threshold of P<1.7×10−8, and we replicated the most statistically significant SNPs within 2 of the 3 recently reported genetic loci associated with AD in prior GWAS: CLU (rs11136000; OR, 0.90; 95% CI, 0.82-0.98; P = .02) and PICALM (rs3851179; OR, 0.90; 95% CI, 0.83-0.99; P = .02) (eTable 2 and eAppendix). We did not find a significant association with the most statistically significant CR1 SNP (rs3818361) in the CHARGE data. However, 13 SNPs within the gene showed nominal significance (P>.001 but P< .05) (eTable 3). Furthermore, adding CHARGE and TGEN data on rs3818361 to the previously reported EADI1 and GERAD1 data (Mayo AD GWAS data were included in the GERAD1 data for this analysis) showed that results now reached genome-wide significance (OR, 1.15; 95% CI, 1.11-1.20; P = 1.04×10−11) (eFigure 9).
Among the 54 SNPs selected from the top 12 candidate genes (outside the APOE/TOMM40/APOC1 locus) listed in the Alzgene Web site, we found evidence for a nominal association of rs4362 in the angiotensin-converting enzyme (ACE; RefSeq NC_000017.10) gene and rs1784933 in the sortilin-related receptor L (DLR class A) repeats-containing (SORL1; RefSeq NC_000011.9) gene with AD (relative risks associated with each copy of the minor allele were 0.92; 95% CI, 0.85-0.99; P = .03 for ACE and 1.33; 95% CI, 1.03-1.72; P = .03 for SORL1) (eTable 1).
We assessed the extent to which APOE ε4, PICALM, and CLU can improve predictive models for risk of incident AD in the general population (represented by the cohorts of the Rotterdam Study and CHS). The addition of APOE ε4 carrier status to a prediction model including age and sex only increased the AUC from 0.826 (95% CI, 0.806-0.846) to 0.847 (95% CI, 0.828-0.865) in the Rotterdam Study and from 0.670 (95% CI, 0.625-0.723) to 0.702 (95% CI, 0.654-0.754) in the CHS. Further inclusion of risk allele status for CLU and PICALM improved the AUC only minimally to 0.849 (95% CI, 0.831-0.867) in the Rotterdam Study and to 0.705 (95% CI, 0.654-0.751) in the CHS. The corresponding receiver operating characteristic curves are shown in eFigure 10.
We report results of an international 3-stage genome-wide analysis to study genetic variation underlying late-onset, sporadic AD. We studied more than 35 000 persons (8371 AD cases), constituting the largest sample analyzed to date. In the gene discovery phase, we showed genome-wide significance for 2 loci related to AD, one on chromosome 2 and a second locus on chromosome 19, that had not previously been found to achieve genome-wide significance and that appear to be independent of APOE. BIN1 was previously identified as having a possible association with AD in the recent GWAS from the GERAD1,13 but until our analysis, this association was not significant at the genome-wide level. Furthermore, we replicated both these loci as well as the recently identified loci, CLU and PICALM, in an independent sample. Although genetic variation at the CLU and PICALM loci did modify the risk of AD in our population-based sample, these polymorphisms added very little to prediction of AD risk.
The locus on chromosome 2q14.3 is adjacent to the BIN1 gene, which is 1 of 2 amphiphysins and is expressed most abundantly in the brain and muscle.23 Amphiphysins promote caspase-independent apoptosis and also play a critical role in neuronal membrane organization and clathrin-mediated synaptic vesicle formation,24 a process disrupted by β-amyloid.25 Knockout mice with decreased expression of the amphiphysins have seizures and major learning deficits.26 Altered expression of BIN1 has been demonstrated in aging mice, in transgenic mouse models of AD, and in persons with schizophrenia.27,28
The 19q13.3 locus (rs597668), a site distal to and not in linkage disequilibrium with SNPs in the APOE locus, had been suspected in an early linkage study to harbor a gene for AD.29 There are 6 genes adjacent to this locus, 2 of which are part of pathways linked to AD pathology. The protein product of BLOC1S3, biogenesis of lysosomal organelles complex 1, subunit 3, is expressed in the brain, regulates endosomal to lysosomal routing,30 and has been implicated in schizophrenia.31 The second gene, MARK4, is inducible, expressed only in the brain, and plays a role in neuronal differentiation.32 MARK4 is a kinase that phosphorylates tau, is polyubiquitinated in vivo, and is a substrate of the aging-related deubiquitinating enzyme USP9X; hence, it may play a role in the abnormal tau phosphorylation seen in AD.33 Little is known of the function of the gene closest to rs597668, EXOC3L2, also referred to as protein 7 transactivated by hepatitis B virus X antigen (XTP7) gene.
When evaluating the added value of the new AD genes in clinical risk prediction, we focused on the 2 recently reported AD genes13,14 that were replicated in our population-based studies, CLU and PICALM, and found that they only slightly improved prediction of incident AD beyond models based on age, sex, and APOE ε4. The increase in AUC was 0.002 in the Rotterdam Study and 0.003 in the CHS, which would not be of value in the clinical setting. There are 2 reasons for this. First, the associations of CLU and PICALM with AD risk were markedly weaker than those of age and APOE; therefore, a major improvement was not expected. This fits with recent insights on polygenic models that assume there are tens of thousands of risk alleles, each with a small (approximately 5% increase in relative risk) effect throughout the whole genome, rather than a discrete number of alleles with moderate effects. Such models appear to underlie the susceptibility to schizophrenia, and a similar model may be applicable to AD.34 Second, the extent to which risk factors improve risk prediction depends on the predictive performance of the initial risk model. Added risk factors need to have stronger effects to improve a risk model with high AUC than to improve a model with lower AUC. Alzheimer disease risk prediction based on age, sex, and APOE already has moderate to high discriminative accuracy: the AUC was 0.847 in the Rotterdam Study and 0.702 in the CHS, which implies that further improvements require many new variants or variants with strong effects. Whether such improvements are to be expected will depend in large part on the ability to unravel the underlying genetic architecture and to identify and quantify environmental risk factors, including complex interactions.35 A next step for genetic research in AD will be to further increase the sample sizes of GWAS and evaluate further genetic models.
Strengths of this study include the large sample of clinic- and community-based cases and controls and the subsample of prospectively ascertained incident AD that permitted the exploration of incident risk prediction algorithms. Alzheimer disease was diagnosed using standard NINCDS-ADRDA criteria. The observed associations are unlikely to be due to population stratification since the discovery and replication samples were restricted to white individuals of European origin and were also investigated for latent population substructure.
The study also has limitations. Despite our large sample size, we had limited power to detect associations with small effect sizes and associations with rare variants. Although all studies used accepted clinical or pathological criteria to define dementia and AD, phenotypic heterogeneity between samples may have limited our ability to detect some associations. Moreover, the controls in the Spanish replication sample were younger than the cases, and their cognitive status had not been formally examined. However, although this could reduce our power to observe an association, it would not invalidate the associations we did observe. Furthermore, the frequency distribution of minor and major alleles among the Spanish controls was similar to that noted in the discovery sample and in the HapMap CEU sample.
In conclusion, this meta-analysis of data from several of the largest AD GWAS to date confirms previously known and recently described associations (CLU and PICALM) and shows genome-wide significance and replication for 2 biologically plausible, novel loci on chromosomes 2 and 19. However, the predictive ability of CLU and PICALM to identify individuals at risk of AD is not clinically significant. The value of these associations may lie in the insights they could provide for research into the pathophysiological mechanisms of AD.
Corresponding Author: Monique M. B. Breteler, MD, PhD, Department of Epidemiology, Erasmus MC, University Medical Center Rotterdam, PO Box 2040, 3000 CA Rotterdam, the Netherlands (firstname.lastname@example.org).
Author Contributions: Drs Seshadri, Ikram, DeStefano, and Breteler had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Drs Seshadri, Fitzpatrick, Ikram, DeStefano, Gudnason, and Boada contributed equally as first authors. Drs Ruiz, Williams, Amouyel, Younkin, Wolf, Launer, Lopez, van Duijn, and Breteler contributed equally as last authors.
Study concept and design: Seshadri, Bis, Rotter, Hofman, Schmidt, Mosley, Lumley, Amouyel, Wolf, Lopez, van Duijn, Breteler.
Acquisition of data: Seshadri, Ikram, Gudnason, Boada, Carrasquillo, Lambert, Schrijvers, Ramirez-Lorca, Janssens, Pankratz, Dartigues, Hernandez, Beiser, Kuller, Koudstaal, Dickson, Tzourio, Antunez, Rotter, Berr, Owen, Lopez-Arrieta, Becker, Rivadeneira, Nalls, Graff-Radford, Campion, Auerbach, Jonsson, Lathrop, Au, Psaty, Uitterlinden, Farrer, Ruiz, Williams, Amouyel, Younkin, Lopez, Breteler.
Analysis and interpretation of data: Seshadri, Fitzpatrick, Ikram, DeStefano, Bis, Smith, Carrasquillo, Harold, Debette, Longstreth, Janssens, Hollingworth, Aspelund, Beiser, Abraham, Du, Rotter, Aulchenko, Harris, Petersen, Owen, Vardarajan, Rivadeneira, Nalls, Rice, Lathrop, Mosley, Farrer, Lumley, Ruiz, Williams, Amouyel, Younkin, Launer, Lopez, van Duijn, Breteler.
Drafting of the manuscript: Seshadri, Fitzpatrick, Ikram, DeStefano, Bis, Ramirez-Lorca, Du, Vardarajan, Lopez, van Duijn, Breteler.
Critical revision of the manuscript for important intellectual content: Seshadri, Ikram, Gudnason, Boada, Bis, Smith, Carrasquillo, Lambert, Harold, Schrijvers, Ramirez-Lorca, Debette, Longstreth, Janssens, Pankratz, Dartigues, Hollingworth, Aspelund, Hernandez, Beiser, Kuller, Koudstaal, Dickson, Tzourio, Abraham, Antunez, Rotter, Aulchenko, Harris, Petersen, Berr, Owen, Lopez-Arrieta, Becker, Rivadeneira, Nalls, Graff-Radford, Campion, Auerbach, Rice, Hofman, Jonsson, Schmidt, Lathrop, Mosley, Au, Psaty, Uitterlinden, Farrer, Lumley, Ruiz, Williams, Amouyel, Younkin, Wolf, Launer, Lopez, Breteler.
Statistical analysis: Ikram, DeStefano, Bis, Smith, Harold, Schrijvers, Janssens, Hollingworth, Aspelund, Beiser, Abraham, Du, Aulchenko, Vardarajan, Rivadeneira, Nalls, Rice, Lathrop, Farrer, Lumley, Ruiz, van Duijn.
Obtained funding: Seshadri, Gudnason, Boada, Dartigues, Kuller, Antunez, Rotter, Harris, Petersen, Becker, Jonsson, Schmidt, Uitterlinden, Ruiz, Younkin, Launer, Lopez, Breteler.
Administrative, technical, or material support: Seshadri, Fitzpatrick, Bis, Carrasquillo, Ramirez-Lorca, Pankratz, Hernandez, Kuller, Koudstaal, Antunez, Rotter, Becker, Nalls, Campion, Auerbach, Mosley, Psaty, Uitterlinden, Farrer, Wolf.
Study supervision: Seshadri, Gudnason, Boada, Koudstaal, Tzourio, Antunez, Rotter, Berr, Owen, Hofman, Lathrop, Uitterlinden, Williams, Amouyel, van Duijn, Breteler.
Financial Disclosures: None reported.
Funding/Support: The funding/support for this study is described in the eAppendix.
Role of the Sponsor: The funding organizations and sponsors had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; or preparation, review, or approval of the manuscript. The final version submitted was approved without changes by the National Heart, Lung, and Blood Institute and the National Institute on Aging.
Additional Contributions: Additional contributions are listed in the eAppendix.
This article was corrected online for error in data on 5/19/2010, prior to publication of the correction in print.
This article was corrected online for error in data on 6/8/2010, prior to publication of the correction in print.