Subject quality control. AD indicates Alzheimer disease; UK, United Kingdom; MRC, Medical Research Council; and SNP, single-nucleotide polymorphism.
Linkage disequilibrium (LD) graphical displays for the Alzheimer disease–associated loci identified by logistic regression (A and B) and Cox proportional hazards regression (C), including RS7019241 and RS10868366 (A), RS9886784 (B), and RS10519262 (C). The x-axis indicates chromosomal position. Transcriptional units with gene names are marked by the lines with arrows above the x-axis. The –log10 adjusted minimum P value from the logistic regression (A and B) or Cox proportional hazards regression (C) is plotted for each single-nucleotide polymorphism (SNP). Symbols of the same shape and color denote SNPs that are in the same LD cluster (r² > 0.7). Open black circles indicate SNPs not correlated with any other. SNPs with different symbols are at most weakly correlated with one another. Below the association plots is an LD matrix with increasing LD denoted by a darker color. White areas indicate little to no LD, whereas black areas indicate strong to complete LD. The most significant SNP in the Canadian collection is the largest symbol; *P value for the corresponding SNP in the United Kingdom Medical Research Council data set. bp indicates base pairs; kb, kilobases.
Kaplan-Meier curves of Alzheimer disease (AD) onset by age and single-nucleotide polymorphism genotype for RS10519262 in the Canadian (A) and United Kingdom Medical Research Council (B) data sets. CI indicates confidence interval.
Li H, Wetten S, Li L, St. Jean PL, Upmanyu R, Surh L, Hosford D, Barnes MR, Briley JD, Borrie M, Coletta N, Delisle R, Dhalla D, Ehm MG, Feldman HH, Fornazzari L, Gauthier S, Goodgame N, Guzman D, Hammond S, Hollingworth P, Hsiung G, Johnson J, Kelly DD, Keren R, Kertesz A, King KS, Lovestone S, Loy-English I, Matthews PM, Owen MJ, Plumpton M, Pryse-Phillips W, Prinjha RK, Richardson JC, Saunders A, Slater AJ, St. George-Hyslop PH, Stinnett SW, Swartz JE, Taylor RL, Wherrett J, Williams J, Yarnall DP, Gibson RA, Irizarry MC, Middleton LT, Roses AD. Candidate Single-Nucleotide Polymorphisms From a Genomewide Association Study of Alzheimer Disease. Arch Neurol. 2008;65(1):45-53. doi:10.1001/archneurol.2007.3
Copyright 2008 American Medical Association. All Rights Reserved. Applicable FARS/DFARS Restrictions Apply to Government Use.
To identify single-nucleotide polymorphisms (SNPs) associated with risk and age at onset of Alzheimer disease (AD) in a genomewide association study of 469 438 SNPs.
Case-control study with replication.
Memory referral clinics in Canada and the United Kingdom.
The hypothesis-generating data set consisted of 753 individuals with AD by National Institute of Neurological and Communicative Diseases and Stroke/Alzheimer's Disease and Related Disorders Association criteria recruited from 9 memory referral clinics in Canada and 736 ethnically matched control subjects; control subjects were recruited from nonbiological relatives, friends, or spouses of the patients and did not exhibit cognitive impairment by history or cognitive testing. The follow-up data set consisted of 418 AD cases and 249 nondemented control cases from the United Kingdom Medical Research Council Genetic Resource for Late-Onset AD recruited from clinics at Cardiff University, Cardiff, Wales, and King's College London, London, England.
Main Outcome Measures
Odds ratios and 95% confidence intervals for association of SNPs with AD by logistic regression adjusted for age, sex, education, study site, and French Canadian ancestry (for the Canadian data set). Hazard ratios and 95% confidence intervals from Cox proportional hazards regression for age at onset with similar covariate adjustments.
In the unadjusted analysis, SNP RS4420638 within APOC1 was strongly associated with AD due entirely to linkage disequilibrium with APOE. In the multivariable adjusted analyses, 3 SNPs within the top 120 by P value in the logistic analysis and 1 in the Cox analysis of the Canadian data set provided additional evidence for association at P < .05 within the United Kingdom Medical Research Council data set: RS7019241 (GOLPH2), RS10868366 (GOLPH2), RS9886784 (chromosome 9), and RS10519262 (intergenic between ATP8B4 and SLC27A2).
Our genomewide association analysis again identified the APOE linkage disequilibrium region as the strongest genetic risk factor for AD. This could be a consequence of the coevolution of more than 1 susceptibility allele, such as APOC1, in this region. We also provide new evidence for additional candidate genetic risk factors for AD that can be tested in further studies. Published online November 12, 2007 (doi:10.1001/archneurol.2007.3).
Alzheimer disease (AD), the most prevalent form of dementia, is a neurodegenerative disorder that affects more than 5% of individuals aged 65 years and older.1 The discovery of the association between the apolipoprotein E (APOE [Entrez GeneID NM_000041]) ε4 allele and AD confirmed the prominent role of specific genetic polymorphisms as risk factors for late-onset AD.2,3 Besides APOE, a meta-analysis of AD genetic association studies as of February 1, 2007, reported 29 potential AD susceptibility genes (http://www.alzgene.org).4 Additional independent data are needed to further test these associations and to provide evidence for genes and gene interactions (especially with APOE) that contribute to the development and progression of AD.
We performed a genomewide association analysis in a case-control study of 753 patients with AD and 736 nondemented control individuals with 3 main aims. The first was to obtain a ranking of single-nucleotide polymorphisms (SNPs) based on strength of statistical association with AD diagnosis or age at onset for novel target-specific drug development and validation and to prioritize SNPs for pharmacogenetics studies.5 Second, we used this as a hypothesis-generating data set; 120 SNPs with the numerically lowest multivariable P values for genetic association with AD were evaluated in a second data set to assess generalizability and to reduce false-positive associations.6 Finally, we hereby release on the GlaxoSmithKline Clinical Trials Registry (http://www.GSK.com) the P values and allele and genotype frequencies from all of the SNPs to serve as a publicly available resource for further studies.
The study protocol was reviewed and approved by the appropriate ethics committee or investigational review board for each study site before patients were recruited. Informed consent was obtained from study participants in accordance with all applicable investigational review board, ethics committee, and regulatory requirements.
The primary Canadian data set was drawn from 875 patients with AD and 850 nondemented control subjects recruited from 9 memory referral clinics in Canada between June 4, 2002, and March 30, 2005. Seven hundred fifty-three AD cases and 736 control cases satisfied subject quality control (Figure 1) and constitute the Canadian data set (Table 1). The study was limited to white persons of northern European ancestry. Patients with AD satisfied NINCDS-ADRDA (National Institute of Neurological and Communicative Diseases and Stroke/Alzheimer's Disease and Related Disorders Association)7 and Diagnostic and Statistical Manual of Mental Disorders (Fourth Edition)8 criteria for probable AD, with a Global Deterioration Scale score of 3 to 7 (ranging from mild to very severe cognitive decline).9 Ethnically matched control subjects were recruited from nonbiological relatives, friends, or spouses of the cases. Control subjects had no history of symptoms of memory impairment. They had a Mini-Mental State Examination score higher than the appropriate cutoff for dementia taking into account age and education level,10 a Mattis Dementia Rating Scale score of 136 or higher,11 a Clock Test score (11:10) without error,12 and no impairment on the 7 instrumental activities of daily living questions from the Duke Older American Resources and Services Procedures caused by cognitive decline.13
The 120 SNPs with the numerically lowest multivariable P values for statistical association with case-control status and with age at onset were examined in a second data set drawn from 453 AD cases and 472 nondemented control cases recruited from July 1, 2001, to June 10, 2004, at Cardiff University, Cardiff, Wales, and King's College London, London, England, sites of the United Kingdom (UK) Medical Research Council (MRC) Genetic Resource for Late-Onset AD. Four hundred eighteen AD cases and 249 control cases satisfied subject quality control (Figure 1) and constitute the UK MRC data set (Table 1). All of the individuals were of Caucasian origin. Patients with AD had a minimum age at onset of 60 years and a diagnosis of probable AD (by NINCDS-ADRDA criteria), whereas control subjects were ascertained at the age of 65 years or older and had no evidence of dementia (by a Mini-Mental State Examination score ≥ 28, Clinical Dementia Rating score of 0,14 and full neurological examination).
Genotypes from the 500 566 SNPs in the GeneChip Human Mapping 500K Array Set (Sty and Nsp chips) (Affymetrix, Santa Clara, California) were obtained according to the Affymetrix published protocol using the Bayesian robust linear model with Mahalanobis distance algorithm (Affymetrix Power Tools version 1.4.0).15 In the Canadian data set, a total of 469 438 SNPs (459 975 autosomal and 9463 X-linked) passed the prespecified genotype quality control process, which excluded SNPs that were monomorphic (23 379 SNPs), had low genotype efficiency (148 SNPs with genotypes in < 70% of individuals), deviated from Hardy-Weinberg equilibrium at P < 10⁻⁷ (4824 SNPs), or had mapping issues (3863 SNPs). All of the SNP mapping was carried out prior to March 15, 2007.
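The genotype quality control filters described above can be sketched in a few lines. This is an illustrative sketch only: the text does not state which Hardy-Weinberg test was used or in which subjects it was applied, so a 1-df chi-square test on pooled genotype counts is assumed here, the mapping filter is omitted, and the function names `hwe_chi2_p` and `passes_qc` are hypothetical.

```python
import math

def hwe_chi2_p(n_aa, n_ab, n_bb):
    """P value of a 1-df chi-square test of Hardy-Weinberg equilibrium.

    Assumption: the source does not say which HWE test was used; the
    chi-square goodness-of-fit test is a common choice.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2))

def passes_qc(n_aa, n_ab, n_bb, n_subjects,
              min_call_rate=0.70, hwe_alpha=1e-7):
    """Apply the three count-based filters named in the text."""
    called = n_aa + n_ab + n_bb
    if called == 0 or called / n_subjects < min_call_rate:
        return False                          # low genotype efficiency
    if n_aa + n_ab == 0 or n_ab + n_bb == 0:
        return False                          # monomorphic: one allele only
    return hwe_chi2_p(n_aa, n_ab, n_bb) >= hwe_alpha
```

For example, `passes_qc(25, 50, 25, 100)` keeps a SNP whose genotype counts match Hardy-Weinberg proportions exactly, whereas `passes_qc(50, 0, 50, 100)` rejects a SNP with no heterozygotes.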
An additional genotype quality control step for SNPs significant in both data sets involved visual assessment of genotype clusters from the 2-dimensional plots of the mean adjusted intensity for each allele probe. Plots without distinct separation of genotypes suggest a low-quality marker and possible genotypic misclassification.
The Fisher exact test was used to assess genotypic associations between AD and each of the SNPs without covariate adjustment (SAS statistical software version 8.2; SAS Institute, Inc, Cary, North Carolina). The test was applied to the 2 × 3 table of case-control status by SNP genotype.
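As an illustration of the genotypic test on a 2 × 3 table, the following sketch implements the Freeman-Halton extension of the Fisher exact test by enumerating all tables with the observed margins. It is not the SAS procedure used in the study; the function name `fisher_exact_2x3` and the small relative tolerance for ties are assumptions made for this example.

```python
from math import comb

def fisher_exact_2x3(table):
    """Two-sided exact P value for a 2 x 3 table of counts.

    table = [[cases_AA, cases_AB, cases_BB],
             [controls_AA, controls_AB, controls_BB]]
    Sums the probabilities of all tables with the same margins that are
    no more probable than the observed table.
    """
    (a, b, c), (d, e, f) = table
    r1 = a + b + c                       # number of cases
    c1, c2, c3 = a + d, b + e, c + f     # genotype totals
    n = c1 + c2 + c3
    denom = comb(n, r1)

    def prob(x, y, z):
        # Multivariate hypergeometric probability of the first row
        return comb(c1, x) * comb(c2, y) * comb(c3, z) / denom

    p_obs = prob(a, b, c)
    total = 0.0
    for x in range(min(r1, c1) + 1):
        for y in range(min(r1 - x, c2) + 1):
            z = r1 - x - y
            if z > c3:
                continue
            p = prob(x, y, z)
            if p <= p_obs * (1 + 1e-9):  # tolerance for floating-point ties
                total += p
    return min(total, 1.0)
```

With tables of several hundred subjects per row the enumeration remains feasible but is slow in pure Python; a production analysis would rely on a tested statistical package.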
In the Canadian data set, logistic regression was used to examine the effect of each SNP on the logarithmic odds of AD status adjusted for sex, education (ordinal categories of < 5, 5-10, 11-15, and > 15 years), age, number of APOE ε4 alleles, study site, and ancestry (French Canadian vs not) (SAS statistical software version 8.2). The SNPs with minor allele homozygote counts of 14 or more were tested as dominant, additive, and recessive coding in separate logistic regression models; the optimal genetic model for each SNP was selected by Schwarz-Bayesian information criterion (BIC)16 with the respective minimum Wald test P value. The minimum Wald P value for each SNP was multiplied by 3 as a Bonferroni adjustment for the 3 genetic models tested, yielding the adjusted minimum P value.17 For SNPs with fewer than 14 minor allele homozygotes but more than 14 minor allele homozygotes and heterozygotes combined, only the dominant genetic model could be tested with adequate cell counts, and the P value was not adjusted further. The significance of an SNP × APOE interaction was also tested for each SNP in the optimal genetic model.
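The genotype codings and the adjusted minimum P value described above can be sketched as follows. The regression fits themselves are not reproduced; the Wald P values are taken as inputs. Two simplifications are assumptions of this sketch rather than statements from the text: the optimal model is taken to be the one with the smallest Wald P value (the study selected it by BIC), and the adjusted value is capped at 1. All function names are hypothetical.

```python
def code_genotype(minor_allele_count, model):
    """Map a minor-allele count (0, 1, or 2) to a regression covariate."""
    if model == "additive":
        return minor_allele_count
    if model == "dominant":
        return 1 if minor_allele_count >= 1 else 0
    if model == "recessive":
        return 1 if minor_allele_count == 2 else 0
    raise ValueError(f"unknown genetic model: {model}")

def models_to_test(n_minor_hom, n_het, threshold=14):
    """Choose which genetic models have adequate cell counts."""
    if n_minor_hom >= threshold:
        return ["dominant", "additive", "recessive"]
    if n_minor_hom + n_het > threshold:
        return ["dominant"]
    return []

def adjusted_minimum_p(p_by_model):
    """Return (best model, Bonferroni-adjusted minimum P value).

    The factor of 3 applies only when all 3 models were tested; a SNP
    tested under the dominant model alone is not adjusted.
    """
    best = min(p_by_model, key=p_by_model.get)
    factor = 3 if len(p_by_model) == 3 else 1
    return best, min(1.0, p_by_model[best] * factor)
```

For instance, Wald P values of .01 (dominant), .002 (additive), and .2 (recessive) give the additive model with an adjusted minimum P value of .006.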
For nonautosomal SNPs, the analysis was performed stratified by sex. In women, dominant, additive, and recessive models were tested for SNPs with appropriate genotype frequencies. In men, only the dominant model (equal to the allelic model) was evaluated.
The top 120 SNPs by adjusted minimum P values were examined in the UK MRC data set under the same genetic model that was optimal for that SNP in the Canadian data set. For a dominant risk model, the UK MRC data set had an optimal power of approximately 80% to detect a genotypic odds ratio of 1.6 at P < .05 for a risk SNP with a minor allele frequency of 0.25 assuming disease prevalence of 5%.18 Given the smaller sample size from the UK MRC collection, we report those SNPs among the Canadian top 120 that were nominally significant at P < .05 in the UK MRC data set.
In the Canadian data set, Cox proportional hazards regression assessed the effect of each SNP genotype on age at onset of AD (SAS statistical software version 8.2). Controls were censored at the age individuals entered the study. Covariates were sex, ethnicity, number of APOE ε4 alleles, and study site. Education did not satisfy the proportional hazards assumption (by interaction terms with age); therefore, the analysis was stratified for education (collapsed into 3 categories: ≤ 10, 11-15, and > 15 years). Genetic models were compared by BIC, and adjusted minimum P values were reported as for the logistic regression model. The top 120 SNPs by adjusted minimum P values were examined in the UK MRC data set under the same genetic model that was optimal for that SNP in the Canadian data set. The SNP × APOE interaction was not tested in the Cox model.
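To illustrate how censoring enters the age-at-onset analysis, the sketch below computes a Kaplan-Meier (product-limit) estimate of the proportion remaining free of AD by age, treating each control as censored at the age of study entry. It is not the Cox model fitted in the study and includes no covariates or stratification; the function name `kaplan_meier` and the example ages are hypothetical.

```python
def kaplan_meier(ages, is_case):
    """Return [(age, proportion free of AD)] at each age with an onset.

    ages:    age at onset for cases, age at study entry for controls
    is_case: True for an observed onset, False for a censored control
    """
    records = sorted(zip(ages, is_case))
    at_risk = len(records)
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        age = records[i][0]
        onsets = 0
        leaving = 0
        # Group everyone who shares this age
        while i < len(records) and records[i][0] == age:
            onsets += 1 if records[i][1] else 0
            leaving += 1
            i += 1
        if onsets:
            survival *= 1 - onsets / at_risk
            curve.append((age, survival))
        at_risk -= leaving   # cases and censored controls both leave the risk set
    return curve

# Hypothetical example: 3 cases and 2 censored controls
example_curve = kaplan_meier([70, 72, 72, 75, 80],
                             [True, False, True, True, False])
```

Censored controls never lower the curve directly; they only shrink the risk set used for later onsets, which is the same role they play in the partial likelihood of a Cox model.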
To determine the statistical evidence for association of selected genes in the literature reported to be associated with AD and to exploit linkage disequilibrium (LD), we performed a gene-based permutation test incorporating covariate strata for SNPs within these genes. The permutation test determined the significance of the BIC test statistic for the most significant SNP in a gene, adjusted for the number of SNPs analyzed, the number of tests conducted, and the correlation between SNPs within each gene. Gene-based SNPs were defined by those within the most 5′ and 3′ exons of the longest gene transcripts.
For each permutation, disease status was shuffled among the AD cases and control cases within covariate strata, maintaining the overall number of AD cases and number of control cases in the observed data. The genetic data and covariates for each subject were not altered. For each permutation, all of the SNPs within a gene were analyzed using the same analytical tests applied to the true observed data. The minimum ΔBIC (among all of the tests for each gene) for the best-fit test was captured for each permutation. The permutations were repeated up to 2000 times such that up to 2000 minimum ΔBICs were captured. Once the permutations were completed, the minimum observed ΔBIC for the most significant SNP in the gene was compared against the permutation distribution of minimum ΔBIC. The proportion of permuted minimum ΔBICs that were less than the minimum observed ΔBIC yielded the empirical permutation P value for that gene.
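The structure of the gene-based permutation test can be sketched as below. To keep the example self-contained, the regression-based minimum ΔBIC is replaced by a deliberately simple stand-in statistic (the largest absolute case-control difference in mean minor-allele count across the SNPs of a gene), and permuted values at least as extreme as the observed value are counted. Both choices, and all function names, are assumptions of this sketch and not the statistic used in the study.

```python
import random

def gene_statistic(genotypes, status):
    """Stand-in for the minimum delta-BIC: max over SNPs of the absolute
    difference in mean minor-allele count between cases and controls."""
    cases = [g for g, s in zip(genotypes, status) if s == 1]
    controls = [g for g, s in zip(genotypes, status) if s == 0]
    best = 0.0
    for j in range(len(genotypes[0])):
        mean_case = sum(g[j] for g in cases) / len(cases)
        mean_ctrl = sum(g[j] for g in controls) / len(controls)
        best = max(best, abs(mean_case - mean_ctrl))
    return best

def permute_within_strata(status, strata, rng):
    """Shuffle disease status within each covariate stratum, so the number
    of cases and controls per stratum is unchanged."""
    permuted = list(status)
    by_stratum = {}
    for i, s in enumerate(strata):
        by_stratum.setdefault(s, []).append(i)
    for indices in by_stratum.values():
        labels = [status[i] for i in indices]
        rng.shuffle(labels)
        for i, label in zip(indices, labels):
            permuted[i] = label
    return permuted

def gene_permutation_p(genotypes, status, strata, n_perm=2000, seed=0):
    """Empirical P value: share of permutations whose gene statistic is at
    least as extreme as the observed one. Genotypes are never altered."""
    rng = random.Random(seed)
    observed = gene_statistic(genotypes, status)
    hits = 0
    for _ in range(n_perm):
        permuted = permute_within_strata(status, strata, rng)
        if gene_statistic(genotypes, permuted) >= observed:
            hits += 1
    return hits / n_perm
```

Because the maximum is taken over all SNPs in the gene within every permutation, the resulting P value accounts for the number of SNPs and for the correlation among them, which is the purpose of the gene-based test described above.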
The genomic control variance inflation factors (λ) between AD and control cases, estimated from allelic and genotypic Fisher exact tests of 2889 uncorrelated markers, were between 1.00 and 1.06 within each data set.19 This suggests minimal confounding by population stratification.
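For readers unfamiliar with genomic control, λ is commonly estimated as the median association test statistic across unlinked markers divided by the median of a 1-df chi-square distribution (approximately 0.455). The sketch below uses allelic 2 × 2 chi-square statistics; how λ was derived from the Fisher exact tests in this study is not described, so this is an assumed, simplified version with hypothetical function names.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom
CHI2_1DF_MEDIAN = 0.4549364

def allelic_chi2(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square (1 df) for a 2 x 2 table of allele counts."""
    n = case_a + case_b + ctrl_a + ctrl_b
    denom = ((case_a + case_b) * (ctrl_a + ctrl_b)
             * (case_a + ctrl_a) * (case_b + ctrl_b))
    if denom == 0:
        return 0.0   # a zero margin carries no information
    return n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / denom

def genomic_control_lambda(chi2_stats):
    """Median-based inflation factor; values near 1 suggest little
    inflation from population stratification."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN
```

A λ of 1.06, for example, means the median test statistic is 6% larger than expected under the null hypothesis.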
Within the Canadian data set, age (P < .001), sex (P = .005), education (P < .001), and number of APOE ε4 alleles (P < .001; odds ratio, 4.9 per additional ε4 allele) were independently associated with case-control status by logistic regression; site (P < .001) and APOE ε4 (P < .001; hazard ratio, 2.6) were independently associated with age at onset by Cox proportional hazards regression. French Canadian ethnicity was not significantly associated with age at onset or diagnosis. In the UK MRC data set, age (P < .001), education (P < .001), and APOE ε4 (P < .001; odds ratio, 4.6) were independently associated with case-control status, and APOE ε4 (P < .001; hazard ratio, 2.0) was associated with age at onset.
One SNP was associated with AD by genotypic Fisher exact test after genomewide Bonferroni adjustment: RS4420638 with nominal P = 2.3 × 10⁻⁴⁴ within the APOC1 gene (NM_001645), which was in strong LD with APOE on chromosome 19. This SNP was no longer significant after adjustment for the number of APOE ε4 alleles in the logistic and Cox regression analyses (adjusted minimum P > .32).
The 120 most significant SNPs by adjusted minimum P value in the Canadian data set were evaluated in the UK MRC data set. In the UK MRC data set, 3 of these SNPs had the following: (1) allele frequencies sufficient to allow multivariable analysis; (2) nominal significance at P < .05; (3) a risk genotype consistent with the Canadian data set; and (4) reliable genotyping by intensity plots: RS7019241 and RS10868366 (both within the same LD block in GOLPH2 [NM_016548] on chromosome 9), and RS9886784 (within a copy number deletion polymorphism on chromosome 9). Results and genomic contexts for these SNPs are in Table 2 and Figure 2A and B.
We examined whether the number of APOE ε4 alleles significantly and reproducibly modified the association of SNPs with AD. None of the top 25 SNPs ranked by SNP × APOE interaction P value had a significant interaction P value in the UK MRC data set, although power for the interaction test was low.
The 120 most significant SNPs by adjusted minimum P value from the Cox proportional hazards regression in the Canadian data set were evaluated in the UK MRC data set. One SNP was further supported in the UK MRC data set (according to the same conditions listed in the logistic regression results stated earlier): RS10519262 (intergenic on chromosome 15 between ATP8B4 [NM_024837] and SLC27A2 [NM_003645]). Results, genomic context, and Kaplan-Meier curves for RS10519262 in the Canadian and UK MRC data sets are shown in Table 2, Figure 2C, and Figure 3.
We examined SNPs within 26 genes with reported associations to AD by meta-analysis in the literature (Table 3).4 Six genes contained SNPs with an adjusted minimum P < .05 by logistic regression, of which only PRNP (NM_000311) (RS6017516, permutation P = .02) remained significant after the gene-based permutation test. Six genes contained SNPs with adjusted minimum P < .05 by Cox proportional hazards regression, of which only PSEN1 (NM_000021) (RS3025787, permutation P = .03) and SORCS1 (NM_001013031) (RS601883, permutation P = .02) remained significant after gene-based permutation (Table 3).
The availability of microarray platforms for genotyping thousands of SNPs across the genome provides an unprecedented opportunity to understand the genetic contributions to complex diseases. At least 2 high-density genomewide association studies have been published in AD.20-22 The association of RS4420638 within APOC1 with AD was also identified by Coon et al.20 The SNP is in strong LD with APOE, and its effects are eliminated by adjusting for the number of APOE ε4 alleles in the logistic and Cox regression analyses.
While there has been a tacit assumption that the association of APOE with AD is entirely explained by APOE ε4, other polymorphisms and regulatory elements within this LD block may influence AD pathogenesis, mitochondrial function, and drug response.23 The apolipoprotein E4 protein, and in particular a proteolytic E4 (1-272) fragment, binds to mitochondria and disrupts mitochondrial intracellular transport and synaptogenesis.23-25 Translocase of outer mitochondrial membrane 40 (TOMM40), also within the APOE LD region, forms mitochondrial import channels that accumulate amyloid precursor protein and result in mitochondrial dysfunction, potentially triggering the caspase apoptotic cascade.26 We have previously reported that SNPs near the peroxisome proliferator-activated receptor γ regulatory region in APOE (RS439401) further array response to rosiglitazone maleate therapy in AD within APOE genotypes.27,28 This is being examined in 3 large, ongoing, registration clinical trials scheduled to conclude in late 2008. The fact that TOMM40 (NM_006114) and the peroxisome proliferator-activated receptor γ response element are in LD with the APOE locus has led to an extensive long-range sequencing study to define evolutionary haplotypes within the APOE ε2, ε3, and ε4 backbones. Thus, other genetic variations may be mapped in an evolutionary context for contributing to the worsening effect of APOE ε4 or the beneficial effect of APOE ε2 and including differential effects of APOE ε3.
Four SNPs within the top 120 by P value from the multivariable adjusted analyses in the Canadian data set provided additional evidence for association within the UK MRC data set. RS7019241 and RS10868366 reside within introns of the GOLPH2 gene that encodes a type II Golgi transmembrane trafficking protein.29 The GOLPH2 gene is within the chromosomal region 9.22 to 9.34 with suggestive linkage to AD in the literature.30,31 Golgi phosphoprotein 2 is strongly expressed in the dentate gyrus and CA3 subfields of mouse hippocampus.32 The identified SNPs are in strong LD with SNPs within the GOLPH2 proximal promoter region and could potentially alter gene expression levels (Figure 2A).
RS10519262 is intergenic on chromosome 15 between ATP8B4 and SLC27A2. One polymorphism (RS7176805) in strong LD with this SNP extends into the ATP8B4 distal promoter region, potentially altering a CCAAT box transcription factor binding site (Figure 2C). ATP8B4 is an adenosine triphosphatase involved in phospholipid transport within the cell membrane, with low levels of expression in hippocampus, caudate, substantia nigra, and cerebellum.33
Controlling type I error (false positives) while preserving sufficient power to detect true associations in genomewide association analyses is challenging. Genomewide Bonferroni correction tends to be overly conservative.34,35 Instead, we used a 2-stage procedure to identify promising SNPs. Other analytic approaches may offer additional power (eg, joint analysis36); however, large-scale replication is ultimately required to identify and confirm the modest effect sizes of genetic polymorphisms on common diseases.35 Additional exploratory analyses (eg, based on haplotypes or pathway analyses) may generate further genetic hypotheses for testing.37
Thus, apart from variations within the APOE LD regions, genetic associations with sporadic AD appear weak. One hypothesis is that AD is the common clinical and neuropathological response to a broad range of etiological genetic and environmental factors; many genes (potentially including those suggested by meta-analysis and supported by our data) and alleles (as reported for SORL138,39) would then be expected to differentially influence disease across diverse populations. It is also possible that the elusive other genetic effects may be located in the TOMM40-APOE-APOC1 LD region, perhaps with specific TOMM40 polymorphisms being important contributors to the overall genetic effect ascribed to APOE ε4. The peroxisome proliferator-activated receptor γ response element within this LD region may further influence pathogenesis or response to drugs. To date, the magnitude of the association of this LD block with AD is uniquely consistent and reproducible and therefore should receive increased scientific scrutiny in light of the otherwise weak associations in the current generation of genomewide association studies.
Correspondence: Rachel A. Gibson, PhD, GlaxoSmithKline, New Frontiers Science Park, Third Avenue, Harlow CM19 5AW, England (email@example.com).
Published Online: November 12, 2007 (doi:10.1001/archneurol.2007.3).
Accepted for Publication: September 4, 2007.
Author Contributions: Study concept and design: L. Li, St. Jean, Upmanyu, Surh, Hosford, Ehm, Feldman, Hammond, Keren, Loy-English, Richardson, Saunders, St. George-Hyslop, Swartz, Gibson, Irizarry, Middleton, and Roses. Acquisition of data: H. Li, L. Li, Briley, Borrie, Coletta, Delisle, Dhalla, Feldman, Fornazzari, Gauthier, Goodgame, Guzman, Hammond, Hollingworth, Hsiung, Johnson, Kelly, Keren, Kertesz, King, Lovestone, Loy-English, Matthews, Owen, Pryse-Phillips, Prinjha, Slater, St. George-Hyslop, Stinnett, Taylor, Wherrett, Williams, Yarnall, Gibson, Irizarry, and Middleton. Analysis and interpretation of data: H. Li, Wetten, L. Li, St. Jean, Upmanyu, Barnes, Feldman, Hollingworth, Hsiung, Matthews, Plumpton, Prinjha, Saunders, Gibson, and Irizarry. Drafting of the manuscript: H. Li, St. Jean, Upmanyu, Barnes, Matthews, Plumpton, Prinjha, Richardson, Gibson, and Irizarry. Critical revision of the manuscript for important intellectual content: Wetten, L. Li, St. Jean, Upmanyu, Surh, Hosford, Barnes, Briley, Borrie, Coletta, Delisle, Dhalla, Ehm, Feldman, Fornazzari, Gauthier, Goodgame, Guzman, Hammond, Hollingworth, Hsiung, Johnson, Kelly, Keren, Kertesz, King, Lovestone, Loy-English, Owen, Pryse-Phillips, Prinjha, Saunders, Slater, St. George-Hyslop, Stinnett, Swartz, Taylor, Wherrett, Williams, Yarnall, Gibson, Middleton, and Roses. Statistical analysis: H. Li, Wetten, L. Li, St. Jean, Upmanyu, Barnes, Ehm, Hollingworth, Saunders, and Irizarry. Obtained funding: Lovestone, Owen, Williams, and Roses. Administrative, technical, and material support: Hosford, Briley, Coletta, Dhalla, Feldman, Goodgame, Guzman, Hammond, Kelly, Kertesz, King, Loy-English, Matthews, Pryse-Phillips, Prinjha, Stinnett, Swartz, Taylor, Williams, Yarnall, and Gibson. Study supervision: Surh, Coletta, Dhalla, Ehm, Fornazzari, Hsiung, Kertesz, Lovestone, Matthews, Owen, Prinjha, Wherrett, Gibson, Irizarry, Middleton, and Roses.
Financial Disclosure: Drs H. Li, L. Li, St. Jean, Surh, Hosford, Barnes, Ehm, Matthews, Plumpton, Pryse-Phillips, Prinjha, Richardson, Saunders, Swartz, Yarnall, Gibson, Irizarry, Middleton, and Roses, Mss Wetten, Upmanyu, Coletta, Dhalla, Hammond, Johnson, King, Stinnett, and Taylor, and Messrs Briley, Goodgame, Kelly, and Slater are stock- and option-holding employees of GlaxoSmithKline. Dr Feldman is a paid consultant for GlaxoSmithKline. GlaxoSmithKline has drug and biomarker development programs in Alzheimer disease. Dr Gauthier was a consultant for GlaxoSmithKline at the time of the study.
Funding/Support: The study was funded by GlaxoSmithKline.
Role of the Sponsor: The sponsor was involved in the design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, and approval of the manuscript.
Additional Contributions: Kevin Canning, PhD, Lukasz Mis, MBiotech, David Krakovsky, BScPhm, PharmD, and Kevin Fehr, PhD, GlaxoSmithKline, Inc, Mississauga, Ontario, Canada, Zaheer Anwar, BSc, Jamie Logan, BSc, Sarah Flaskett, BSc, Tina Stapleton, Adam Whittaker, BSc, and Ramya Viknaraja, MSc, GlaxoSmithKline, Harlow and Greenford, England, Zaven Khachaturian, PhD, Khachaturian, Radebaugh and Associates, Inc, Potomac, Maryland, Jill Ratchford, BS-MT, and Stephanie Shouse, BS, GlaxoSmithKline, Research Triangle Park, North Carolina, Xin Yuan, PhD, GlaxoSmithKline, King of Prussia, Pennsylvania, and Nawab Qizilbash, MBChB, MRCP, DPhil, GlaxoSmithKline, Harlow, and Oxon Clinical Epidemiology, Ltd, London, England, provided administrative, technical, and material support. Kevin Canning, PhD, Lukasz Mis, MBiotech, and David Krakovsky, BScPhm, PharmD, GlaxoSmithKline, Inc, Mississauga, and Jamie Logan, BSc, and Sarah Flaskett, BSc, GlaxoSmithKline, Harlow and Greenford, acquired data. Julie Davidson, MPH, Nicholas Galwey, PhD, Thomas E. Nichols, PhD, and David Wille, PhD, GlaxoSmithKline, Harlow and Greenford, and Matthew R. Nelson, PhD, and Silviu-Alin Bacanu, PhD, GlaxoSmithKline, Research Triangle Park, provided genetic analysis advice.