Before (A) and after (B) quality control quantile-quantile plots of Cochran-Armitage trend test P values of autosomal single-nucleotide polymorphisms (SNPs). The SNPs at which the test statistic exceeds 30 are represented by triangles at the top of the plot.
Genome-wide overview of the genome-wide association study findings. The single-nucleotide polymorphisms at which the test statistic exceeds 30 are represented by the triangle at the top of the plot.
Illustration of P values in a ±1-megabase interval around the most significant findings at 2q35: rs1344694, rs7590720, and rs705648. Gene annotations are from University of California Santa Cruz RefSeq genes; linkage disequilibrium maps are from the International HapMap Project. SNP indicates single-nucleotide polymorphism.
Illustration of P values in a ±1-megabase interval around the most significant finding, rs1864982, at 5q32. Gene annotations are from University of California Santa Cruz RefSeq genes; linkage disequilibrium maps are from the International HapMap Project. SNP indicates single-nucleotide polymorphism.
Illustration of P values in a ±1-megabase interval around the most significant finding, rs12388359, at Xp22.2. Gene annotations are from University of California Santa Cruz RefSeq genes; linkage disequilibrium maps are from the International HapMap Project. SNP indicates single-nucleotide polymorphism.
Treutlein J, Cichon S, Ridinger M, Wodarz N, Soyka M, Zill P, Maier W, Moessner R, Gaebel W, Dahmen N, Fehr C, Scherbaum N, Steffens M, Ludwig KU, Frank J, Wichmann HE, Schreiber S, Dragano N, Sommer WH, Leonardi-Essmann F, Lourdusamy A, Gebicke-Haerter P, Wienker TF, Sullivan PF, Nöthen MM, Kiefer F, Spanagel R, Mann K, Rietschel M. Genome-wide Association Study of Alcohol Dependence. Arch Gen Psychiatry. 2009;66(7):773-784. doi:10.1001/archgenpsychiatry.2009.83
Alcohol dependence is a serious and common public health problem. It is well established that genetic factors play a major role in the development of this disorder. Identification of genes that contribute to alcohol dependence will improve our understanding of the mechanisms that underlie this disorder.
To identify susceptibility genes for alcohol dependence through a genome-wide association study (GWAS) and a follow-up study in a population of German male inpatients with an early age at onset.
The GWAS tested 524 396 single-nucleotide polymorphisms (SNPs). All SNPs with P < 10−4 were subjected to the follow-up study. In addition, nominally significant SNPs from genes that had also shown expression changes in rat brains after long-term alcohol consumption were selected for the follow-up step.
Five university hospitals in southern and central Germany.
The GWAS included 487 male inpatients with alcohol dependence as defined by the DSM-IV and an age at onset younger than 28 years and 1358 population-based control individuals. The follow-up study included 1024 male inpatients and 996 age-matched male controls. All the participants were of German descent.
Main Outcome Measures
Significant association findings in the GWAS and follow-up study with the same alleles.
The GWAS produced 121 SNPs with nominal P < 10−4. These, together with 19 additional SNPs from homologues of rat genes showing differential expression, were genotyped in the follow-up sample. Fifteen SNPs showed significant association with the same allele as in the GWAS. In the combined analysis, 2 closely linked intergenic SNPs met genome-wide significance (rs7590720, P = 9.72 × 10−9; rs1344694, P = 1.69 × 10−8). They are located on chromosome region 2q35, which has been implicated in linkage studies for alcohol phenotypes. Nine SNPs were located in genes, including the CDH13 and ADH1C genes, that have been reported to be associated with alcohol dependence.
This is the first GWAS and follow-up study to identify a genome-wide significant association in alcohol dependence. Further independent studies are required to confirm these findings.
Alcohol dependence is characterized by a cluster of cognitive, behavioral, and physiologic symptoms, with an affected individual continuing to drink despite significant alcohol-induced impairment or distress. Because alcohol affects most human organs, its abuse is associated with a wide range of types of physical, mental, and social harm. According to the World Health Organization,1 alcohol abuse constitutes a serious public health problem worldwide, accounting for 4% of the global burden, a burden comparable with the death and disability attributable to tobacco use and hypertension.1,2
Alcohol dependence is a phenotypically heterogeneous disorder that runs in families and has high genetic loading. Twin and adoption studies3-5 have shown that 40% to 60% of the interindividual phenotypic variance is accounted for by genetic factors. In congruence with the observed phenotypic heterogeneity of alcohol dependence, vulnerability to alcohol dependence on the molecular level is thought to be mediated by many genetic loci of small to modest effects in Europeans.6-14 Despite strenuous efforts and numerous linkage and candidate gene studies, identification of the underlying susceptibility genes has proved to be difficult. Genome-wide association studies (GWASs) conducted in other complex disorders have been shown to be a successful tool in identifying underlying susceptibility genes (for all published GWASs, see http://www.genome.gov/26525384) and have already resulted in the identification of genetic susceptibility variants for psychiatric disorders such as nicotine addiction, schizophrenia, and bipolar disorder,15-24 which were mostly not detected using other diagnosis-based discovery approaches. For alcohol dependence, only one GWAS using pooled DNA samples has been published to date, and it has proposed several new susceptibility loci for alcohol dependence. Their products are implicated in cellular signaling, gene regulation, development, and cell adhesion.25 Convergent translational approaches integrate genetic findings from animal models with a candidate gene or GWAS approach in humans and have proved to be very successful.26,27 The goal of the present study is to conduct a GWAS and a follow-up study for alcohol dependence using individual genotyping in samples stratified for homogeneity with respect to sex, ethnicity, age at onset, and recruitment procedures.
To increase the explanatory power of the findings, we applied a convergent functional genomics28 approach to integrate findings from gene expression data in alcohol-dependent rats with the GWAS findings.
Patients were recruited from consecutive admissions to the psychiatric and addiction medicine departments of 5 different study centers at university hospitals across southern and central Germany: Regensburg, Mannheim, Munich, Bonn/Essen/Dusseldorf/Homburg, and Mainz. These centers are members of the German Addiction Research Network (GARN; http://www.bw-suchtweb.de), for which one of us (K.M.) is the spokesperson. All the patients included in the study had alcohol dependence of such severity that hospitalization for the treatment or prevention of severe withdrawal symptoms was warranted. All the patients received a diagnosis of alcohol dependence (per DSM-IV criteria) by the consensus of 2 clinical psychiatrists. In the German Addiction Research Network study, DSM-IV criteria were ascertained systematically through the use of semistructured and independently rated interviews conducted by trained staff members.
In Munich and Bonn, patients were assessed using the Semi-Structured Assessment for the Genetics of Alcoholism29; in Mainz, the Composite International Diagnostic Interview30 was used; and in Regensburg, Essen/Dusseldorf/Homburg, and Mannheim, the Structured Clinical Interview for DSM-IV31 was used. The latter tool was also applied in Munich. All the patients and controls were of self-reported German ancestry. The study was approved by all the relevant local ethics committees, and all the participants provided written informed consent.
To increase the power of the study, we increased the homogeneity of the patient sample by including male patients only. Furthermore, only patients with an early age at onset of alcohol dependence were included, a feature that has higher heritability in males.32,33 Patients included in the GWAS (Mannheim, n = 98; Bonn/Essen/Dusseldorf/Homburg, n = 30; Mainz, n = 30; Munich, n = 53; and Regensburg, n = 276) had an age at onset younger than 28 years (median, 20 years; mean [SD], 21 [4.1] years), defined as the age at which DSM-IV criteria for alcohol dependence were fulfilled for the first time.
The follow-up sample consisted of 1024 German males recruited at the same sites and under the same protocol as the patients in the GWAS (Mannheim, n = 256; Bonn/Essen/Dusseldorf/Homburg, n = 149; Mainz, n = 63; Munich, n = 136; and Regensburg, n = 420). Because most of the patients with an earlier age at onset had already been included in the GWAS, age at onset for the follow-up study was necessarily less stringent and was defined as younger than 45 years (median, 30 years; mean [SD], 29.3  years). Phenotypes and genotypes resulting from the GWAS are stored in a comprehensive database at the Institute for Medical Biometry, Informatics, and Epidemiology in Bonn.
A total of 1358 control subjects were included in the GWAS. Controls were taken from 3 population-based epidemiologic studies: 487 from PopGen,34 488 from KORA-gen,35 and 383 from the Heinz Nixdorf Risk Factors, Evaluation of Coronary Calcium, and Lifestyle (RECALL) Study.36 These 3 recruitment areas are located in Schleswig-Holstein (northern Germany), Augsburg (southern Germany), and Essen/Bochum/Mulheim (Ruhr Area, western Germany). The Population-Based Recruitment of Patients and Controls for the Analysis of Complex Genotype-Phenotype Relationships (PopGen) project (http://www.science.ngfn.de/10_233.htm; http://www.popgen.de) was initiated to provide all locally prevalent cases with the diseases in question for the disease-oriented projects from the National Genome Research Project (NGFN) and population-based control samples that will ultimately be comprised of 7200 persons. The Cooperative Health Research in the Region of Augsburg (KORA-gen) project (http://www.science.ngfn.de/10_234.htm; http://www.gsf.de/KORA), which has evolved from the World Health Organization MONICA (Monitoring of Trends and Determinants of Cardiovascular Disease) study, contains the biosamples, phenotypic characteristics, and environmental parameters of 18 000 adults from Augsburg and the surrounding regions. The biological specimen bank was established to enable researchers to perform epidemiologic research regarding molecular and genetic factors. The Heinz Nixdorf RECALL Study (http://www.recall-studie.uni-essen.de) was started in the year 2000 as a prospective cohort study to determine predictors of coronary heart disease. It includes the biosamples, phenotypic characteristics, and environmental parameters of 4500 adults. PopGen and KORA-gen are the two major German biobanking resources available to participants in the NGFN (http://www.ngfn.de/) for use as universal controls,37 and they have already been used as samples in other GWASs.37,38
Male control individuals for the follow-up study were drawn from the KORA-gen study (n = 382) and from a population-based sample (n = 614) that had been collected in the area of Bonn by one of us (M.R.) for use as a control sample for association studies in the framework of the NGFN. Controls were not stratified for alcohol abuse or dependence, which may introduce a conservative bias. Tests for genetic differentiation between German populations have shown low levels of population substructure, thus demonstrating that the German population is an appropriate source for use in association studies of complex diseases.39
Genomic DNA was prepared from whole blood according to standard procedures. For all samples, DNA concentrations were quantified in triplicate measurements using the PicoGreen dsDNA Quantitation Kit (Invitrogen Corporation, Carlsbad, California) and normalized to 50 ng/μL at the Central Institute of Mental Health Molecular Genetics Laboratory. For the GWAS, all samples were genotyped individually using HumanHap 550 BeadChips (Illumina Inc, San Diego, California). The patient and the PopGen and KORA-gen samples were genotyped at Illumina Inc, and the Heinz Nixdorf RECALL Study sample was genotyped at the Department of Genomics at the Life and Brain Center, University of Bonn. Genotyping of the follow-up sample was performed by means of primer extension reaction chemistry with matrix-assisted laser desorption/ionization time-of-flight mass spectrometry using the iPLEX Assay (Sequenom Inc, San Diego) at the Life and Brain Center, University of Bonn. Genotyping results were imported into a central computer system for statistical analysis.
Data analysis and quality control (QC) were performed using the software packages R version 2.5.1 (http://www.R-project.org) and PLINK.40 In the GWAS, genotype data were cleaned before analysis by removing single-nucleotide polymorphisms (SNPs) or individuals not fulfilling the QC criteria, which included a SNP call proportion of at least 95%, a subject completeness proportion of at least 95%, a SNP minor allele frequency of at least 0.01, and SNP conformity with Hardy-Weinberg equilibrium expectations (P ≥ .01 in controls). To correct for cryptic relatedness, all pairs of individuals displaying an identity-by-state value larger than 1.6 were marked, and for each pair, the individual with the lower typing rate was removed from the analysis. Cochran-Armitage trend statistics were used to test for association at autosomal SNPs (eTable 1; for all supplementary materials, see http://www.zi-mannheim.de/pub_gwas.html). To visualize the outcome of the QC steps, Cochran-Armitage P values were depicted in a quantile-quantile plot (Figure 1). We observed good adherence of P values to the line of expectation, which implies that potential spurious associations characterized by an inflation of highly significant P values were successfully removed by the QC measures. The remaining slight deviations from the line of expectation are interpreted to include true genetic effects. Further correction for λ41 improved the quantile-quantile plot of Cochran-Armitage P values (eFigure 1).
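To make the association test named above concrete, the following is a minimal Python sketch of a Cochran-Armitage trend test for a single SNP. The additive genotype scores (0, 1, 2), the function name, and the example genotype counts are assumptions chosen for illustration; the study itself used R and PLINK, and this sketch is not the authors' code.

```python
import math


def cochran_armitage_trend(case_counts, control_counts, weights=(0, 1, 2)):
    """Return (chi-square statistic, two-sided P value) for one SNP.

    case_counts / control_counts: genotype counts ordered (AA, AB, BB).
    weights: per-genotype scores; (0, 1, 2) is an assumed additive coding.
    """
    n_case = sum(case_counts)
    n_ctrl = sum(control_counts)
    n = n_case + n_ctrl
    col = [r + s for r, s in zip(case_counts, control_counts)]

    # Trend statistic: weighted difference between case and control counts.
    t = sum(w * (r * n_ctrl - s * n_case)
            for w, r, s in zip(weights, case_counts, control_counts))

    # Variance of the trend statistic under the null hypothesis.
    sum_sq = sum(w * w * c * (n - c) for w, c in zip(weights, col))
    cross = sum(weights[i] * weights[j] * col[i] * col[j]
                for i in range(len(col)) for j in range(i + 1, len(col)))
    var = n_case * n_ctrl / n * (sum_sq - 2 * cross)
    if var <= 0:
        return 0.0, 1.0  # monomorphic SNP: no information

    chi2 = t * t / var
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts, for illustration only.
chi2, p = cochran_armitage_trend((10, 30, 60), (60, 30, 10))
print(f"chi2 = {chi2:.2f}, P = {p:.3g}")
```

The statistic is asymptotically chi-square distributed with 1 degree of freedom, which is why the P value can be obtained from the complementary error function without additional libraries.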
Three groups of 2- to 3-month-old alcohol-preferring rats were used for long-term alcohol consumption and gene expression profiling: male P rats (n = 15) (Indiana University, Indianapolis), male HAD rats (n = 13) (Indiana University), and male AA rats (n = 14) (National Public Health Institute, Helsinki, Finland). Each rat strain shows alcohol preference due to various neurochemical alterations, as indicated by the abbreviations P (preferring), HAD (high alcohol drinking), and AA (Alko alcohol addicted).42,43 All experimental procedures were approved by the Committee on Animal Care and Use (Regierungspräsidium Karlsruhe) and were performed in accordance with the local Animal Welfare Act and the European Communities Council Directive of November 24, 1986 (86/609/EEC).
According to the protocol of Vengeliene et al,44,45 8 P rats, 7 HAD rats, and 7 AA rats were given ad libitum access to tap water and to 5% and 20% ethanol solution (vol/vol). All the rats underwent a 2-week deprivation cycle after 8 weeks of continuous alcohol availability. After the deprivation period, rats were given access to alcohol again, and 3 more 2-week deprivation periods were introduced in a random manner (the duration between deprivation periods varied between 4 and 16 weeks). The long-term voluntary alcohol-drinking procedure, including all deprivation phases, lasted 52 weeks. Total ethanol intake (grams per kilogram of body weight per day) was calculated as the daily average across 7 measuring days. For comparison, 3 age- and weight-matched control groups, consisting of 7 P rats, 6 HAD rats, and 7 AA rats, experienced identical handling procedures for the entire duration of the experiment but did not receive alcohol.
Preparation of brain samples and RNA isolation are described in the eText. Target preparation was performed for individual samples from the caudate putamen and the amygdala using 5 μg of total RNA. Hybridization to RG_U34A arrays, staining, washing, and scanning of the chips were performed according to the manufacturer's technical manual (Affymetrix, Santa Clara, California).
Microarray Suite 5.0 (Affymetrix)–derived cell intensity files were processed using the R 2.1.1 language and environment (http://www.R-project.org) and Bioconductor 1.646 packages. Each array was inspected for regional hybridization bias and QC parameters as recently described.47 Fifty-three arrays (27 from the caudate putamen and 26 from the amygdala) passed through the quality filter and were included in the statistical analysis. Of the 8799 probe sets on the RG_U34A arrays, only those with intensity values greater than 100 in at least 25% of the samples were retained (6344 probe sets). A 3-way analysis of variance (ANOVA) was used to identify differentially expressed genes across strain, brain region, and treatment. The list of genes affected by long-term ethanol consumption included those with a P < .05 for treatment from the 3-way analysis of variance. Post hoc analysis was performed via template matching across strains. Correlation coefficients (r) were calculated for consistent upregulation or downregulation by ethanol in the caudate putamen, the amygdala, or both.
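The probe set retention rule described above (intensity greater than 100 in at least 25% of the samples) can be sketched as follows. This is an illustrative Python fragment, not the R/Bioconductor code used in the study; the function name, the dictionary layout, and the probe identifiers are hypothetical.

```python
def retain_probe_sets(intensities, threshold=100.0, min_fraction=0.25):
    """Return the probe set IDs that pass the expression filter.

    intensities: dict mapping a probe set ID to a list of per-array
    intensity values (one value per array that passed quality control).
    """
    kept = []
    for probe_id, values in intensities.items():
        if not values:
            continue  # no data for this probe set
        n_above = sum(1 for v in values if v > threshold)
        if n_above / len(values) >= min_fraction:
            kept.append(probe_id)
    return kept


# Hypothetical intensities for three probe sets across four arrays.
demo = {
    "probe_demo_1": [150.0, 90.0, 80.0, 70.0],   # 1 of 4 above 100: kept
    "probe_demo_2": [99.0, 100.0, 50.0, 20.0],   # none strictly above: dropped
    "probe_demo_3": [500.0, 400.0, 300.0, 10.0],  # 3 of 4 above 100: kept
}
print(retain_probe_sets(demo))
```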
The SNPs found to be associated with the disorder in the GWAS with P < 1 × 10−3 or found to lie in a SNP cluster (defined as ≥2 SNPs with P ≤ 1 × 10−2 and at least 1 SNP with P < 1 × 10−3 in a distance of <30 kilobase [kb]) (examples of which can be seen in eFigures 2, 3, and 4) were taken forward to the follow-up study if they were located in a homologue of a rat gene showing differential brain expression with long-term high ethanol consumption compared with their respective controls without ethanol access. Only 1 SNP was selected for each gene. If more than 1 SNP in the gene fulfilled the previously mentioned criteria, the SNP with the highest number of assigned transcripts (Sullivan et al annotation file: https://slep.unc.edu/evidence/) was taken forward. In cases of similar transcript numbers, the SNP with the lowest P value was chosen.
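The SNP cluster definition given above can be sketched in Python as follows. The interpretation used here is an assumption: a cluster is anchored on a lead SNP with P < 1 × 10−3, and its members are all SNPs with P ≤ 1 × 10−2 lying less than 30 kb from that lead SNP. The text does not state whether distance was measured from the lead SNP or across the whole cluster, and the function name and example data are hypothetical.

```python
def find_snp_clusters(snps, lead_p=1e-3, member_p=1e-2, max_dist_bp=30_000):
    """Return clusters as lists of SNP IDs, one list per qualifying lead SNP.

    snps: list of (snp_id, position_bp, p_value) tuples on one chromosome.
    """
    # Only SNPs at or below the member threshold can belong to a cluster.
    candidates = sorted((s for s in snps if s[2] <= member_p),
                        key=lambda s: s[1])
    clusters = []
    for lead_id, lead_pos, lead_pval in candidates:
        if lead_pval >= lead_p:
            continue  # not significant enough to anchor a cluster
        members = [sid for sid, pos, _ in candidates
                   if abs(pos - lead_pos) < max_dist_bp]
        if len(members) >= 2:  # the lead SNP plus at least one neighbour
            clusters.append(members)
    return clusters


# Hypothetical SNPs on a single chromosome.
demo_snps = [
    ("snp_a", 1_000, 5e-4),
    ("snp_b", 20_000, 8e-3),
    ("snp_d", 40_000, 9e-3),
    ("snp_c", 500_000, 5e-4),
]
print(find_snp_clusters(demo_snps))
```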
In the animal study, we used a convergent translational approach with 2 strategic lines to integrate the rat data:
1. A qualitative approach looking for orthologous or paralogous genes in the GWAS and expression profiling in rats (see the list of genes in eTable 2 and eTable 3).
2. A semiquantitative ranking strategy, as theoretically described by Bertsch et al,28 in which those orthologous or paralogous genes were included. In this strategy, a weighting ratio of 1 (both data sets with equal weights) was used, multiplying the P values of respective rat and human genes.
Using strategy 1, 19 additional SNPs were identified and were ranked according to strategy 2. Because the total number of SNPs derived through the animal approach was only 19, we eventually decided to include all of them in the follow-up study instead of selecting a small subset based on the ranking order of genes. The 3 genes that were confirmed by the follow-up study (Table 1) had been ranked No. 7 (ADH1C) (alcohol dehydrogenase 1C, gamma polypeptide) (OMIM *103730), No. 12 (CDH13) (cadherin 13) (OMIM *601364), and No. 19 (GATA4) (GATA-binding protein 4) (OMIM *600576) using strategy 2. We conclude from these data that it is difficult to introduce quantitative evaluations into convergent approaches. Nevertheless, convergent approaches as pursued in this study can provide valuable additional information and are apparently becoming increasingly popular.
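The ranking strategy with a weighting ratio of 1 can be illustrated with a short Python sketch that multiplies the human and rat P values for each shared gene and sorts by the product. The function name, gene labels, and P values below are hypothetical; how genes with several SNPs or probe sets were reduced to a single P value is not specified in the text and is not modeled here.

```python
def rank_genes(human_p, rat_p):
    """Rank genes present in both data sets by the product of their P values.

    human_p, rat_p: dicts mapping a gene label to a single P value.
    Returns a list of (gene, combined_score), smallest score first.
    """
    shared = human_p.keys() & rat_p.keys()
    # Equal weighting (ratio of 1): the score is the plain product.
    scored = [(gene, human_p[gene] * rat_p[gene]) for gene in shared]
    # Sort by score, then by gene label so that ties are deterministic.
    return sorted(scored, key=lambda item: (item[1], item[0]))


# Hypothetical P values, for illustration only.
human = {"GENE_A": 1e-3, "GENE_B": 1e-2, "GENE_C": 5e-4}
rat = {"GENE_A": 0.04, "GENE_B": 0.001, "GENE_D": 0.01}
for gene, score in rank_genes(human, rat):
    print(gene, score)
```

Genes found in only one of the two data sets are dropped, which mirrors the requirement that a human gene have a rat homologue with expression data.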
We searched the Gene Ontology database (http://amigo.geneontology.org/cgi-bin/amigo/go.cgi) for the annotation of gene products for genes in which confirmed SNPs of the follow-up study are located. The Gene Ontology database provides a standardized vocabulary describing gene products to ensure uniformity across databases.48 We investigated whether these genes share distribution patterns in Gene Ontology categories by mapping them to Gene Ontology annotations.
We excluded 11 of the 487 GWAS cases with alcohol dependence from further analyses. Of these, 6 DNA samples failed to genotype, and, in 5 individuals, the genome-wide identity-by-state score was greater than 1.60, indicating possible cryptic relatedness. Of the 561 466 SNPs genotyped per individual, 93.4% passed QC (Figure 1). The final data set comprised 524 396 SNPs (511 701 autosomal and 12 695 sex-linked markers) in 476 male cases and 1358 male controls. The mean (SD) call rate of the final set of SNPs (cases and controls) was 99.76% (0.45%). A genome-wide overview of the GWAS association results is shown in Figure 2.
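The marker-level QC criteria stated in the Methods (SNP call proportion of at least 95%, minor allele frequency of at least 0.01, and Hardy-Weinberg P ≥ .01 in controls) can be sketched as a simple filter. This Python fragment assumes that the per-SNP summary statistics have already been computed; the class name, field names, and example records are hypothetical, and the study applied these filters with R and PLINK rather than with this code.

```python
from dataclasses import dataclass


@dataclass
class SnpSummary:
    snp_id: str
    call_rate: float        # proportion of samples with a genotype call
    maf: float              # minor allele frequency
    hwe_p_controls: float   # Hardy-Weinberg P value in controls


def passes_qc(snp, min_call_rate=0.95, min_maf=0.01, min_hwe_p=0.01):
    """Return True if the SNP meets all three marker-level criteria."""
    return (snp.call_rate >= min_call_rate
            and snp.maf >= min_maf
            and snp.hwe_p_controls >= min_hwe_p)


# Hypothetical SNP summaries, for illustration only.
demo_summaries = [
    SnpSummary("rs_demo_1", 0.99, 0.20, 0.50),   # passes
    SnpSummary("rs_demo_2", 0.90, 0.20, 0.50),   # fails call rate
    SnpSummary("rs_demo_3", 0.99, 0.005, 0.50),  # fails allele frequency
    SnpSummary("rs_demo_4", 0.99, 0.20, 0.001),  # fails Hardy-Weinberg
]
retained = [s.snp_id for s in demo_summaries if passes_qc(s)]
print(retained)
```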
We used 2 complementary SNP selection strategies, a “lowest P value” strategy and a “rodent candidate gene” strategy, to prioritize SNPs for follow-up. In total, 139 SNPs were carried forward for genotyping in the follow-up sample, including all 121 SNPs with a P < 10−4 (Cochran-Armitage trend test for autosomes and allelic test for X chromosome), of which an assay design was possible for 120 (eTable 1), and an additional 19 SNPs with at least nominal significance, which would not have been accounted for by the lowest P value approach. These 19 SNPs are among 22 SNPs (3 of which were already represented in the aforementioned lowest P value selection) located in human homologues of rat genes showing differential expression in the rat brain after long-term alcohol consumption (for more details see the eText) and, therefore, have a higher a priori probability of being involved in the etiology of alcohol dependence.
In the follow-up study, 16 SNPs showed association with at least nominal significance (P < .05, 2-sided), 15 of which (9 intragenic and 6 intergenic) were associated with the same allele as in the GWAS. The number of significantly replicated SNPs is higher than expected by chance (P = 2.45 × 10−6). Three of the intragenic SNPs are derived from the 22 SNPs selected on the basis of the animal model (Table 1). This number is also significantly higher than that expected by chance (P = 1.69 × 10−2).
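The comparison with chance expectation can be illustrated with a binomial tail probability. The sketch below is an assumption about how such a calculation could be set up, not a statement of the authors' method: each follow-up SNP is treated as an independent trial that "succeeds" with probability 0.025 (2-sided P < .05 with the same allele as in the GWAS), with 139 trials for the full follow-up set and 22 trials for the animal-model set. Under these assumptions the two tail probabilities appear close to the values reported above.

```python
import math


def binomial_upper_tail(k, n, p):
    """Probability of observing at least k successes in n independent trials."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))


# Assumed per-SNP probability of replicating by chance with the same allele.
P_CHANCE = 0.025

# 15 same-allele replications among 139 follow-up SNPs.
p_all = binomial_upper_tail(15, 139, P_CHANCE)
# 3 same-allele replications among the 22 animal-model SNPs.
p_animal = binomial_upper_tail(3, 22, P_CHANCE)

print(f"all follow-up SNPs: {p_all:.3g}")
print(f"animal-model SNPs:  {p_animal:.3g}")
```

The independence assumption ignores linkage disequilibrium between neighbouring SNPs, so the sketch should be read as an order-of-magnitude check only.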
Combining the data across the 2 samples showed genome-wide significance (Bonferroni-adjusted α level .05/524 000 [P < 9.5 × 10−8])49,50 for 2 SNPs, rs7590720 (P = 9.72 × 10−9) and rs1344694 (P = 1.69 × 10−8). These SNPs map to the 3′-flanking region of the gene encoding peroxisomal trans-2-enoyl-CoA [coenzyme A] reductase (PECR), located on chromosome region 2q35. They are in linkage disequilibrium with each other and with another SNP, rs705648 (P = 1.78 × 10−6), located in the PECR gene (rs1344694-rs7590720: D′ = 1.0, r2 = 0.739; rs1344694-rs705648: D′ = 0.943, r2 = 0.568; and rs7590720-rs705648: D′ = 0.948, r2 = 0.776; International HapMap Project).
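The Bonferroni threshold quoted above follows directly from dividing the α level by the number of tests. The short Python sketch below reproduces that arithmetic and compares the three combined P values given in the text against it; the function and variable names are illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


threshold = bonferroni_threshold(0.05, 524_000)

# Combined-analysis P values as reported in the text.
combined_p = {
    "rs7590720": 9.72e-9,
    "rs1344694": 1.69e-8,
    "rs705648": 1.78e-6,
}
genome_wide = {snp: p < threshold for snp, p in combined_p.items()}

print(f"threshold = {threshold:.3g}")
print(genome_wide)
```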
A further 8 SNPs are located in genes: rs13362120 (combined P = 1.85 × 10−5) in the calpastatin (CAST) gene; rs13160562 (combined P = 7.09 × 10−6) in the endoplasmic reticulum aminopeptidase 1 (ERAP1) and calpastatin (CAST) genes; rs1864982 (combined P = 3.46 × 10−6) in the protein phosphatase 2 (formerly 2A), regulatory subunit B, beta isoform (PPP2R2B) gene; rs6902771 (combined P = 8.30 × 10−6) in the estrogen receptor 1 (ESR1) gene; rs7138291 (combined P = 3.68 × 10−5) in the coiled-coil domain containing 41 (CCDC41) gene; rs1614972 (combined P = 1.41 × 10−4) in the alcohol dehydrogenase 1C (ADH1C) gene; rs13273672 (combined P = 4.75 × 10−4) in the GATA binding protein 4 (GATA4) gene; and rs11640875 (combined P = 1.84 × 10−5) in the cadherin 13 (CDH13) gene. The first 5 SNPs were selected for the follow-up study using the lowest P value strategy and the latter 3 via the rodent candidate gene approach. These findings and those of the intergenic SNPs are provided in Table 1, Table 2, eTable 4, and eTable 5.51 Figures 3, 4, and 5 show P values of the SNPs located in ±1-Mb intervals around the most significant findings at 2q35, 5q32, and Xp22.2.
Specific Gene Ontology database biological process, cellular component, and molecular function terms remained largely unique for each of the genes and were not found to be enriched in the genes analyzed. No specific coherent underlying process could be deduced from Gene Ontology database term associations. Consequently, we considered the literature-derived molecular function of each of the gene products of intragenic SNPs separately.
The GeneChip Rat Genome U34 Set (Affymetrix) provides gene expression data for more than 24 000 known genes and expressed sequence tag (EST) clusters for comprehensive coverage of the rat genome. The 2 brain regions were evaluated separately. Three-way analysis of variance revealed 542 genes differentially expressed at P < .05 in the caudate putamen, the amygdala, or both brain regions of alcohol-preferring P, HAD, and AA rats after 1 year of ethanol consumption. A detailed description of the rodent candidate gene strategy for selection of markers for the follow-up study is given in the “Integration of Animal Data Into the Human Study” subsection of the “Methods” section.
The present GWAS and follow-up study of alcohol dependence detected and replicated evidence for association with 15 markers. Two of these markers, rs7590720 and rs1344694, remained significant after genome-wide correction for multiple testing in the combined sample of 1460 patients and 2332 controls.
These 2 markers are located approximately 5 kb apart in chromosomal region 2q35, which has been implicated in alcohol dependence in previous linkage studies. Linkage to this region was found in a genome-wide search in 2282 individuals from 262 families with a high density of alcohol dependence in the Collaborative Study on the Genetics of Alcoholism (COGA).52 Near microsatellite marker D2S1371, the highest logarithm of odds (LOD) score for the comorbid alcoholism and depression phenotype was 4.12 (this LOD score, however, was obtained in only one data set, ie, in the replication data set, whereas in the initial and combined data sets, the LOD scores were 0.00 and 2.16, respectively).52 Marker D2S1371 is located approximately 1.4 Mb from rs7590720 and rs1344694. Another COGA project53 found linkage with a maximum LOD score of 2.4 between a low level of response to alcohol and the microsatellite marker D2S434, which is only approximately 1.7 Mb from markers rs7590720 and rs1344694. The low level of response to alcohol is an endophenotype related to heavy drinking and alcohol problems.54 It is genetically affected in animals55- 57 and humans,58,59 with an estimated heritability in humans of 40% to 50%.60 Two further linkage studies of the COGA project61,62 reported linkage with LOD scores of 3.28 and 3.0, respectively, between the same marker D2S434 and the P3(00) component of the event-related potential. Heritability of the P3 amplitude ranges from 50% to 80% (for a review, see the study by Porjesz et al62), and P3 deficits have been repeatedly reported to be a strong vulnerability factor for alcohol dependence, especially in males with high familial loading of alcohol dependence.63,64 In a linkage study conducted by Hill et al65 in 330 members of 65 families with a high density of alcohol dependence, linkage to 2q35 with marker D2S2382 was identified. This marker is located only approximately 0.2 Mb from the 2 main SNPs. 
A LOD score of 3.68 was observed in a 4-parameter model that included measurement of age, gender, and alcohol dependence along with measurement of constraint, a personality construct66 that seems to overlap with risk-taking behavior.
These linkage findings, which all satisfied the Lander and Kruglyak67 criteria for suggestive linkage, underline the potential importance of this chromosomal region. However, because no meta-analysis of linkage studies in alcohol addiction exists, the possibility that these findings are merely due to random noise cannot be ruled out.
The gene that is closest to the peak association signal and that also harbors another associated SNP, rs705648 (P = 1.78 × 10−6), is PECR, which seems to be a key enzyme of fatty acid metabolism as it catalyzes the reduction of medium-chain (unsaturated) enoyl-CoAs to saturated acyl-CoAs, which may then undergo oxidation in mitochondria for energy production. This pathway, originating from triglycerides, is particularly important in conditions of starvation, when the energy supply is switched from the use of glucose to fatty acids. In this condition, PECR needs to be upregulated. Triglyceride levels are typically increased in the blood of alcohol-dependent individuals.68,69 PECR reduces C-C double bonds not only in even-chain but also in odd-chain fatty acids. On degradation to acetyl-CoA, odd-chain fatty acids end up with propionyl-CoA, which requires carboxylation to an even-chain fatty acid. This step is mediated through methyl-malonyl-CoA, an intermediate formed by the essential cofactors methyl-tetrahydrofolate as the methyl donor and vitamin B12. Folate deficiency, often observed in alcohol-dependent patients,70 may, therefore, interfere with this pathway and prevent the availability of a sufficient supply of acetyl-CoA. Expression of PECR is highest in the liver, followed by the kidney, muscle tissue, the lungs, and the heart but barely, if at all, in the brain.71 If this gene is involved in alcohol dependence, it must act in the periphery rather than in the central nervous system. It could, therefore, be expected that we detected PECR only in the GWAS, which identifies susceptibility genes independent of organ-specific expression, but not in the animal approach, which is brain specific.
A further 4 SNPs are intergenic, and another 8 are located in genes. No ontologic relatedness was detected between these genes. That is not surprising given the assumed genetic heterogeneity of alcohol addiction. In the eText, we discuss data from linkage findings, biochemistry, and animal studies that have provided evidence that these loci may be involved in alcohol dependence. However, none of these markers achieved genome-wide significance in this study. This does not necessarily disqualify them. Alcohol dependence is a complex and heterogeneous disorder that is affected by multiple genetic, biological, psychological, and sociocultural factors. Formal genetic and linkage studies do not indicate the existence of genes with major effects in the European population. Thus, genome-wide significant findings in moderately sized samples will be rare, but the number of replicable associations may actually be much larger. One replicated association that did not withstand correction for multiple testing was obtained with the SNP rs1614972 in the ADH1C gene. The ADH gene cluster is located on chromosomal region 4q, which is among the most consistently replicated loci contributing to alcohol phenotypes.72 The ADH1C gene belongs to the biologically relevant alcohol-metabolizing alcohol dehydrogenase (ADH) genes. This SNP ranked 282nd in the top-down ranking of the GWAS, with P = 2.84 × 10−4. It would not have been carried forward into the follow-up study if it had not been supported by evidence from the animal study. To achieve a power of 80% to detect significant association withstanding correction for multiple testing, we would have needed to include 5573 patients and 8900 controls (given the observed odds ratio and minor allele frequency of this SNP). The same is true for the replicated association with marker rs11640875, which was ranked 510th in the GWAS findings.
It is located in the CDH13 gene, a gene that has been described in previous studies as a susceptibility locus for alcohol dependence25 and methamphetamine dependence.73 Increasing sample sizes is often not feasible and bears the risk of increasing clinical and genetic heterogeneity. Experience from GWASs of diabetes, for example, has shown that differences in assessment can jeopardize even the most robust findings.74 Obtaining very large sample sizes may necessitate the inclusion of samples from different cultural and ethnic backgrounds, which can also increase heterogeneity.
The ADH1C gene belongs to the ADH class I genes (ADH1A, ADH1B, and ADH1C), which have been extensively investigated for the risk of alcohol dependence, initially in Asian75-77 populations and subsequently in patients with African ancestry78,79 and in ethnic European80-84 populations. A meta-analysis85 reported a significantly higher risk of alcohol dependence in carriers of the ADH1C*2 allele in East Asian populations (odds ratio, 1.91), but its role in the European population is less clear. The associated marker rs1614972 is in complete linkage disequilibrium (D′ = 1.0, r² = 0.311, International HapMap Project) with 1 of the 2 polymorphisms of the ADH1C*2 protein isoform that has recently been investigated84 in 575 patients and 530 controls from the Irish Affected Sib Pair Study of Alcohol Dependence sample. The 2-marker haplotype comprising markers rs1614972 and rs1693482 yielded evidence for association with alcohol dependence. Single-marker analysis with rs1614972, however, produced no evidence of association. Although the allele frequencies in ethnic Irish and ethnic German patients are the same, they differ between the Irish and German control samples; the allele frequencies in the German GWAS and follow-up samples are the same. This finding underlines the importance of sample homogeneity.
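A D′ of 1.0 together with a much lower r² arises when 2 markers have different allele frequencies: the rarer allele occurs only on haplotypes carrying one particular allele of the other marker, but not vice versa. The following minimal sketch computes both measures from haplotype and allele frequencies using the standard definitions. The function name and the frequencies in the example are hypothetical and are not the HapMap frequencies of rs1614972 or rs1693482.

```python
def ld_statistics(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and allele B (locus 2)
    p_a, p_b: frequencies of allele A and allele B
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b
    # D' rescales D by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r_squared


if __name__ == "__main__":
    # Hypothetical case: allele A (frequency 0.2) is found only on haplotypes
    # that carry allele B (frequency 0.5), so only 3 of the 4 haplotypes exist.
    d, d_prime, r_squared = ld_statistics(p_ab=0.2, p_a=0.2, p_b=0.5)
    print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r_squared:.3f}")
```

In this hypothetical example D′ equals 1 while r² is only about 0.25, which shows why a marker in complete linkage disequilibrium with a functional variant can still be a weak single-marker proxy for it.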
To increase the genetic homogeneity of the sample, and thus increase the power of the study, we attempted to select patients using as rigorous a set of criteria as possible. Patients of German descent were recruited in a joint research project and were stratified for male sex and an early age at onset, a combination of traits associated with increased heritability.32,33 Previous studies39,86,87 have shown that population stratification in Germans is small. Thus, at the time when SNPs for the follow-up study were chosen, we had not applied a formal test to control for population stratification. A post hoc principal components analysis88,89 (EigenSoft v1.01; http://genepath.med.harvard.edu/~reich/Software.htm) and genomic control analysis41 (showing a λ of 1.099) had no major impact on the significance level of the 15 SNPs (eTable 4), thus confirming these previous observations.
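As a rough illustration of the genomic control step, the inflation factor λ is conventionally estimated as the median of the observed 1-df test statistics divided by the median of the χ² distribution with 1 df (about 0.455), and each statistic is then divided by λ. The sketch below follows that convention; it is not the software used in this study, the function names are hypothetical, and the test statistics are synthetic values constructed to yield λ = 1.099.

```python
import math
from statistics import NormalDist, median

# Median of the chi-square distribution with 1 df, obtained as the square of
# the 75th percentile of the standard normal distribution (about 0.455).
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def chi2_1df_pvalue(statistic):
    """Upper-tail P value of a chi-square statistic with 1 df."""
    return math.erfc(math.sqrt(statistic / 2.0))


def genomic_control(chi2_stats):
    """Return (lambda, adjusted statistics) for a list of 1-df test statistics."""
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    # By convention, statistics are not inflated when lambda is below 1.
    scale = max(lam, 1.0)
    adjusted = [s / scale for s in chi2_stats]
    return lam, adjusted


if __name__ == "__main__":
    # Synthetic statistics, inflated by a factor of 1.099 for illustration.
    base = [0.5 * CHI2_1DF_MEDIAN, CHI2_1DF_MEDIAN, 2.0 * CHI2_1DF_MEDIAN, 12.0, 25.0]
    inflated = [1.099 * s for s in base]
    lam, adjusted = genomic_control(inflated)
    print(f"lambda = {lam:.3f}")
    for raw, adj in zip(inflated, adjusted):
        print(f"P before = {chi2_1df_pvalue(raw):.2e}, P after = {chi2_1df_pvalue(adj):.2e}")
```

With a λ close to 1, the adjustment changes the P values of strongly associated markers only modestly, which is in line with the observation that the correction had no major impact on the 15 SNPs.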
Alcohol dependence is a complex disorder, and association studies with more biologically refined phenotypes will help to disentangle this heterogeneity. The present approach of selecting more genetically homogeneous groups by stratifying for male sex and an earlier age at onset is a first step in this direction. We must point out, however, that to achieve an overall sample of more than 1500 patients, it was necessary to relax the early-age-at-onset criteria. Whereas in GWAS patients the upper inclusion limit was 28 years (median, 20 years), the upper inclusion limit in the replication sample had to be extended to 45 years (median, 30 years). Although this cannot be considered a specifically early age at onset, the sample as a whole, and especially the GWAS sample, had clearly been stratified for early age at onset.
In conclusion, we identified genome-wide significant association with 2 SNPs located in a chromosomal region for which linkage and animal studies provide compelling evidence for the presence of susceptibility variants predisposing an individual to alcohol addiction. Further independent studies are required to confirm these findings. The PECR gene, which is located close to these SNPs, is a plausible candidate and merits further investigation. We showed that the GWAS is a powerful tool for producing valuable findings in even a moderately sized sample when it has been carefully selected for genetically homogeneous cases. In addition, this study underlines the value of a convergent approach between animal studies and systematic genetic screens in humans to deal with the multiple testing problem in GWASs.
Correspondence: Marcella Rietschel, MD, Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health Mannheim, University of Heidelberg, J5, 68159 Mannheim, Germany (email@example.com).
Submitted for Publication: August 29, 2008; final revision received December 9, 2008; accepted January 21, 2009.
Financial Disclosure: None reported.
Funding/Support: This work was supported by grants from the German Federal Ministry of Education and Research: NGFN2 and NGFN-Plus FKZ 01GS0117 and 01GS08152 (Drs Mann, Nöthen, Rietschel, Schreiber, Spanagel, and Wichmann) and grants FKZ EB01011300 and 01EB0410 (Drs Mann and Spanagel), and by a grant from the Sixth Framework Program of the European Commission: Integrated Project IMAGEN IP-I3250 (Drs Mann, Rietschel, and Spanagel). Drs Cichon and Nöthen received support from the Alfried Krupp von Bohlen und Halbach-Stiftung. The Heinz Nixdorf RECALL Study was supported by a grant from the Heinz Nixdorf Foundation.
Previous Presentations: This study was presented at Deutscher Suchtkongress; June 14, 2008; Mannheim, Germany; at Drei-Länder-Symposium für Biologische Psychiatrie; October 9, 2008; Göttingen, Germany; at the XVI World Congress on Psychiatric Genetics; October 14, 2008; Osaka, Japan; and at the 1st Annual Meeting of NGFN-Plus and NGFN-Transfer in the Program of Medical Genome Research, Helmholtz Zentrum München/Neuherberg; December 12, 2008; Munich-Neuherberg, Germany.
Author Contributions: Drs Treutlein, Cichon, Ridinger, Spanagel, Mann, and Rietschel are co-primary authors of this article. They are responsible for distinct, but integral, parts of the study: the clinical study with homogeneous assessment of the patients, the animal study, and the genetic study. They had full access to all of the data in the study and take joint responsibility for the integrity of the data and the accuracy of the data analysis.
Additional Information: The eText, eFigures, and eTables are available at http://www.zi-mannheim.de/pub_gwas.html.
Additional Contributions: Marina Füg, Christine Hohmeyer, and Rosa Ferrando provided expert technical assistance; Thomas G. Schulze, MD, Central Institute of Mental Health/National Institute of Mental Health Mood and Anxiety Disorders Program, provided helpful discussions; and Christine Schmael, MD, Central Institute of Mental Health, critically read the manuscript. The rats were provided by TK Li, PhD (Department of Psychiatry, Institute of Psychiatric Research, Indiana University School of Medicine) and D Sinclair, PhD (Department of Mental Health and Alcohol Research, National Public Health Institute).