Flow diagram of the study. EPQR-N indicates Eysenck Personality Questionnaire Neuroticism Scale; GWAS, genomewide association study; NEO PI-R, Neuroticism, Extraversion, Openness Personality Inventory–Revised; NIMH, National Institute of Mental Health; QC, quality control; SNP, single-nucleotide polymorphism.
van den Oord EJCG, Kuo P, Hartmann AM, Webb BT, Möller H, Hettema JM, Giegling I, Bukszár J, Rujescu D. Genomewide Association Analysis Followed by a Replication Study Implicates a Novel Candidate Gene for Neuroticism. Arch Gen Psychiatry. 2008;65(9):1062–1071. doi:10.1001/archpsyc.65.9.1062
Context
Neuroticism is a trait that reflects a tendency toward negative mood states. It has long been linked to internalizing psychiatric conditions, such as anxiety and depression, and it accounts for much of the substantial comorbidity seen between these disorders.
Objective
To identify common genetic variants that affect neuroticism to better understand (the comorbidity between) a broad range of psychiatric disorders and to develop effective treatments.
Design, Setting, and Participants
More than 420 000 genetic markers were tested for their association with neuroticism in a genomewide association study (GWAS). The GWAS sample consisted of 1227 healthy individuals ascertained from a US national sampling frame and available from the National Institute of Mental Health genetics repository. The most promising markers were subsequently tested in a German replication sample comprising 1880 healthy individuals.
Main Outcome Measures
A strict definition of replication (same marker, same direction of effects, and same measure) combined with a threshold we proposed previously for declaring significance in genetic studies that ensures a mean probability of producing false-positive findings of less than 10%.
Results
The most promising results in the GWAS and replication samples were single-nucleotide polymorphisms (SNPs) in the gene MAMDC1. These SNPs all tagged the same 2 haplotypes and had P values of 10^−5 to 10^−6 in the GWAS sample and of .006 to .02 in the replication sample. Furthermore, the replication involved the same SNPs and the same direction of effects. In a combined analysis of all data, several SNPs were significant according to the threshold that allows for 10% false-positive findings.
Conclusions
The small effect sizes may limit the prognostic, diagnostic, and therapeutic use of SNP markers such as those in MAMDC1. However, the present study demonstrates the potential of a GWAS to discover potentially important pathogenic pathways for which clinically more powerful (bio)markers may eventually be developed.
Major depression and the anxiety disorders, such as generalized anxiety disorder, panic disorder, and phobias, are highly prevalent,1 imposing substantial burdens of distress and impairment.2 Symptoms of depression and anxiety commonly co-occur.3 Genetic and nongenetic factors have been implicated in their etiology, but much of the pathway from risk to illness remains poorly understood.
There have been suggestions that clinical phenotypes may not reflect the underlying pathophysiologic processes involved in the etiology of psychiatric disorders4 and, therefore, may not provide optimum targets for gene-finding strategies. One approach to overcome this limitation is to investigate potential intermediate measures, or endophenotypes, that may more robustly reflect processes proximal to gene expression. Consequently, they may provide a simpler substrate for genetic analysis than the disease syndrome itself. In a 2003 review of the subject, Gottesman and Gould5 suggested a set of criteria that a putative endophenotypic measure must satisfy, including association with illness, heritability, and cosegregation (ie, shared genetic factors) with the illness in families.
One possible endophenotype is neuroticism.6 Neuroticism, a personality trait that reflects a tendency toward negative mood states,7 is 1 of the 3 key dimensions of personality (together with extraversion and psychoticism) according to Eysenck and Eysenck8 and has been included in most theories of personality since its introduction. Studies have consistently demonstrated associations between an individual's level of neuroticism and the likelihood of having symptoms or syndromes of depression, anxiety, or both (see Widiger and Trull9 and Brandes and Bienvenu10 for reviews). Data from a variety of studies, including 2 large, population-based, longitudinal samples,11,12 indicate that neuroticism acts as a premorbid vulnerability trait for major depression. In addition, available evidence11,13-15 suggests that neuroticism scores increase during a depressive episode (“state effect”), with more limited and conflicting support for postepisode neuroticism scores exceeding premorbid levels (“scar effect”). Furthermore, several studies16-18 support the hypothesis that neuroticism mediates some of the elevated comorbidity between depressive and anxiety disorders. Twin studies have estimated the heritability of neuroticism (the proportion of individual differences in neuroticism due to genetic factors) to be 0.3 to 0.6,19-21 similar to that of depressive and anxiety disorders.22,23 Accumulating evidence24-28 suggests that some of the genetic factors that underlie individual differences in neuroticism are the same as those that increase susceptibility to these debilitating conditions.
Although considerable effort has been devoted to identifying genomic regions that affect neuroticism and confer risk to depressive and anxiety disorders, we know little about the specific genetic variants that affect neuroticism. For example, although a variety of linkage scans for neuroticism have implicated broad chromosomal regions that may overlap those from scans involving major depression or panic disorder, there is no consensus about exactly which locations or specific genes to focus on.6 Candidate gene association studies29,30 have had mixed success, with only a handful of potential genes tested and little solid evidence of replication. However, for the most widely studied locus, the 5-HTTLPR polymorphism occurring in the promoter region of the serotonin transporter gene, meta-analyses31,32 of available data suggest that there are small but statistically significant effects on neuroticism.
In the past 2 years, genomewide association studies (GWAS) have become technically and economically feasible. These studies entail screening hundreds of thousands to a million single-nucleotide polymorphisms (SNPs) across the genome for their association with the outcome of interest. It is now clear that GWAS can be a successful strategy because there have been multiple successes with the identification of highly compelling candidate genes for age-related macular degeneration,33 body mass index,34 inflammatory bowel disease,35 and type 2 diabetes mellitus.36-38 One GWAS for neuroticism has been reported on DNA pools from 2000 individuals selected on extremes of neuroticism scores from a cohort of 88 142 people from southwest England.39 Although the pooling of DNA considerably reduces genotyping costs, the genotype quality remains hard to assess and could potentially interfere with the ability to identify genetic variants, particularly if effect sizes are small.
In this study, we first screen more than 420 000 genetic markers for their association with neuroticism. The sample consisted of 1227 healthy individuals ascertained from a US national sampling frame and available from the National Institute of Mental Health (NIMH) genetics repository. False-positive results can occur from genotyping errors, population stratification, and (probably most important) chance.40 To reduce the likelihood of false-positive findings, we performed a replication study41 by genotyping the most promising SNPs in a large independent sample of approximately 1800 healthy individuals.
The Figure displays a flow diagram of the study. In the following subsections, we outline each of the steps in detail.
Participants came from the “control” sample in the NIMH genetics repository and were originally part of a large schizophrenia study (Molecular Genetics of Schizophrenia; principal investigator, Pablo V. Gejman, MD). Field work was performed by Knowledge Networks, a survey and market research company whose panel contains approximately 60 000 households (>120 000 unrelated adults). Households are selected via random-digit dialing. Only 1 member per household is selected for the NIMH sample used in the present study so that participants are unrelated. The panel is representative of the US population (except for a slight bias toward higher income and educational level). A detailed sample description can be found at http://www.nimhgenetics.org (see controls under available data tab).
All the participants completed an online self-report screening interview after giving informed consent and before venipuncture was arranged. The consent included an authorization for qualified scientists, with approval from the NIMH, to study any condition or trait. All the biological samples, phenotypes, and genotypes were deidentified before deposit into the NIMH repositories.
The self-report screening procedure could be completed in approximately 20 minutes and included the Composite International Diagnostic Interview Short Form (CIDI-SF) (lifetime version, modified for the DSM-IV). Diagnostic classifications made using the full CIDI, known to give very good test-retest and interrater reliability,42,43 can be reproduced with good accuracy using the CIDI-SF scales.44 A further discussion of the CIDI-SF in the NIMH sample can be found elsewhere.45
The 12-item short version of the Eysenck Personality Questionnaire Neuroticism Scale46 (EPQR-N) was used to assess neuroticism. An estimated 89% to 95% of the information present in the 23-item scale is reflected in the 12-item measure.47 The Cronbach α was .87 in the entire NIMH sample. Generally speaking, this means that an estimated 87% of the total variation in EPQR-N scores is explained by variation in neuroticism, with the remainder being measurement error and unrelated variance. Thus, the EPQR-N has good internal consistency reliability.
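The internal consistency coefficient reported above can be illustrated with a minimal sketch of the standard Cronbach α formula, α = k/(k − 1) × (1 − Σ item variances / variance of the total score). The function name `cronbach_alpha` and the toy responses are assumptions made for illustration; they are not the study data or the authors' code.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    # Variance of each item across respondents.
    item_vars = [pvariance([row[j] for row in item_scores]) for j in range(k)]
    # Variance of the summed scale score.
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up responses of 5 people to 4 binary (0/1) items, for illustration only.
toy_responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(toy_responses)
print(f"alpha = {alpha:.2f}")
```

The same variance definition (population or sample) must be used for the items and for the total score; the ratio is then unaffected by the choice.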
Peripheral venous blood samples were sent to the Rutgers University Cell and DNA Repository (Piscataway, New Jersey), where cell lines were established via Epstein-Barr virus transformation. Numerous quality control procedures are routine, and the success rate for immortalization exceeds 99% (http://www.rucdr.org/quality.htm). Sample DNA concentrations were quantified and normalized using double-stranded DNA quantitation kits (PicoGreen; Molecular Probes, Eugene, Oregon).
Genotype data were generated at the Center for Genotyping and Analysis at the Broad Institute of Harvard and MIT as part of a multi-institutional collaborative research study. The Affymetrix GeneChip Human Mapping 500K “A” array set was used, which is composed of 2 arrays that enable genotyping more than 500 000 SNPs with a single primer (http://www.affymetrix.com/support/technical/datasheets/500k_datasheet.pdf). Genotypes were called using the BRLMM algorithm (Affymetrix, Santa Clara, California). DNA samples that achieved a BRLMM call rate of less than 95% and individuals with unusual degrees of relatedness or heterozygosity were excluded.
As part of the self-report screening procedure, participants were asked whether they had ever been diagnosed as having, or received treatment for, schizophrenia, schizoaffective disorder, or bipolar disorder. The exclusion criterion for whole genome genotyping was a positive response to any of these 3 questions. To study the impact of this exclusion criterion, each of the 12 EPQR-N items was scored 0 (does not describe) or 1 (does describe). Using all individuals currently in the NIMH control sample, the mean (SD) EPQR-N score of the 4016 participants who did not satisfy the exclusion criterion was 3.50 (3.35), and the mean (SD) score of the 201 who satisfied the exclusion criterion was 7.86 (3.33). This large difference of more than 1 SD was significant (P = 2.2 × 10^−16, r^2 = 0.071). One implication is a restriction of the range of neuroticism scores in the GWAS sample that could reduce the power to detect SNPs. On the other hand, some of this loss in power may be offset by less contamination of neuroticism scores in the high range by potential confounders, such as psychiatric illness (ie, increased “error” variance). To get an impression of the overall effect of these 2 phenomena, we recalculated the Cronbach α after applying the exclusion criterion; the α was now .86 vs .87 in the whole sample. This suggests only a small overall reduction in the relative proportion of variance attributable to neuroticism in the GWAS sample, making it reasonable to assume that much of the power was maintained.
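As a rough check on the reported group difference, the standardized difference and the proportion of variance explained can be recomputed from the summary statistics given above. This is a sketch under the assumption that r^2 is the between-group share of total variance in a 2-group comparison; the variable names are illustrative and the authors' exact computation is not stated.

```python
# Summary statistics quoted in the text.
n_kept, mean_kept, sd_kept = 4016, 3.50, 3.35   # did not meet the exclusion criterion
n_excl, mean_excl, sd_excl = 201, 7.86, 3.33    # met the exclusion criterion

n_total = n_kept + n_excl

# Pooled within-group variance.
within_var = ((n_kept - 1) * sd_kept**2 + (n_excl - 1) * sd_excl**2) / (n_total - 2)

# Between-group variance for 2 groups: p * (1 - p) * (mean difference)^2.
p_excl = n_excl / n_total
between_var = p_excl * (1 - p_excl) * (mean_excl - mean_kept) ** 2

# Difference in pooled SD units, and share of variance explained by group.
std_diff = (mean_excl - mean_kept) / within_var**0.5
r_squared = between_var / (between_var + within_var)

print(f"difference = {std_diff:.2f} SD, r^2 = {r_squared:.3f}")
```

Under these assumptions the difference is about 1.3 SD and r^2 rounds to 0.071, in line with the values reported.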
Unrelated volunteers with 2 German parents were randomly selected from the general population of Munich and were contacted by mail. Individuals who responded were initially screened by telephone for the absence of neuropsychiatric disorders. Next, detailed medical and psychiatric histories were assessed for the individuals themselves and their first-degree relatives by using a semistructured interview. If no exclusion criteria were fulfilled, individuals were invited for a comprehensive interview, including the Structured Clinical Interview for DSM-IV48,49 to validate the absence of a lifetime psychiatric disorder. In addition, the Family History Assessment Module50 was conducted to exclude psychiatric disorders in their first-degree relatives. Finally, a neurologic examination was conducted to exclude individuals with current central nervous system impairment, and the Mini-Mental-Status-Test51 was performed in individuals older than 60 years to exclude those with possible cognitive impairment.
The Neuroticism, Extraversion, Openness Personality Inventory–Revised (NEO PI-R)52 was administered in the replication sample. For replication, it is important that similar measures are used as much as possible in the 2 samples. However, the NEO PI-R neuroticism scale, consisting of 48 items, measures a much broader variety of behaviors, thoughts, and feelings compared with the 12-item version of the EPQR-N used in the GWAS sample. For example, whereas the short version of the EPQR-N mainly focuses on anxiety and depression, the NEO PI-R neuroticism scale also includes impulsiveness and hostility subscales. Indeed, when we factor analyzed the 6 NEO PI-R neuroticism subscales, the anxiety (0.81), depression (0.83), and vulnerability to stress (0.81) subscales all had much higher loadings than the impulsiveness (0.35), hostility (0.64), and self-consciousness (0.59) subscales. This finding also seems to be consistent with observations that impulsive and hostile behaviors may be associated with other personality dimensions.53 To maximize agreement between the neuroticism measures in the 2 samples while focusing on the core features, we constructed a more homogeneous NEO PI-R neuroticism scale comprising only the subscales with factor loadings greater than 0.8 (anxiety, depression, and vulnerability to stress). The Cronbach α for this scale was .92, indicating high internal consistency reliability. The selection of participants in the German sample was considerably more stringent than that in the GWAS sample. This restricted range could reduce power compared with the GWAS sample. On the other hand, the higher internal consistency suggests that this restricted range may be compensated for by a smaller error component in the outcome measure.
Peripheral venous blood samples were obtained, and DNA was extracted using a commercial kit (QIAamp DNA Blood Maxi Kit; Qiagen, Hilden, Germany) and standard procedures. DNA concentration was adjusted using a quantitation reagent (PicoGreen; Invitrogen, Karlsruhe, Germany), and 1 ng was genotyped using an assay (iPLEX) on a mass spectrometer (MassARRAY MALDI-TOF; Sequenom, Hamburg, Germany) at the Genetics Research Centre GmbH in Munich. Genotyping call rates were all greater than 91% (mean, 97%).
To control the risk of false discoveries, we calculated for each P value a so-called Q value (see the supplemental materials at http://www.vipbg.vcu.edu/~edwin/).54,55 A Q value is an estimate of the proportion of false discoveries among all significant markers (ie, Q values are false discovery rates) when the corresponding P value is used as the threshold for declaring significance. As argued previously,56 we preferred this false discovery rate–based approach because it (1) provides a good balance between the competing goals of finding true effects vs controlling false discoveries; (2) allows the use of more similar standards in the proportion of false discoveries produced across studies because it is much less affected by number of (sets of) tests, which is an arbitrary factor; (3) is relatively robust against having correlated tests55,57-64; and (4) gives a more subtle picture of the possible role of the tested markers rather than an all-or-nothing conclusion about whether a study produces significant results.
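To make the Q value idea concrete, the following minimal sketch computes step-up false discovery rate estimates of the Benjamini-Hochberg type, with an optional multiplier `p0` for the estimated proportion of markers without effect. This is a generic textbook estimator chosen for illustration; the estimator actually used in the study is described in the authors' supplemental material and may differ. The function name and the toy P values are assumptions.

```python
def q_values(p_values, p0=1.0):
    """Estimate a Q value for each P value.

    q_i is the smallest estimated false discovery rate at which test i
    would be declared significant: min over thresholds t >= p_i of
    p0 * m * t / (number of P values <= t).
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so each Q value is monotone in P.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p0 * m * p_values[i] / rank)
        q[i] = running_min
    return q


# Made-up P values for illustration only.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.60]
toy_q = q_values(toy_p)
significant = [i for i, qv in enumerate(toy_q) if qv < 0.1]
print(toy_q, significant)
```

Declaring every test with a Q value below .1 significant means that, on average, about 10% of those declared findings are expected to be false discoveries.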
Although all the participants were white, there could be genetic subgroups in the US sample (eg, immigrated from different parts of Europe). Sullivan et al65 performed an extensive evaluation of multiple statistical methods to avoid false-positive findings in GWAS due to such genetic subgroups. They concluded that the principal components and multidimensional scaling (MDS) approaches were similar and superior to the other approaches. In this study, we chose MDS because it is implemented in PLINK,66 which had already been used for other analyses. Input data for the MDS approach were the genomewide average proportion of alleles shared identical by state between any 2 individuals. This genetic similarity matrix is analyzed where the first MDS dimension captures the maximal variance in the genetic similarity, the second dimension must be orthogonal to the first and captures the maximum amount of residual genetic similarity, and so on. To determine the number of required dimensions, we first calculated for each participant a quantitative score on each MDS dimension. This score provides a quantitative indication of the importance of a particular dimension for an individual's genetic profile (eg, a high score on a dimension representing northwest European ancestry means that a participant is likely to originate from that region). Next, we determined the number of SNPs that were significantly associated with each MDS dimension. When controlling the false-discovery rate at the 0.1 level, the number of significant SNPs was 102 083, 45 158, and 12 428 for dimensions 1, 2, and 3, respectively. In contrast, relatively few (approximately 800) SNPs were significantly associated with dimensions 4, 5, etc. This suggested that the first 3 dimensions captured most of the genetic substructure in the sample. 
To control for possible effects of genetic substructure, these 3 quantitative MDS dimensions were included as covariates in the multiple regression analyses testing whether individual SNPs were associated with neuroticism.
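The MDS step described above can be sketched as classical (metric) MDS on a genetic similarity matrix. This is an illustration only: it assumes the distance between 2 individuals is 1 minus their identity-by-state similarity, it extracts eigenvectors with simple power iteration rather than a numerical library, and the 4-person similarity matrix is made up. The study itself used the PLINK implementation.

```python
import math


def classical_mds(similarity, n_dims=2, n_iter=500):
    """Return n_dims MDS coordinates per individual from a similarity matrix."""
    n = len(similarity)
    # Squared distances, assuming distance = 1 - similarity.
    d2 = [[(1.0 - similarity[i][j]) ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in d2]
    grand_mean = sum(row_mean) / n
    # Double centering: B = -1/2 * J * D^2 * J.
    b = [[-0.5 * (d2[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * n_dims for _ in range(n)]
    for k in range(n_dims):
        v = [math.sin(i + 1.0 + k) for i in range(n)]  # deterministic start vector
        for _ in range(n_iter):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the converged vector.
        eigval = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
        if eigval <= 0.0:
            break  # no further dimension with positive variance
        for i in range(n):
            coords[i][k] = v[i] * math.sqrt(eigval)
        # Deflate so the next pass finds the next orthogonal dimension.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords


# Made-up similarities: individuals 0 and 1 resemble each other, as do 2 and 3.
toy_similarity = [
    [1.00, 0.95, 0.70, 0.70],
    [0.95, 1.00, 0.70, 0.70],
    [0.70, 0.70, 1.00, 0.95],
    [0.70, 0.70, 0.95, 1.00],
]
toy_coords = classical_mds(toy_similarity)
first_dim = [row[0] for row in toy_coords]
print(first_dim)
```

In this toy case the first dimension separates the 2 subgroups, and each individual's score on it is the kind of quantitative covariate that was entered into the regression analyses.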
Sex and age are 2 key covariates of neuroticism. A “brute force” approach to handling these covariates would be to include them as covariates in the regression analyses testing the association between SNPs and neuroticism. However, the effect of an autosomal SNP may be different across sex and age groups. In this situation, regressing out sex and age will reduce the power to detect the SNP effects.
An alternative approach is based on the observation that individuals from different sex and age groups may respond differently to individual items for reasons that are unrelated to their level of neuroticism. For example, the item “Do you often feel lonely?” could be more frequently endorsed by old people as a result of fewer social interactions, and the item “Are your feelings easily hurt?” may be more frequently endorsed by women, who may express emotions more easily. Following Neale et al,67 we corrected the item responses when significant “item bias” effects were found (see section 2 of the supplemental material at http://www.vipbg.vcu.edu/~edwin/). In addition, to further improve power, we estimated factor scores where items that are better indicators of neuroticism are given more “weight” in the calculation of the neuroticism scores.
Following Sullivan,68 we used a strict definition of replication, meaning that we required an association in the replication study with (1) the same phenotype (eg, excluding other NEO PI-R scales), (2) the same SNP (eg, excluding other SNPs in the same gene), and (3) the same direction of effects (excluding SNPs that have significant effects in the opposite direction). For such markers, we could then use a standard threshold, such as P < .05, for declaring significance in the replication sample. However, this procedure for declaring significance is not optimal: ignoring the GWAS data results in a loss of power.7,60-62 Furthermore, although the number of markers tested in the replication study is arbitrary (ie, determined by available resources), it directly affects the number of “replications”69 if a fixed threshold of .05 is used. We, therefore, also generalized the previously discussed Q value to the situation in which data are combined across the 2 samples. This generalization involves several complexities (eg, for this purpose, we needed to derive the distribution of the Wald test statistic when data are combined across stages), and details are given in section 3 of the supplemental material (http://www.vipbg.vcu.edu/~edwin/). We previously proposed the Q value threshold of .1 for declaring significance.56 The same threshold can be applied here, which implies that, on average, we allow 10% of the SNPs that are declared significant to be false discoveries.
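The 3 conditions of the strict replication definition, together with the fixed P < .05 threshold, can be written as a simple check. This is a hedged sketch: the `Result` record, its field names, and the example values are hypothetical and do not come from the study tables.

```python
from dataclasses import dataclass


@dataclass
class Result:
    snp: str          # marker identifier
    phenotype: str    # outcome measure analyzed
    beta: float       # estimated effect of the tested allele
    p: float          # P value


def strictly_replicated(gwas, replication, alpha=0.05):
    """True only if SNP, phenotype, and effect direction all match and P < alpha."""
    same_snp = gwas.snp == replication.snp
    same_phenotype = gwas.phenotype == replication.phenotype
    same_direction = gwas.beta * replication.beta > 0
    return same_snp and same_phenotype and same_direction and replication.p < alpha


# Hypothetical results for illustration only.
gwas_hit = Result("rs_hypothetical_1", "neuroticism", -0.21, 2e-6)
rep_same = Result("rs_hypothetical_1", "neuroticism", -0.08, 0.01)
rep_flipped = Result("rs_hypothetical_1", "neuroticism", 0.08, 0.01)
rep_other_snp = Result("rs_hypothetical_2", "neuroticism", -0.08, 0.01)

print(strictly_replicated(gwas_hit, rep_same))       # same SNP, trait, direction
print(strictly_replicated(gwas_hit, rep_flipped))    # opposite direction
print(strictly_replicated(gwas_hit, rep_other_snp))  # different SNP in the same gene
```

The check assumes both effects are expressed for the same reference allele; in practice allele coding must be aligned between platforms before directions are compared.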
From the available genotypes, we deleted the 8444 SNPs that had more than 10% missing genotypes and the 57 862 SNPs with minor allele frequencies smaller than 0.005 in the neuroticism sample. The SNPs can be out of Hardy-Weinberg equilibrium (HWE) for many reasons other than genotyping errors (eg, due to assortative mating, natural selection, or genetic drift). To eliminate only those markers that were dramatically out of HWE and minimize false positives, the threshold for HWE quality control was set to P < 1 × 10^−5 (ie, the expected number of false positives is approximately 10^−5 × 420 000 ≈ 4 SNPs). This resulted in the exclusion of another 13 975 SNPs. Because this threshold for HWE testing is somewhat arbitrary, note that using either P < 1 × 10^−6 or P < 1 × 10^−4 instead changed nothing in terms of SNP selection for the follow-up study. These selections reduced the total number of SNPs from 500 568 to 420 287.
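The 3 marker-level filters can be sketched as follows. The thresholds are those stated above; the HWE test shown is the 1-df χ2 goodness-of-fit test, which is an assumption made for simplicity (the text does not say which HWE test was used, and exact tests are common in practice). Function names and genotype counts are illustrative.

```python
import math


def hwe_chi2_p(n_aa, n_ab, n_bb):
    """P value of the 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    observed = [n_aa, n_ab, n_bb]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of chi-square with 1 df.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(n_aa, n_ab, n_bb, n_missing,
              max_missing=0.10, min_maf=0.005, hwe_threshold=1e-5):
    """Apply the missingness, minor allele frequency, and HWE filters to one SNP."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    missing_rate = n_missing / (called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * called)
    maf = min(freq_a, 1.0 - freq_a)
    # The HWE test is only evaluated for SNPs that pass the first 2 filters.
    return (missing_rate <= max_missing
            and maf >= min_maf
            and hwe_chi2_p(n_aa, n_ab, n_bb) >= hwe_threshold)


# Expected number of SNPs failing HWE by chance alone at the chosen threshold.
expected_false_positives = 1e-5 * 420_000

print(passes_qc(250, 500, 250, n_missing=20))   # genotype counts in HWE
print(passes_qc(500, 0, 500, n_missing=20))     # no heterozygotes at all
print(expected_false_positives)
```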
Participants with greater than 10% missing SNPs were also deleted. Furthermore, we used PLINK to generate the identity-by-state matrix for all pairwise combinations of autosomal SNPs. One individual was deleted because his identity-by-state profile was very different (approximately 25-30 SDs) from any other profile in the sample. This left a sample of 1227 participants.
Table 1 provides a more detailed description of the 2 samples. Both samples included white persons only. For the sociodemographic variables (eg, marital status and educational level), there were statistically significant differences between the US and German samples.
The 420 287 autosomal SNPs passing quality control were tested for association using regression analyses assuming an additive effect of the SNP, where the first 3 MDS dimensions were included as covariates. A file with all the P values can be downloaded from one of us (E.J.C.G.v.d.O.) (http://www.vipbg.vcu.edu/~edwin). The minimum P values were 1.52 × 10^−6 and 1.18 × 10^−6 for the raw and MF scores, respectively. The better results for MF scores may be the result of improved power to detect genetic effects. The estimated proportion of markers without effect was p0 = 1.0000 for the raw scores and p0 = 0.9999824 for the MF scores. Because (1 − p0) times the number of markers estimates the number of effects, this suggested 0 to 7 SNPs with effects. The best Q values were .38 and .16 for the raw and MF scores, respectively. These values were fairly close to the threshold of .1 proposed previously56 (ie, a scenario where, on average, 10% of the significant findings are allowed to be false discoveries) for declaring significance in genetic studies.
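A single additive-model test of the kind described above can be sketched as an ordinary least-squares regression of the trait on allele dosage (0, 1, or 2 copies) plus covariates, with a Wald test on the dosage coefficient. This is an illustrative assumption about the mechanics, not the authors' software: the P value uses a large-sample normal approximation, only 1 covariate is shown instead of 3 MDS dimensions, and the data are made up. The last lines reproduce the (1 − p0) × number-of-markers arithmetic from the text.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def additive_snp_test(y, dosage, covariates):
    """Return (effect per allele copy, two-sided Wald P value) for one SNP."""
    n = len(y)
    x = [[1.0, float(g)] + list(c) for g, c in zip(dosage, covariates)]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    beta = solve(xtx, xty)
    residuals = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(x, y)]
    sigma2 = sum(e * e for e in residuals) / (n - k)
    # Variance of the dosage coefficient: sigma2 * [(X'X)^-1] at position (1, 1).
    unit = [1.0 if i == 1 else 0.0 for i in range(k)]
    var_beta = sigma2 * solve(xtx, unit)[1]
    z = beta[1] / math.sqrt(var_beta)
    return beta[1], math.erfc(abs(z) / math.sqrt(2.0))


# Made-up data: trait = 2 + 0.5 * dosage + covariate + small noise.
toy_dosage = [0, 1, 2, 0, 1, 2, 0, 1, 2, 1]
toy_cov = [0.1, -0.2, 0.3, 0.0, 0.2, -0.1, -0.3, 0.1, 0.0, -0.1]
toy_noise = [0.05, -0.05, 0.02, -0.02, 0.03, -0.03, 0.01, -0.01, 0.04, -0.04]
toy_y = [2 + 0.5 * g + c + e for g, c, e in zip(toy_dosage, toy_cov, toy_noise)]

effect, p_value = additive_snp_test(toy_y, toy_dosage, [(c,) for c in toy_cov])

# Estimated number of SNPs with an effect, from the reported p0 for the MF scores.
expected_effects = (1 - 0.9999824) * 420_287
print(effect, p_value, expected_effects)
```

With p0 = 0.9999824 the product is about 7.4, which corresponds to the upper end of the "0 to 7 SNPs" range quoted above.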
Table 2 provides the 25 SNPs ranked in ascending order on the basis of their P values in the replication study. Mainly dictated by practical considerations, we selected SNPs with Q values smaller than .6 for the replication study and then added further SNPs to make optimal use of the plate layout. The minor allele frequencies of the 63 selected SNPs were similar in the GWAS and the replication study, with a mean absolute difference of 0.01 (maximum absolute difference, 0.038). This suggested that the GWAS and replication samples were comparable in terms of allele frequencies and supported the GWAS genotyping accuracy because a different genotyping platform was used in the replication study.
Table 3 provides the results for the top 25 SNPs in the replication study ranked in ascending order of their P values. The smallest P values were for markers in MAMDC1. These SNPs also gave the lowest P values in the GWAS (Table 2). Consistent with a strict definition of replication, the replication involved the same phenotype, the same SNPs, and the same direction of effects. Furthermore, P values for all MAMDC1 SNPs were smaller than the commonly used threshold of .05. Exploratory analyses of all the NEO PI-R neuroticism subscales showed the smallest P values for the anxiety, depression, and vulnerability to stress subscales. This suggested that MAMDC1 was mainly associated with the core features of neuroticism.
All 4 SNPs in MAMDC1 were in very high linkage disequilibrium (see the supplemental material at http://www.vipbg.vcu.edu/~edwin), and in both samples, 2 haplotypes accounted for 94% of all the haplotypic variation. The frequencies of the 2 haplotypes were 38% and 56% in the GWAS sample vs 36% and 58% in the replication sample. Because each SNP “tags” these 2 haplotypes, these haplotype analyses merely duplicated the single SNP analyses. The SNP rs1959813 was slightly out of HWE. However, given that this SNP also tags the 2 haplotypes, this is probably due to sampling variation or a small systematic genotyping error.
Table 3 also shows that for 22 of the top 25 SNPs, the direction of effects was identical to that observed in the GWAS. This suggested the possibility of additional markers with (small) effects. For example, an SNP in NXPH1 was also significant at a P < .05 level and showed a similar direction of effects in the replication study.
Table 4 provides the results of the combined analyses. The lowest P values were now in the range of 2.0 × 10^−7 for the SNPs in MAMDC1. For most SNPs, P values improved (eg, AK127771), but this was not always the case (eg, TECTA). Q values for several SNPs in MAMDC1 were smaller than .1. This satisfies the threshold of .1 (ie, a scenario in which, on average, 10% of the significant findings are allowed to be false discoveries) we proposed previously for declaring significance in genetic studies.56
The most promising results in the GWAS and replication samples were 4 SNPs in MAMDC1. Consistent with a strict definition of replication, association involved the same phenotype, the same SNPs, and the same direction of effects. These SNPs all tagged the same 2 haplotypes and had P values of 10^−5 to 10^−6 in the GWAS sample and P values of .006 to .02 in the replication sample. In a combined analysis, the SNPs were significant according to the previously specified threshold, allowing for 10% false-positive findings. The replication study used a different recruitment strategy, a different genotyping technology, and a somewhat different measure of neuroticism and involved a sample from a different continent. This suggests that the finding may be robust.
Two linkage scans for bipolar disorder70,71 have previously implicated the region where MAMDC1 is located. Logarithm of odds scores were very modest in both linkage scans (~1.59), which is not inconsistent with the present findings because a common variant with such a small effect size is highly unlikely to be amenable to linkage analysis.
MAMDC1, also known as MDGA2, is a recently described gene that is expressed in a variety of human tissues, including the nervous system, and is proposed to be involved in regulating neuronal migration and axonal guidance.72 A closely related gene, MDGA1, has been studied in greater detail and is important in forebrain development, including being expressed in somatosensory areas.73-76 MDGA1 and MDGA2 are members of the immunoglobulin domain cell adhesion molecule subfamily, which includes neural cell adhesion molecules such as NCAM1 and L1CAM. Although the function of MDGA2 is not well understood, other neuronal cell adhesion molecules have been implicated in psychiatric disorders,77 neuronal stress response,78 and, specifically, depression.79 Associated SNPs in MDGA2 are located in a region of high linkage disequilibrium that extends 37 kb and includes the 10th exon of MDGA2. Further details of the linkage disequilibrium structure in the gene are contained in supplemental Figures 1 and 2 (http://www.vipbg.vcu.edu/~edwin). In addition to covering an exon, the associated region contains several regions that are conserved across 17 vertebrate species80 and at least 1 that is indistinguishable from the exon in terms of conservation. Therefore, several intervals in the associated region have the potential to harbor functional variants, but, owing to the high linkage disequilibrium, further refinement using the present sample will be difficult.
The effect sizes of the MAMDC1 SNPs were very small in the replication sample (eg, explaining 0.25%-0.40% of the variation in neuroticism). This can be expected: because of sampling error, the effect sizes in the initial study are often larger than in subsequent replications.81,82 However, the somewhat different measures in the 2 samples and possible differences in the effect of the SNPs in US vs German participants may also have played a role. Small effect sizes could also be inherent to the genetic architecture of neuroticism. Given that neuroticism is perceived to be an endophenotype6 for multiple psychiatric conditions that may provide a simpler substrate for genetic analysis than the disease syndrome itself, the latter explanation would imply that very large sample sizes will be needed to detect genetic effects on psychiatric conditions directly.
If very small effect sizes are the rule rather than the exception, it may be important to mention some of the other results. For example, an SNP in NXPH1 also showed fairly small P values in both samples, identical direction of effects, and Q values of approximately .22 to .30 in the combined analyses. Neurexophilin 1 (NXPH1) binds α-neurexins, which promote adhesion between dendrites and axons. Neurexins have been associated with nicotine and alcohol dependence.83,84 Neurexins also induce γ-aminobutyric acid postsynaptic differentiation,85 and the γ-aminobutyric acid system is a well-known target for studies of depression and anxiety. α-Neurexins are also required for postsynaptic N-methyl-D-aspartic acid receptor function.86 Although less studied than the γ-aminobutyric acid neurotransmitter system, the glutamate system is becoming the focus of more studies of anxiety-related traits. Furthermore, N-methyl-D-aspartic acid–type glutamate receptors are directly implicated in depression and anxiety. For example, mice with the NR2A N-methyl-D-aspartic acid receptor inactivated show reduced anxiety and depressivelike behaviors.87 Neurexins also bind many other molecules at the synapse, including synaptotagmins, which, when knocked out in mice, also produce reduced anxiety and depressionlike behavior.88
None of the top genes from the present study were among the best results in the previous pooled GWAS for neuroticism.39 However, because effect sizes are likely to be small and DNA pooling may further reduce power, this cannot be perceived as a nonreplication. The 5-HTTLPR variant in SLC6A4 has been linked to neuroticism.89 This variant was not genotyped because it is an insertion/deletion; among the 8 SNPs that were genotyped in this gene, the smallest P value was .05.
From a clinical perspective, small effect sizes limit the prognostic, diagnostic, and therapeutic use of SNP markers. On the other hand, the detected SNPs could point to novel pathogenic pathways for which clinically more powerful (bio)markers may eventually be found. In this respect, the present MAMDC1 findings illustrate the potential of GWAS to discover such novel pathways using SNPs that can be measured reliably in a cost-effective manner using biomaterial that is easy to collect.
Correspondence: Edwin J. C. G. van den Oord, PhD, Center for Biomarker Research and Personalized Medicine, Medical College of Virginia of Virginia Commonwealth University, PO Box 980533, Richmond, VA 23298-0533 (email@example.com).
Submitted for Publication: December 18, 2007; final revision received February 26, 2008; accepted April 14, 2008.
Author Contributions: Dr van den Oord had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Financial Disclosure: None reported.
Additional Contributions: Biomaterials and phenotypic data were obtained from the following projects that participated in the NIMH “control” samples: control subjects from the NIMH Schizophrenia Genetics Initiative and data and biomaterials collected by the Molecular Genetics of Schizophrenia II collaboration. The investigators and coinvestigators are as follows: Evanston Northwestern Healthcare/Northwestern University (grant MH059571), Pablo V. Gejman, MD (collaboration coordinator and principal investigator), Alan R. Sanders, MD; Emory University School of Medicine (grant MH59587), Farooq Amin, MD (principal investigator); Louisiana State University Health Sciences Center (grant MH067257), Nancy Buccola, APRN, BC, MSN (principal investigator); University of California–Irvine (grant MH60870), William Byerley, MD (principal investigator); Washington University (grant U01, MH060879), C. Robert Cloninger, MD (principal investigator); University of Iowa (grant MH59566), Raymond Crowe, MD (principal investigator), Donald Black, MD; University of Colorado (grant MH059565), Robert Freedman, MD (principal investigator); University of Pennsylvania (grant MH061675), Douglas Levinson, MD (principal investigator); University of Queensland (grant MH059588), Bryan Mowry, MD (principal investigator); and Mt Sinai School of Medicine (grant MH59586), Jeremy Silverman, PhD (principal investigator). The samples were collected by V. L. Nimgaonkar's group at the University of Pittsburgh as part of a multi-institutional collaborative research project with Jordan Smoller, MD, ScD, and Pamela Sklar, MD, PhD (Massachusetts General Hospital) (grant MH 63420). Genotype data were generated at the Center for Genotyping and Analysis at the Broad Institute of Harvard and MIT as part of a multi-institutional collaborative research study (principal investigators: Pamela Sklar, MD, PhD; Jordan Smoller, MD, ScD; Vishwajit Nimgaonkar, MD, PhD; and Edward Scolnick, MD).
We thank all the coworkers at the Department of Psychiatry, Ludwig-Maximilians-University, Munich, for their excellent contribution to the characterization of the participants and the laboratory work. MALDI-TOF genotyping of the replication sample was conducted at the Genetics Research Centre GmbH, which is a joint initiative between GlaxoSmithKline and the Department of Psychiatry, Ludwig-Maximilians-University. Mike Neale, PhD, and Steve Aggen, PhD, contributed to the development of the Mx code for taking measurement invariance into account. Most important, we thank the individuals who have participated in and contributed to these studies.