Results from estimating depression in target samples using PRSs from different depression genome-wide association study (GWAS) discovery samples. Full indicates the total sample for each discovery GWAS, and downsampled indicates each discovery GWAS downsampled to 7500 patients and 42 500 controls. For illustrative purposes, the PGC29 PRS effect size is plotted in both the full and downsampled panels, but this GWAS was not downsampled. Estimation of depression in the AGDS full (A) and downsampled (B) cohorts and the QSkin full (C) and downsampled (D) cohorts. The PGC 2019 PRS, which has the largest sample size, was the best estimator of depression both in cases assessed using DSM-5 criteria (A) and in those assessed using a single self-report item (C). When sample sizes were equal, the lifetime MDD PRS was a better estimator of case status in those meeting DSM-5 criteria (B) but not in those assessed using minimal phenotyping (D). Circles indicate the odds ratio per SD in profile score, with lines showing the 95% CIs.
DepAll indicates self-report of seeing a general practitioner for nerves, anxiety, tension, or worry and at least 2 weeks of depression or anhedonia in the UK Biobank (21 777 cases and 58 396 controls); GPPsy, self-report of seeing a general practitioner for nerves, anxiety, tension, or worry in the UK Biobank (113 262 cases and 219 360 controls); ICD-10, International Statistical Classification of Diseases, 10th Edition code for depression from linked electronic health records in UK Biobank (9176 cases, 203 235 controls); Lifetime MDD, patients meeting DSM-5 criteria for MDD in the UK Biobank and controls that screened negative for MDD (16 301 patients and 50 870 controls); PGC29, meta-analysis of cohorts from PGC-MDD study with clinical diagnoses from interviews or from clinicians (14 833 cases and 23 921 controls); PGC 2019, largest GWAS of depression published to date (includes 246 819 clinically defined and minimally phenotyped patients and 561 485 controls); PsyPsy, self-report of seeing a psychiatrist for nerves, anxiety, tension, or worry in the UK Biobank (36 286 patients, 297 126 controls); SelfRepDep, self-report of history of depression in interview with trained nurses in the UK Biobank (19 805 cases, 234 114 controls); and UKB Broad, self-report of seeing a general practitioner or psychiatrist in the UK Biobank (113 769 cases, 208 811 controls).
Results from estimating comorbid physical disorders in patients with MDD in the Australian Genetics of Depression Study. Full indicates the total sample for each discovery genome-wide association study (GWAS). When sample sizes were equal, the lifetime MDD PRS was a better estimator of comorbid migraine (A) and chronic fatigue syndrome (B) and had the largest effect size for chronic pain (C). DepAll indicates self-report of seeing a general practitioner for nerves, anxiety, tension, or worry and at least 2 weeks of depression or anhedonia in the UK Biobank (21 777 cases and 58 396 controls); GPPsy, self-report of seeing a general practitioner for nerves, anxiety, tension, or worry in the UK Biobank (113 262 cases and 219 360 controls); ICD-10, International Statistical Classification of Diseases, 10th Edition code for depression from linked electronic health records in UK Biobank (9176 cases, 203 235 controls); Lifetime MDD, patients meeting DSM-5 criteria for MDD in the UK Biobank and controls that screened negative for MDD (16 301 patients and 50 870 controls); PGC29, meta-analysis of cohorts from PGC-MDD study with clinical diagnoses from interviews or from clinicians (14 833 cases and 23 921 controls); PGC 2019, largest GWAS of depression published to date (includes 246 819 clinically defined and minimally phenotyped patients and 561 485 controls); PsyPsy, self-report of seeing a psychiatrist for nerves, anxiety, tension, or worry in the UK Biobank (36 286 patients, 297 126 controls); SelfRepDep, self-report of history of depression in interview with trained nurses in the UK Biobank (19 805 cases, 234 114 controls); and UKB Broad, self-report of seeing a general practitioner or psychiatrist in the UK Biobank (113 769 cases, 208 811 controls).
eMethods. Detailed Methods
eTable 1. Clinical Phenotypes in Cases Meeting MDD Criteria in the Australian Genetics of Depression Study Analyzed in the Polygenic Risk Score Analyses
eTable 2. SNV-Based Heritability and Genetic Correlations for Definitions of Depression
eTable 3. Results From Polygenic Risk Estimation of MDD and Self-report Depression Cases Using Different Definitions of Depression
eTable 4. Results From Polygenic Risk Estimation of Clinical Phenotypes of MDD in the AGDS
eTable 5. Results From Polygenic Risk Estimation of Self-report Current Health in the AGDS
eTable 6. Results From Polygenic Risk Estimation of Comorbid Physical Conditions in Cases With MDD
eFigure 1. Schematic of the Design of the Study
eFigure 2. Results From Estimation of Depression in Target Samples Using Controls Screened for Depression Only
eFigure 3. Associations of PRS From Different Definitions of Depression With Clinical Features of Depression in MDD Cases
eFigure 4. Association Between PRS Derived From Different Definitions of Depression and Self-rated Physical and Mental Health Measured on a 5-Point Scale (1 = Very Poor, 5 = Very Good) in Participants in Australian Genetics of Depression Study
Mitchell BL, Thorp JG, Wu Y, et al. Polygenic Risk Scores Derived From Varying Definitions of Depression and Risk of Depression. JAMA Psychiatry. 2021;78(10):1152–1160. doi:10.1001/jamapsychiatry.2021.1988
To what extent does the depth of phenotyping matter in genetic studies of depression?
In this case-control polygenic risk score analysis including 12 106 individuals with major depressive disorder, the major factor in estimating risk was the sample size of the discovery genome-wide association studies. Polygenic risk scores derived from studies assessing diagnostic criteria for major depressive disorder were more strongly associated (higher odds ratios) with somatic symptoms and comorbidities of major depressive disorder.
Results of this study suggest that, to generate potentially better genetic estimations of risk for severe depression, larger genome-wide association study sample sizes, regardless of the depth of phenotyping, should be prioritized.
Genetic studies with broad definitions of depression may not capture genetic risk specific to major depressive disorder (MDD), raising questions about how depression should be operationalized in future genetic studies.
To use a large, well-phenotyped single study of MDD to investigate how different definitions of depression used in genetic studies are associated with estimation of MDD and phenotypes of MDD, using polygenic risk scores (PRSs).
Design, Setting, and Participants
In this case-control polygenic risk score analysis, patients meeting diagnostic criteria for a diagnosis of MDD were drawn from the Australian Genetics of Depression Study, a cross-sectional, population-based study of depression, and controls and patients with self-reported depression were drawn from QSkin, a population-based cohort study. Data analyzed herein were collected before September 2018, and data analysis was conducted from September 10, 2020, to January 27, 2021.
Main Outcome and Measures
Polygenic risk scores generated from genome-wide association studies using different definitions of depression were evaluated for estimation of MDD and, within individuals with MDD, for an association with age at onset, adverse childhood experiences, comorbid psychiatric and somatic disorders, and current physical and mental health.
Participants included 12 106 (71% female; mean age, 42.3 years; range, 18-88 years) patients meeting criteria for MDD and 12 621 (55% female; mean age, 60.9 years; range, 43-87 years) control participants with no history of psychiatric disorders. The effect size of the PRS was proportional to the discovery sample size: the largest study had the largest effect size (odds ratio for MDD, 1.75; 95% CI, 1.73-1.77, per SD of PRS), and the PRS derived from ICD-10 codes documented in hospitalization records in a population health cohort had the lowest odds ratio (1.14; 95% CI, 1.12-1.16). When accounting for differences in sample size, the PRS from a genome-wide association study of patients meeting diagnostic criteria for MDD and control participants was the best estimator of MDD, but not of self-reported depression, and had higher odds ratios for associations with childhood adverse experiences and measures of somatic distress.
Conclusions and Relevance
These findings suggest that increasing sample sizes, regardless of the depth of phenotyping, may be most informative for estimating risk of depression. The next generation of genome-wide association studies should, like the Australian Genetics of Depression Study, have both large sample sizes and extensive phenotyping to capture genetic risk factors for MDD not identified by other definitions of depression.
Depression is a common, often recurrent or severe, psychiatric disorder and one of the leading causes of global disability.1 Depression is characterized by significant heterogeneity in timing of onset, symptom profile, course, response to treatment, and both psychiatric and physical comorbidities.2 Approximately 30% to 40% of the total variance in liability to major depressive disorder (MDD) is attributable to additive genetic factors.3 Since 2015, there have been a number of breakthroughs in identifying genetic risk factors for depression.4-8 In 2019, the Psychiatric Genomics Consortium (PGC) identified 102 independent variants associated with depression. In 2021, a meta-analysis including the Million Veterans Project identified 233 associated variants.9 To achieve the extensive sample sizes needed to identify these loci, a large proportion of cases were defined based on (1) responses to a single screening question regarding seeking professional help for depression, worries, or tension; (2) a self-reported diagnosis of depression during a nurse-led interview in the UK Biobank; (3) online assessment in 23andMe; or (4) a diagnosis from electronic health records (collectively referred to as minimal phenotyping). Thus, many of the individuals were either not assessed for or did not meet the criteria for MDD as defined by the DSM-5.10
Cai and colleagues11 found evidence for differences in genetic architecture between depression defined using minimal phenotyping and MDD assessed using a diagnostic questionnaire, including a higher heritability and enrichment of association in genes expressed in the brain for clinically defined depression and nonspecificity of loci identified using minimal phenotyping. Including minimally phenotyped patients and controls thus substantially boosts power to detect genetic loci, but may increase heterogeneity within and across cohorts and so miss clinically important genetic effects specific to MDD.
The proliferation of large, population-based health studies with genomic information and the increasing availability of administrative health data with diagnostic codes for depression might facilitate valuable insights into the cause of depression. However, the extent to which genetic findings from depression defined by minimal phenotyping extend to clinical diagnoses of depression using diagnostic questionnaires or interviews is a key issue that will inform the interpretation and design of future studies.
Herein, we used the Australian Genetics of Depression Study (AGDS), a large online study of the genetic cause of depression,12 to investigate how polygenic risk scores (PRSs) constructed from different definitions of depression and meta-analyses encompassing multiple definitions map to specific features of clinical depression, such as age at onset, severity, reported trauma, and psychiatric and physical comorbidities. The large sample size and breadth of phenotyping make this a unique cohort for dissecting the genetic architecture of depression.
A schematic overview of the design of the study is shown in eFigure 1 in the Supplement. All protocols and questionnaires were approved by the QIMR Berghofer Medical Research Institute Human Research Ethics Committee. Data analysis for this study was conducted from September 10, 2020, to January 27, 2021. This study followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guideline.
The AGDS is a large ongoing study of the causes of depression and treatment response. The recruitment and sample characteristics of the AGDS have been described in detail elsewhere.12 This present study uses data from the first data freeze in September 2018. Between 2016 and 2018, 20 689 participants (age, 28-58 years; 75% women) provided online consent and enrolled in the study. Participants completed a compulsory module that included the Composite International Diagnostic Interview Short Form13 to assess diagnostic criteria for depression. The compulsory module also assessed psychiatric comorbidities. Before September 2018, a total of 15 792 participants had provided a saliva sample (GeneFix; Isohelix saliva kit).
We evaluated the association between depression PRSs and a number of clinical features of depression in individuals meeting DSM-5 criteria for MDD. These features included early age at onset (defined as reported age at first episode of depression <21 years), reporting more than 2 episodes of depression, childhood trauma (defined as having experienced sexual, physical, or emotional abuse before age 18 years), and a self-reported diagnosis of an anxiety disorder, bipolar disorder, migraine, chronic fatigue, or chronic pain. Furthermore, we investigated the self-reported current measures of psychological distress and somatic symptoms determined using the PSYCH and SOMA subscales, respectively, of the SPHERE-12.14 The sample sizes for each of the phenotypes are shown in eTable 1 in the Supplement.
The QSkin sun and health study is a prospective cohort study initiated in 2011 primarily to examine skin cancer outcomes. Participants aged 40 to 70 years responded to a mailing to residents of Queensland, Australia, selected at random from the electoral roll (n = 43 794). A total of 17 218 QSkin participants provided a saliva sample in 2014; answered the lifestyle questionnaire, which included a disease checklist comprising questions about ever having been diagnosed with psychiatric disorders; and provided consent for their data to be used for future research. Participants of European ancestry who reported not having been given a diagnosis of any psychiatric disorder were selected as controls for the case-control analysis. Those who reported a diagnosis of depression were included in the case cohort.
We evaluated the association of PRSs from summary statistics derived from 9 different genome-wide association studies (GWASs) of depression (Table).4,6,11,15 First, we used the results of the most recent published analysis of the Psychiatric Genomics Consortium Major Depression Working Group (PGC 2019),6 to our knowledge the largest published study with summary statistics available, with the Australian samples (listed as the QIMR cohort) removed to ensure there was no chance of sample overlap. The PGC 2019 study is a meta-analysis including clinical cohorts, population registers, data from 23andMe, and broadly defined depression in the UK Biobank. 23andMe participants provided informed consent and participated in the research online, under a protocol approved by the external Association for the Accreditation of Human Research Protection Programs–accredited institutional review board. Second, we used the published results from a GWAS of broad depression in the UK Biobank that includes individuals with depression defined by answering yes to having sought help for nerves, anxiety, tension, or depression or a diagnosis of depression using linked hospital records. Third, we used summary statistics from the cohorts with clinically defined MDD in the PGC 2019 study. These groups are described as the PGC29 cohorts by Wray et al.4 The summary statistics do not include the QIMR cohorts, but for consistency with previous studies, we refer to this discovery sample as PGC29. The remaining 6 phenotypes and their corresponding downsampled results are from the study of Cai et al,11 who conducted GWASs using 6 different definitions of depression in the UK Biobank. These definitions were based on measures including responses to single questions regarding help seeking, depression diagnoses obtained from linked health records, and MDD defined using DSM-5 criteria.
Because different definitions produce widely varying numbers of cases and controls, which will affect power, we further evaluated the performance of PRSs derived from the 6 definitions of depression in the UK Biobank when each definition is downsampled to give equal numbers of cases and controls across definitions (7500 cases and 42 500 controls) using the summary statistics provided by Cai et al.11 Because the PGC 2019, broadly defined depression in the UK Biobank, and PGC29 studies include depression diagnoses defined in multiple ways rather than a single strict definition, downsampling was not performed.
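As a rough illustration of what equalizing case and control numbers involves, the sketch below draws a fixed number of cases and controls at random from one definition. It is only a sketch under stated assumptions: the downsampled summary statistics in this study were supplied by Cai et al, and the function name, the seed, and the integer participant identifiers are hypothetical and introduced for the example.

```python
import random

def downsample(case_ids, control_ids, n_cases=7500, n_controls=42500, seed=0):
    """Draw fixed-size case and control sets without replacement (illustrative)."""
    if len(case_ids) < n_cases or len(control_ids) < n_controls:
        raise ValueError("too few cases or controls for the requested sample size")
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return rng.sample(case_ids, n_cases), rng.sample(control_ids, n_controls)

# Hypothetical identifiers sized like the Lifetime MDD definition
# (16 301 patients and 50 870 controls in the UK Biobank).
cases = list(range(16301))
controls = list(range(100000, 100000 + 50870))
sub_cases, sub_controls = downsample(cases, controls)
```

Applying the same target counts to every definition removes discovery sample size as an explanation for differences between the resulting PRSs.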
Details of the genotyping and quality control are provided in the eMethods in the Supplement. SBayesR,16 a bayesian method that assumes that single-nucleotide variant (SNV) effects are drawn from a mixture of four 0-mean normal distributions with different variances, was used to generate the weights for the PRSs. This method rescales the GWAS SNV effects with many SNVs assumed to have an effect size of 0. Full details are provided in the eMethods in the Supplement. The posterior SNV effects estimated by SBayesR were used to generate PRSs for each individual using the score function in PLINK.
Polygenic risk scores were standardized to calculate the effect size per SD unit of PRS. We also used linkage disequilibrium score regression17 to calculate the SNV-based heritability for clinical and self-reported depression and the genetic correlation with depression phenotypes from the UK Biobank.
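The scoring and standardization steps can be sketched as follows. This is a minimal illustration, not the PLINK implementation: it assumes each individual's genotype is available as effect-allele dosages (0, 1, or 2), and the SNV identifiers, weights, and function names are made up for the example.

```python
from statistics import mean, pstdev

def polygenic_score(dosages, weights):
    """Sum of effect-allele dosages (0, 1, or 2) weighted by per-SNV effects."""
    return sum(weights[snv] * d for snv, d in dosages.items() if snv in weights)

def standardize(scores):
    """Rescale scores to mean 0 and SD 1 so effects can be read per SD of PRS."""
    mu, sd = mean(scores), pstdev(scores)
    return [(s - mu) / sd for s in scores]

# Made-up SNV identifiers and weights; in the study the weights would be
# the posterior SNV effects estimated by SBayesR.
weights = {"snv_a": 0.020, "snv_b": -0.010, "snv_c": 0.0}
genotypes = [
    {"snv_a": 2, "snv_b": 1, "snv_c": 0},
    {"snv_a": 0, "snv_b": 2, "snv_c": 1},
    {"snv_a": 1, "snv_b": 0, "snv_c": 2},
]
raw_scores = [polygenic_score(g, weights) for g in genotypes]
z_scores = standardize(raw_scores)
```

Because the scores are standardized before regression, any constant rescaling of the raw scores leaves the per-SD effect size unchanged.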
In addition to evaluating the association with clinical depression in AGDS and self-reported depression in QSkin, we examined the association between depression PRSs and a number of clinical features of depression in individuals meeting MDD criteria in the AGDS. These features included early age at onset (defined as reported age at first episode of depression <21 years), reporting more than 2 episodes of depression, childhood trauma (defined as having experienced sexual, physical, or emotional abuse before age 18 years, assessed using part A of the PTSD Checklist for DSM-518), and a self-reported diagnosis of an anxiety disorder, bipolar disorder, migraine, chronic fatigue, or chronic pain. Furthermore, we investigated the self-reported current measures of psychological distress and somatic symptoms measured using the PSYCH and SOMA subscales of the SPHERE-12.14 The sample sizes for each of the phenotypes are reported in eTable 1 in the Supplement.
Each of the full and downsampled PRSs was regressed against the clinical phenotypes of interest using logistic regression for binary variables and linear regression for continuous variables. Continuous variables were standardized before the regression. All analyses included age at enrollment, sex, and 10 genetic principal components as covariates.
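To make the regression step concrete, the sketch below fits a logistic model of case status on a standardized PRS plus covariates and converts the PRS coefficient into an odds ratio per SD with a 95% CI. It is a from-scratch illustration on simulated data, not the software used in the study: the data, the assumed true effects, and the function names are hypothetical, and the 10 genetic principal components are omitted for brevity.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_logistic(rows, y, n_iter=15):
    """Newton-Raphson logistic fit; returns coefficients and standard errors."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for x, yi in zip(rows, y):
            p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, x))))
            for i in range(k):
                grad[i] += (yi - p) * x[i]
                for j in range(k):
                    info[i][j] += p * (1.0 - p) * x[i] * x[j]
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    # Standard errors from the information matrix of the final iteration
    se = []
    for i in range(k):
        unit = [1.0 if j == i else 0.0 for j in range(k)]
        se.append(math.sqrt(solve(info, unit)[i]))
    return beta, se

# Simulated target sample (all values are assumptions for illustration)
rng = random.Random(1)
rows, y = [], []
for _ in range(2000):
    prs = rng.gauss(0.0, 1.0)            # standardized PRS
    age = rng.gauss(0.0, 1.0)            # standardized age at enrollment
    sex = float(rng.random() < 0.5)      # arbitrary 0/1 coding
    logit = -0.2 + 0.5 * prs + 0.3 * sex # assumed true effects
    y.append(1.0 if rng.random() < 1.0 / (1.0 + math.exp(-logit)) else 0.0)
    rows.append([1.0, prs, age, sex])    # leading 1.0 is the intercept

beta, se = fit_logistic(rows, y)
or_per_sd = math.exp(beta[1])                       # odds ratio per SD of PRS
ci_low = math.exp(beta[1] - 1.96 * se[1])           # lower 95% confidence limit
ci_high = math.exp(beta[1] + 1.96 * se[1])          # upper 95% confidence limit
```

Exponentiating the coefficient and its confidence limits is what turns a log-odds estimate into the "odds ratio per SD of PRS" format used in the figures.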
A total of 12 106 (71% female; mean age, 42.3 years; range, 18-88 years) participants of European ancestry who met DSM-5 criteria for MDD were included. A further set of individuals (3083; 68% female) who self-reported a diagnosis of depression but for whom diagnostic criteria were not assessed was drawn from the QSkin study.19 Participants of European ancestry from QSkin who reported not having a diagnosis of any psychiatric disorder were included as controls (12 621; 55% female; mean age, 60.9 years; range, 43-87 years). The SNV-based heritability on the liability scale when comparing individuals with MDD in the AGDS with controls was 0.16 (0.02) and when comparing QSkin participants with self-reported depression with controls was 0.12 (0.06) (eTable 2 in the Supplement).
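For readers unfamiliar with the liability scale, the sketch below shows the standard conversion of an observed-scale SNV heritability from a case-control sample to the liability scale, which depends on the population prevalence and the proportion of cases in the sample. The inputs are assumptions for illustration: the observed-scale value and the population prevalence of 0.15 are not taken from this study, and the sample prevalence of roughly 0.49 simply reflects 12 106 cases and 12 621 controls.

```python
import math
from statistics import NormalDist

def liability_scale_h2(h2_observed, pop_prevalence, sample_prevalence):
    """Convert observed-scale case-control SNV heritability to the liability scale."""
    k, p = pop_prevalence, sample_prevalence
    threshold = NormalDist().inv_cdf(1.0 - k)  # liability threshold for case status
    z = NormalDist().pdf(threshold)            # standard normal density at the threshold
    return h2_observed * (k * (1.0 - k)) ** 2 / (z ** 2 * p * (1.0 - p))

# Assumed inputs for illustration only
h2_liab = liability_scale_h2(h2_observed=0.10, pop_prevalence=0.15, sample_prevalence=0.49)
```

The conversion matters when comparing heritability across definitions, because definitions with different prevalence and case-control ratios are otherwise not on a common scale.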
We evaluated the association of each PRS with case status in the AGDS and QSkin. Regardless of whether the target sample included participants assessed for lifetime MDD (Figure 1A; eTable 3 in the Supplement) or a self-report diagnosis of depression (Figure 1C), the larger the sample size of the discovery GWAS, the larger the effect size of the PRS in the target sample. The largest study (PGC 2019) had the largest effect size (odds ratio [OR] for MDD, 1.75; 95% CI, 1.73-1.77, per SD of PRS), and the PRS derived from International Statistical Classification of Diseases, 10th Edition (ICD-10) codes documented in hospitalization records in a population health cohort had the lowest OR (1.14; 95% CI, 1.12-1.16). For all PRSs, the effect size was larger in the individuals with lifetime MDD, indicating that patients meeting clinical criteria in AGDS have a higher mean depression PRS than those who report having a depression diagnosis in the QSkin community sample. Given equal sample sizes, the lifetime MDD PRS was more strongly associated with lifetime MDD (OR, 1.20; 95% CI, 1.16-1.24) than the other definitions, such as PsyPsy (OR, 1.12; 95% CI, 1.08-1.15) (Figure 1B; eTable 3 in the Supplement). This association was not found when evaluating self-reported depression in QSkin, in which diagnostic criteria were not assessed (Figure 1D). Consistent with these results, the estimated genetic correlation with lifetime MDD in the UK Biobank was higher when including patients with clinically defined MDD (genetic correlation, 0.92; SE, 0.11) than when including self-reported cases (genetic correlation, 0.78; SE, 0.25), although this difference was not statistically significant (eTable 2 in the Supplement). Similarly, despite a larger sample size than the downsampled GWASs, the PRS derived from individuals with clinically defined MDD in PGC29 was not significantly more associated with self-reported depression in QSkin (Figure 1D).
To investigate whether screening controls for all psychiatric disorders affected the results, we repeated the analysis with controls who reported not being diagnosed with depression only (n = 13 696). The increased association of the MDD PRS in the individuals meeting MDD criteria remained (eFigure 2 in the Supplement).
We further sought to evaluate whether there are other clinical features of depression that are better captured by the clinically defined PRS. The results are shown in eFigure 3 and eTable 4 in the Supplement. Across all clinical measures examined, the PRSs from the largest PGC meta-analysis had the largest effect size. Likewise, when considering the different definitions of depression in the UK Biobank, the broad definition that encompasses multiple definitions and self-reports of seeing a physician for nerves, anxiety, tension, or worry, which has the largest sample size, generally gives the best estimation. By contrast, there are a number of notable features of the lifetime MDD PRS. First, consistent with it better capturing the genetic risk for depression that is not shared with other major psychiatric disorders, the lifetime MDD PRS was not significantly higher in those reporting a comorbid anxiety disorder (OR, 1.02; 95% CI, 0.98-1.06; P = .31) or comorbid bipolar disorder (OR, 1.01; 95% CI, 0.95-1.09; P = .80). In comparison, PRSs from multiple other definitions, including both the ICD-10 codes from electronic records and self-reports of seeing a physician for nerves, anxiety, tension, or worry in the UK Biobank, were significantly increased in those with comorbidities. Second, when accounting for differences in sample size, the lifetime MDD PRS was more strongly associated with reporting childhood trauma (OR, 1.14; 95% CI, 1.09-1.19) than the PRS with the next highest OR among the other definitions (OR, 1.07; 95% CI, 1.02-1.12). Third, when accounting for sample size, the lifetime MDD and ICD-10 PRSs were better than other definitions at estimating current levels of somatic distress (eFigure 3 and eTable 4 in the Supplement).
Given the high prevalence of somatic symptoms reported by patients with more severe depressive disorders,20 we hypothesized that genetic analyses based on clinical definitions of depression better capture risk of somatic symptoms of depression than do definitions based on a single question or multiple screening questions, particularly when that question focuses on mood or psychological distress alone. We next investigated the association between the PRSs and current levels of mental and physical health measured on a scale from 1 (very poor) to 5 (excellent). Both the lifetime MDD PRS (β = −0.01 [0.009]; P = .29) and PGC29 PRS (β = −0.004 [0.01]; P = .67) showed no evidence of association with current mental health but showed evidence of association with physical health (lifetime MDD, β = −0.041 [0.009]; P = 6.05 × 10^−6; PGC29, β = −0.023 [0.009]; P = .01). When considering equal sample sizes, the lifetime MDD PRS had a larger effect size for physical health (β = −0.035 [0.009]; P = 7.6 × 10^−5) than other definitions, with the next largest being the ICD-10–based PRS (β = −0.026 [0.009]; P = .003) (eFigure 4 and eTable 5 in the Supplement).
In addition, we investigated which PRSs are associated with reporting common physical comorbidities of depression. When discovery GWAS sample sizes were equal, the lifetime MDD PRS was more strongly associated with migraine (OR, 1.08; 95% CI, 1.04-1.12; P = 4.76 × 10^−5) than the PRS with the next highest OR (GPPsy; OR, 1.02; 95% CI, 0.98-1.07; P = .36). Similarly, the lifetime MDD PRS was more strongly associated with chronic fatigue syndrome (OR, 1.13; 95% CI, 1.05-1.22; P = 7.11 × 10^−4) than the next most associated PRS, which was derived from ICD-10 codes (OR, 1.04; 95% CI, 0.97-1.11; P = .34). The lifetime MDD PRS was also associated with chronic pain (OR, 1.07; 95% CI, 1.01-1.13; P = .02); however, the results from other PRSs were comparable (ICD-10 PRS, OR, 1.06; 95% CI, 1.00-1.12; P = .04) (Figure 2; eTable 6 in the Supplement). This pattern of results suggests that selecting individuals with depression and controls by screening for diagnostic criteria for MDD gives a genetic risk score that is more strongly associated with physiologic perturbations and phenotypes characterized by somatic symptoms than scores from other definitions of depression. However, the PGC29 PRS, which has only clinically defined cases, was associated only with comorbid migraine (OR, 1.06; 95% CI, 1.02-1.11; P = .005).
We evaluated the association of PRSs generated from different discovery samples of depression with depression in individuals meeting clinical criteria and self-reported depression. We found that estimation in the target samples was proportional to the sample size of the discovery GWAS, despite the larger GWASs depending on minimally phenotyped cases. Consistent with the findings of Cai et al,11 we found that when sample sizes of the discovery GWAS were equal, the clinical MDD PRS appeared to be a better estimator of case status in patients with MDD, but not in patients who self-report a diagnosis of depression without being assessed using the MDD criteria. This finding supports the conjecture of Cai et al that GWASs including only patients and controls screened for diagnostic criteria capture a genetic component of risk specific to MDD.
Analyses of clinical phenotypes of MDD showed that when sample sizes are equal, the lifetime MDD PRS is also associated with poor physical health, higher rates of somatic symptoms, and having comorbid migraine or chronic fatigue syndrome. A similar pattern was seen for the PRSs generated using ICD-10 codes for depression from electronic health records, although not as stark as for the MDD PRS (eFigure 3 in the Supplement). In contrast, PRSs derived from analyses using minimal phenotyping are not significantly less useful at estimating measures of severity, such as age at onset and number of episodes. The PGC29 PRS also showed evidence of association with somatic symptoms, but with lower effect sizes than the lifetime MDD PRS. Although patients in the PGC29 discovery GWAS met clinical criteria, there were differences in the ascertainment of patients across the cohorts in the PGC studies and, perhaps more importantly, differences in the screening of controls, with some cohorts using unscreened controls, which may have affected the results.
Somatic symptoms are common in patients with MDD and include fatigue, headaches, and back pain,21 and previous studies have found that a large proportion of patients meeting the criteria for depression present initially to primary care clinicians with somatic symptoms.22 Painful somatic symptoms are associated with increased functional impairment20 and poorer outcomes in patients with depression.21,23
Our results have important implications. If genetic information will have utility in estimating who in the population is at risk of having MDD, then increasing sample size of the discovery sample for GWASs of depression, regardless of the depth of phenotyping, should remain a high priority. If we seek to understand more completely the neurobiological underpinnings of more clinical forms of MDD, then as postulated by Cai and colleagues,11 minimal phenotyping will not capture all of the genetic risk for depression. However, even if studies that do not assess diagnostic criteria for MDD capture genetic risk that is nonspecific, the identified genetic risk factors contribute to the severity of depression as measured by earlier age at onset and chronicity and are therefore of major importance to elucidating the cause of depression.
Another key implication concerns investigating gene-by-environment interactions with childhood trauma using PRSs. Although the PRSs from all of the definitions of depression are enriched in individuals reporting trauma (eFigure 3 in the Supplement), the lifetime MDD PRS has the largest effect size. Thus, the phenotype definition in both the discovery and target samples may affect the results of PRS-by-trauma analyses, and screening for diagnostic criteria in patients and controls will be informative for untangling the association between genetic and environmental risks for depression. The increased PRSs in individuals reporting trauma are consistent with the findings of Coleman et al,24 who showed that the SNV-based heritability was higher in patients with MDD who reported childhood trauma in the UK Biobank. The higher ORs for the MDD PRS may reflect the phenotypic association of trauma with MDD compared with other definitions, as described by Cai et al.11 The phenotypic association of trauma with MDD could induce gene-environment associations influenced by differences in socioeconomic status,25 which would manifest in the discovery GWASs as genetic effects.25,26 Within-family analyses will be valuable for further investigating differences in polygenic risk between patients with and without exposure to trauma.27
The study had limitations. Both the UK Biobank discovery sample and the AGDS rely on structured diagnostic questionnaires to assess criteria for MDD. Although these instruments have been found to have good validity, an interview with a trained clinician remains the standard in diagnosing MDD, and the results of this study should be viewed with caution. Likewise, the UK Biobank, AGDS, and QSkin are cohort studies that have not recruited participants in the clinical setting. They therefore may not be representative of the full clinical spectrum of MDD in the population. In addition, participants in the discovery and target studies are mostly of British and Irish ancestries, and these results may not generalize to other ancestral groups both within and outside Europe.
Results of this case-control study suggest that increasing sample sizes by including patients defined in numerous ways is essential to enhancing our understanding of genetic risk for depression and generating more accurate PRSs for use in research and clinical settings. However, to see a complete picture of the biological characteristics of depression, large, well-phenotyped cohorts that are enriched for clinical depression are needed. The AGDS demonstrates that it is feasible to establish large genetically informative cohorts with in-depth online phenotyping that can provide meaningful insights into the cause of depression.
Accepted for Publication: June 7, 2021.
Published Online: August 11, 2021. doi:10.1001/jamapsychiatry.2021.1988
Corresponding Author: Enda M. Byrne, PhD, Program in Complex Trait Genomics, Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Rd, St Lucia, Brisbane, Australia (firstname.lastname@example.org).
Author Contributions: Dr Byrne had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: Mitchell, Thorp, Nyholt, Hickie, Martin, Medland, Wray, Byrne.
Acquisition, analysis, or interpretation of data: Mitchell, Thorp, Wu, Campos, Nyholt, Gordon, Whiteman, Olsen, Martin, Medland, Wray, Byrne.
Drafting of the manuscript: Mitchell, Byrne.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Mitchell, Thorp, Wu, Campos, Gordon, Medland, Wray, Byrne.
Obtained funding: Whiteman, Olsen, Hickie, Martin, Wray, Byrne.
Administrative, technical, or material support: Mitchell, Campos, Nyholt, Whiteman, Olsen, Hickie, Medland, Wray.
Supervision: Martin, Wray.
Conflict of Interest Disclosures: Dr Campos reported receiving grants from The University of Queensland during the conduct of the study. Dr Whiteman reported receiving grants from the National Health and Medical Research Council (NHMRC) of Australia Fellowship for salary and competitive grants to support data collection and analysis during the conduct of the study and speaker’s fees from Pierre Fabre for conference presentation outside the submitted work. Dr Olsen reported receiving grants from the NHMRC of Australia during the conduct of the study. Dr Hickie was an inaugural commissioner on Australia's National Mental Health Commission (2012-2018) and is the codirector, Health and Policy at the Brain and Mind Centre (BMC) University of Sydney, Australia. The BMC operates early-intervention youth services at Camperdown under contract to headspace. Dr Hickie had previously led community-based and pharmaceutical industry-supported (Wyeth, Eli Lilly, Servier, Pfizer, AstraZeneca) projects focused on the identification and better management of anxiety and depression. He was a member of the Medical Advisory Panel for Medibank Private until October 2017, a board member of Psychosis Australia Trust, and a member of Veterans Mental Health Clinical Reference group. He is the chief scientific advisor to and a 5% equity shareholder in InnoWell Pty Ltd, which was formed by the University of Sydney (45% equity) and PwC (Australia; 45% equity) to deliver the $30 million Australian government-funded Project Synergy (2017-2020, a 3-year program for the transformation of mental health services), and to lead transformation of mental health services internationally through the use of innovative technologies. No other disclosures were reported.
Funding/Support: The Australian Genetics of Depression Study was primarily funded by grant 1086683 from the NHMRC of Australia. This work was further supported by NHMRC grants 1145645, 1078901, and 108788, and by National Institutes of Health grant 1R01MH121545-01.
Role of the Funder/Sponsor: The funding organizations had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Additional Contributions: We are indebted to all of the participants for giving their time to contribute to this study. We thank all the people who helped in the conception, implementation, beta testing, media campaign, and data cleaning. We are grateful to the Psychiatric Genomics Consortium Major Depressive Disorder Working Committee for making summary statistics available for research. We are grateful to all of the principal investigators, researchers, and participants of the cohorts in the Psychiatric Genomics Consortium. We also thank the research participants and employees of 23andMe for making this work possible.
Additional Information: The full GWAS summary statistics for the 23andMe discovery data set will be made available through 23andMe to qualified researchers under an agreement with 23andMe that protects the privacy of the 23andMe participants. More information and application to access the data are available at https://research.23andme.com/collaborate/#dataset-access/.