Genetic and Environmental Risk Factors Associated With Trajectories of Depression Symptoms From Adolescence to Young Adulthood

Key Points

Question: Are genetic and environmental risk factors associated with different trajectories of depression symptoms during adolescence and young adulthood?

Findings: In a cohort study of 3525 individuals observed from ages 10 to 24 years, both genetic and environmental risk factors were associated with childhood-persistent and early-adult–onset trajectories of depression symptoms, while adolescent-limited and childhood-limited trajectories were not associated with genetic risk factors.

Meaning: Differential patterns of timing and the nature of genetic and environmental risk factors were associated with different trajectory groups for depression symptoms, which could help to guide the timing and focus of prevention strategies.


Biological sex
Biological sex was identified from birth notifications taken around the time of delivery and was coded as male or female. Sex data were missing for 4 individuals, which reduced our maximum analytical sample to 9394 (of the 9398 individuals with at least one measurement of depressive symptoms).

Polygenic risk score for depressive symptoms (PRS)
Participants were genotyped using the Illumina HumanHap550 quad chip. Individuals were excluded based on gender mismatches, minimal or excessive heterozygosity, disproportionate levels of individual missingness (>3%), evidence of cryptic relatedness (>10% of alleles identical by descent), insufficient sample replication (IBD < 0.8) and being of non-European ancestry (assessed by multidimensional scaling analysis including HapMap 2 individuals). Thus, our analysis includes only individuals of European descent. Single nucleotide polymorphisms (SNPs) with a minor allele frequency (MAF) of < 1%, an Impute2 information quality metric of < 0.8, a call rate of < 95% or evidence for violations of Hardy-Weinberg equilibrium (P-value < 5e-7) were removed. Imputation was performed using Impute v2.2.2 with the 1000 Genomes reference panel (Phase 1, Version 3), using 2186 reference haplotypes. The maximum number of SNPs that were imputed (and passed filtering on MAF of > 1% and info score > 80%) was 8,282,911. In the case of siblings, one individual was dropped from the analysis in order not to inflate the genetic effect; thus all results are based upon singletons.
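The SNP-level exclusion criteria described above can be illustrated with a minimal sketch. It is not the pipeline used in the study: the `Snp` record, its field names and the `passes_qc` helper are hypothetical, and only the four thresholds are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    """Hypothetical per-SNP summary record (field names are assumptions)."""
    rsid: str
    maf: float        # minor allele frequency
    info: float       # Impute2 information quality metric
    call_rate: float  # proportion of samples with a called genotype
    hwe_p: float      # P value from the Hardy-Weinberg equilibrium test


def passes_qc(snp, min_maf=0.01, min_info=0.8, min_call_rate=0.95, min_hwe_p=5e-7):
    """Keep a SNP only if it triggers none of the removal criteria in the text."""
    return (
        snp.maf >= min_maf
        and snp.info >= min_info
        and snp.call_rate >= min_call_rate
        and snp.hwe_p >= min_hwe_p
    )


# Illustrative values only, not study data.
snps = [
    Snp("rs_example_1", maf=0.20, info=0.95, call_rate=0.99, hwe_p=0.30),
    Snp("rs_example_2", maf=0.005, info=0.95, call_rate=0.99, hwe_p=0.30),
    Snp("rs_example_3", maf=0.20, info=0.95, call_rate=0.99, hwe_p=1e-9),
]
kept = [s.rsid for s in snps if passes_qc(s)]
```

In this toy input only the first SNP is retained; the second fails the MAF threshold and the third fails the Hardy-Weinberg threshold.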
The PRS for depressive symptoms was created in PRSice, 1 using summary statistics from a recent genome-wide association study (GWAS) of depressive symptoms in 161,460 individuals. 2 We included SNPs that had a MAF of > 1% and an info score of > 80%, and excluded SNPs with an R² of > 0.1 that were within 250 kb of each other. We also excluded SNPs located in the extended MHC region (chromosome 6, 26-33 Mb). Polygenic risk scores were created at various P-value thresholds (between 5 × 10⁻⁸ and 0.5), and we used the most liberal threshold (0.5) for prediction, based upon recent evidence that more liberal polygenic scores may be better predictors if the scores are only concerned with maximising prediction. 3,4 Population stratification can be a problem in analyses utilising polygenic risk scores; to account for this, we adjusted our analysis for the first five principal components of ancestry, as per previous studies. 3
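As a rough illustration of what a thresholded polygenic score is, the sketch below sums risk-allele dosages weighted by GWAS effect sizes over the SNPs whose P value falls at or below the chosen threshold, and then standardises the scores across individuals. It is a simplified stand-in and not PRSice's implementation: the function names are hypothetical, clumping is assumed to have been done already, and the inclusive comparison with the threshold is an assumption.

```python
from statistics import mean, pstdev


def polygenic_score(dosages, betas, p_values, p_threshold=0.5):
    """Weighted sum of allele dosages over SNPs with GWAS P value <= threshold.

    dosages  : risk-allele dosages (0 to 2) for one individual, one per SNP
    betas    : GWAS effect sizes, aligned with dosages
    p_values : GWAS P values, aligned with dosages
    """
    return sum(
        d * b
        for d, b, p in zip(dosages, betas, p_values)
        if p <= p_threshold
    )


def standardise(scores):
    """Rescale scores to mean 0 and SD 1 (population SD is an assumption)."""
    m = mean(scores)
    s = pstdev(scores)
    return [(x - m) / s for x in scores]


# Illustrative values only, not study data.
betas = [0.10, -0.20, 0.05]
p_values = [1e-9, 0.30, 0.70]  # the third SNP is excluded at threshold 0.5
cohort_dosages = [[2, 1, 2], [0, 0, 1], [1, 2, 0]]

raw = [polygenic_score(d, betas, p_values) for d in cohort_dosages]
z = standardise(raw)
```

Adjustment for the principal components of ancestry would happen later, in the regression model, rather than in the score itself.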

Childhood bullying
Childhood bullying was measured using the modified Bullying and Friendship Interview Schedule. 5 A child was classed as an overt victim if he or she experienced any of the following five components of overt bullying frequently (several times a month) or very frequently (several times a week):
1. Had personal belongings taken
2. Been threatened/blackmailed
3. Been hit/beaten up
4. Been tricked in a nasty way
5. Been called bad/nasty names
Children who responded with seldom or never to having been bullied for each of the four questions were categorised as not being victims. In addition, children for whom no more than two questions were missing, with the remaining items answered seldom/never, were classed as not being bullied.
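The classification rule above can be written as a short decision function. This is a sketch under stated assumptions: the response labels and the function name are hypothetical, and returning `None` (unclassified) when too many items are missing is an assumption, since the text does not say how such children were handled.

```python
# Hypothetical response labels; the instrument's actual coding may differ.
FREQUENT = {"frequently", "very frequently"}
RARE = {"never", "seldom"}


def classify_overt_victim(responses, max_missing=2):
    """Classify one child from their item responses.

    responses : list of response labels, with None for a missing item
    Returns True (victim), False (not a victim) or None (unclassified).
    """
    # Any item endorsed frequently or very frequently makes the child a victim.
    if any(r in FREQUENT for r in responses):
        return True

    n_missing = sum(r is None for r in responses)
    answered_rare = all(r in RARE for r in responses if r is not None)

    # All answered items seldom/never, with at most `max_missing` items missing.
    if answered_rare and n_missing <= max_missing:
        return False

    # Assumption: everything else is left unclassified.
    return None
```

For example, `classify_overt_victim(["never", "seldom", None, None, "never"])` returns `False`, because the two missing items are within the allowed limit and every answered item is seldom or never.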

Growth Mixture Modelling
Trajectories of depressive symptoms were estimated with intercept, slope and quadratic growth factors for each class. Previous work has suggested that quadratic growth may be useful when modelling the non-linearity of trajectories of depressive symptoms. [6][7][8] To reduce convergence issues, the growth factor variances were constrained to be equal across trajectories. 9 We built a stepwise model starting with a single trajectory (k-class) and continued to add trajectories into the model (k+1) until the optimal number of trajectories was reached. To assess this optimum, we used a range of criteria, including the lowest sample-size-adjusted Bayesian Information Criterion (ssaBIC; which is based on the log-likelihood penalised for model complexity as captured by the number of parameters 10 ) and hypothesis testing, comparing model fit for k versus k-1 trajectories using the adjusted likelihood ratio test (LRT) proposed by Lo-Mendell-Rubin. 11 After determining the optimal number of trajectories, covariates (risk factors) were added into the model to examine risk factors for varying trajectory membership. We then examined whether the inclusion of these risk factors affected the overall shape and sample distribution of these trajectories. During this analysis, a bias-adjusted 3-step approach was used, which takes into account the uncertainty in the classification of participants into each trajectory.

Sex was coded as 0 for males and 1 for females. The PRS was standardised to have a mean of 0 and a SD of 1. Postnatal depression, cruelty to mother and bullied at age 10 were coded as 0 for no and 1 for yes. Anxiety was coded between 0 and 12, with greater scores corresponding to worse childhood anxiety.
a Analysis was conducted using Pearson's correlations.
b Analysis was conducted using Tetrachoric correlations.
c Analysis was conducted using Point-Biserial correlations and verified using Pearson's correlations.
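For readers unfamiliar with the ssaBIC used in the growth mixture modelling above, the sketch below shows the commonly used form of the criterion, in which the usual BIC penalty ln(n) is replaced by ln((n + 2)/24), and how it might be compared across candidate numbers of trajectories. The function names and the log-likelihoods are illustrative assumptions, not values from the study, and the adjusted LRT is not shown.

```python
import math


def ssa_bic(log_likelihood, n_params, n_obs):
    """Sample-size-adjusted BIC: -2 logL + p * ln((n + 2) / 24)."""
    return -2.0 * log_likelihood + n_params * math.log((n_obs + 2) / 24)


def best_by_ssa_bic(fits, n_obs):
    """Return the number of classes k with the lowest ssaBIC.

    fits : dict mapping k to a (log_likelihood, n_params) tuple
    """
    return min(fits, key=lambda k: ssa_bic(fits[k][0], fits[k][1], n_obs))


# Hypothetical fit results for k = 1, 2, 3 classes (illustrative only).
example_fits = {
    1: (-5200.0, 6),
    2: (-5000.0, 10),
    3: (-4995.0, 14),
}
chosen_k = best_by_ssa_bic(example_fits, n_obs=1000)
```

In this toy input the third class improves the log-likelihood too little to offset its four extra parameters, so the two-class model has the lowest ssaBIC. In practice the criterion is weighed alongside the likelihood ratio test rather than used alone.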