A, Venn diagram representing the steps of microarray selection by quality control and verification of genetic information. The blue circle represents arrays that failed quality control. The green circle represents individuals with discordance between pedigree information and array data (sex and relationship). The red circle represents duplicated microarrays. B, Flowchart representing the steps of CNV selection applied after microarrays of good quality were selected. kb indicates kilobase; seg dup, segmental duplication; and SYS, Saguenay Youth Study.
A, Distribution of the loss of PIQ per gene estimated by model 4 for all RefSeq genes (N = 17 102). The black solid vertical line is the median, the dashed black vertical line is the mean, and the dashed orange vertical line is the mean individual effect of genes included in rare deletions (model 2). Of note, 36% of genes have an associated estimated loss of PIQ less than −0.67, 33% (n = 5597) are predicted to affect IQ greater than or equal to 1 point, 23% (n = 3949) by greater than or equal to 2 points, and 6% (n = 968) have maximum effect size. B, Concordance between loss of PIQ estimated by model 4 (y-axis) and loss of PIQ measured by previously published studies (x-axis) for 15 recurrent CNVs. Each point corresponds to a known recurrent CNV: (1) 17p12 (HNPP), (2) 16p12.1, (3) 15q11.2, (4) 16p13.11, (5) 1q21.1 TAR, (6) 17q12, (7) 16p11.2 Distal (SH2B1), (8) 1q21.1 Distal (Class I), (9) 15q13.3 (BP4-BP5), (10) 16p11.2 proximal (BP4-BP5), (11) 22q11.2, (12) 7q11.23 (Williams-Beuren), (13) 3q29 (DLG1), (14) 8p23.1, and (15) 17p11.2 (Smith-Magenis). The diagonal dashed line represents exact concordance. When loss of IQ was not directly measured in a previous study, we derived the loss of IQ from the published OR measuring the enrichment of a CNV in the neurodevelopmental clinic (open circles). Of note, the PIQ loss estimated by the model is 4.4 points for the 17q21.31 deletion, including KANSL1, a causal gene for intellectual disabilities. This is discordant with the estimated decrease of 36.3 points based on empirical data from the literature. Including this CNV, the concordance is 0.62 (95% CI, 0.20-0.85). C, Coverage of genes by the Center Hospitalier Universitaire Sainte-Justine (CHU-SJ) cohort according to significance level of CNVs. The y-axis represents the number of unique genes observed in the cohort, and the x-axis represents the number of individuals seen in the cohort. Coverage was obtained using 1000 iterations (bootstrap procedure on the order of individuals’ inclusion). D, Density of loss of PIQ estimated by model 4 for genes included in CNVs from CHU-SJ for the different clinical significance categories of CNVs. Solid lines correspond to medians, and the dashed lines correspond to means. ICC3,1 indicates intraclass correlation coefficient (3,1); VUS, variant of unknown significance.
A, Probability of occurring de novo estimated by model 5 (y-axis) according to the loss of performance IQ (PIQ) estimated by model 4 (x-axis) for the 897 deletions from the Center Hospitalier Universitaire Sainte-Justine cohort and 1264 deletions from the Simons Simplex Collection cohort for which inheritance status is available (the orange lines are all copy number variants [CNVs], and the gray lines are nonrecurrent CNVs). The de novo frequency increased even for modest associations with IQ (eg, −5 points of PIQ is associated with a de novo frequency of 7.8%). B, Concordance between de novo frequency observed in DECIPHER (x-axis) and the probability of being de novo estimated by model 5 (y-axis) for 15 recurrent CNVs. Each point corresponds to a known recurrent CNV: (1) 17p12 (HNPP), (2) 16p12.1, (3) 15q11.2, (4) 16p13.11, (5) 1q21.1 TAR, (6) 17q12, (7) 16p11.2 Distal (SH2B1), (8) 1q21.1 Distal (Class I), (9) 15q13.3 (BP4-BP5), (10) 16p11.2 proximal (BP4-BP5), (11) 22q11.2, (12) 7q11.23 (Williams-Beuren), (13) 3q29 (DLG1), (14) 8p23.1, and (15) 17p11.2 (Smith-Magenis). The diagonal (first bisector) represents perfect concordance. ICC3,1 indicates intraclass correlation coefficient (3,1).
eTable 1. Difference of IQ Between Male and Female in IMAGEN and SYS Cohorts
eTable 2. Breakpoints Used to Detect Recurrent CNVs
eTable 3. Empirical Data on Recurrent CNVs
eTable 4. Frequency of Rare Autosomal CNVs in IMAGEN and SYS Cohorts and Comparison With Mannik et al’s Cohort
eTable 5. Frequency of Rare X-linked CNVs in IMAGEN and SYS Cohorts
eTable 6. Comparison of Proportion of Rare Autosomal Deletions ≥250 kb, Duplications ≥1 Mb, and Recurrent CNVs Between Males and Females
eTable 7. Impact on IQ of Rare CNVs by Category of Size and Recurrent CNVs
eTable 8. Average of AIC and BIC for Models 1-3 Obtained by Bootstrap on Pooled Cohort
eTable 9. The Cumulative Effect on IQ of CNV Size, Number of Complete Genes, and Number of Exon (Models 1-3) on the Subset of Individuals With European Ancestry
eTable 10. Estimates of Model 4 in IMAGEN and SYS Cohorts
eTable 11. Results of Variables Selection, Performing on the Subset of 1713 Individuals From The Pooled Cohort Carrying at Least One Deletion, Using Bootstrap Stepwise Procedure Based on BIC in Multiple Linear Regression Where Dependent Variable Is Either the PIQ or VIQ and Candidate Predictors Are Deletion Contents
eTable 12. Results of PCA Analysis on Gene-Related Variables, Including the 1713 Individuals of the Pooled Cohort Having at Least One Deletion.
eTable 13. Results of Variables Selection, Performing on the Subset of 1713 Individuals From The Pooled Cohort Carrying at Least One Deletion, Using Bootstrap Stepwise Procedure Based on BIC in Multiple Linear Regression Where Dependent Variable Is Either the PIQ or VIQ and Candidate Predictors Are Deletion Contents After PCA for Related to Gene Variables
eTable 14. Results of Sensitivity Analyses for Models 1-3
eTable 15. Results of Sensitivity Analyses for Models 1-3 in the Subset of Adolescent With European Ancestry
eTable 16. Estimates and Sensitivity Analysis for Model 4
eTable 17. Table of Impact of IQ in CHU-SJ Cohorts by CNVs Classification
eTable 18. De Novo CNVs in SYS
eTable 19. De Novo in General Population
eMethods. Supplemental Methods
eFigure 1. Representation of Ancestries Information for IMAGEN and SYS
eFigure 2. Definition of Shift of IQ and Threshold From Which a Subject Is Referred in Clinic
eFigure 3. Simulated Odds Ratios Based on Shift of IQ and Threshold From Which an Individual is Referred to Clinic
eFigure 4. Overlap Between Rare Recurrent and Nonrecurrent CNVs in IMAGEN and SYS Children
eFigure 5. Predicted Loss of VIQ for Recurrent CNVs and Nonrecurrent CNVs Observed in the Clinic
eFigure 6. Estimation of the Loss of PIQ for Each Gene Included in 16 Recurrent Deletions Known to Be Associated With Psychiatric Disorders
eFigure 7. Graph of Correlations Between Variables Included in Selection Procedure for Individuals Having at Least One Deletion (N=1713)
eFigure 8. Log R Ratio and B Allele Frequency for Children Having De Novo and Parents From SYS Cohort
eFigure 9. Effect on IQ of De Novo in SYS Cohort
eResults. Supplemental Results
Huguet G, Schramm C, Douard E, et al. Measuring and Estimating the Effect Sizes of Copy Number Variants on General Intelligence in Community-Based Samples. JAMA Psychiatry. 2018;75(5):447–457. doi:10.1001/jamapsychiatry.2018.0039
Can we measure and estimate the effect sizes of recurrent and rare nonrecurrent pathogenic copy number variants on IQ?
The haploinsufficiency scores best explain the effect size of deletions on IQ measured in 1713 deletion carriers from 2 general population cohorts. IQ is affected by 2.74 points per deleted unit of the probability of being loss-of-function intolerant, and models estimate the effect size of deletions on IQ with a concordance of 0.75.
Effect sizes on IQ of most deletions can be reliably estimated by models using haploinsufficiency scores, and the effect sizes of haploinsufficiency are broadly distributed across the genome.
Copy number variants (CNVs) classified as pathogenic are identified in 10% to 15% of patients referred for neurodevelopmental disorders. However, their effect sizes on cognitive traits measured as a continuum remain mostly unknown because most of them are too rare to be studied individually using association studies.
To measure and estimate the effect sizes of recurrent and nonrecurrent CNVs on IQ.
Design, Setting, and Participants
This study identified all CNVs that were 50 kilobases (kb) or larger in 2 general population cohorts (the IMAGEN project and the Saguenay Youth Study) with measures of IQ. Linear regressions, including functional annotations of genes included in CNVs, were used to identify features to explain their association with IQ. Validation was performed using intraclass correlation that compared IQ estimated by the model with empirical data.
Main Outcomes and Measures
Performance IQ (PIQ), verbal IQ (VIQ), and frequency of de novo CNV events.
The study included 2090 European adolescents from the IMAGEN study and 1983 children and parents from the Saguenay Youth Study. Of these, genotyping was performed on 1804 individuals from IMAGEN and 977 adolescents, 445 mothers, and 448 fathers (484 families) from the Saguenay Youth Study. We observed 4928 autosomal CNVs larger than 50 kb across both cohorts. For rare deletions, size, number of genes, and exons affect IQ, and each deleted gene is associated with a mean (SE) decrease in PIQ of 0.67 (0.19) points (P = 6 × 10−4); this is not so for rare duplications and frequent CNVs. Among 10 functional annotations, haploinsufficiency scores best explain the association of any deletions with PIQ with a mean (SE) decrease of 2.74 (0.68) points per unit of the probability of being loss-of-function intolerant (P = 8 × 10−5). Results are consistent across cohorts and unaffected by sensitivity analyses removing pathogenic CNVs. There is a 0.75 concordance (95% CI, 0.39-0.91) between the effect size on IQ estimated by our model and IQ loss calculated in previous studies of 15 recurrent CNVs. There is a close association between effect size on IQ and the frequency at which deletions occur de novo (odds ratio, 0.86; 95% CI, 0.84-0.87; P = 2.7 × 10−88). There is a 0.76 concordance (95% CI, 0.41-0.91) between de novo frequency estimated by the model and calculated using data from the DECIPHER database.
Conclusions and Relevance
Models trained on nonpathogenic deletions in the general population reliably estimate the effect size of pathogenic deletions and suggest omnigenic associations of haploinsufficiency with IQ. This represents a new framework to study variants too rare to perform individual association studies and can help estimate the cognitive effect of undocumented deletions in the neurodevelopmental clinic.
Copy number variants (CNVs) contribute to a spectrum of neurodevelopmental disorders (NDDs) and psychiatric disorders, including intellectual disabilities (IDs), autism spectrum disorders, and schizophrenia.1-9 With the routine implementation of whole genome chromosomal microarrays in medical diagnostics, pathogenic CNVs (as defined by the American College of Medical Genetics10) are identified in 10% to 15% of children referred for NDDs.11 Copy number variants may arise recurrently by nonallelic homologous recombination in unrelated individuals. Several recurrent CNVs have been individually associated with IDs,7,8 autism spectrum disorders,1-3,12 and schizophrenia.5,6 Beyond association with a psychiatric diagnosis, little is known about the effect size of CNVs on cognitive traits. A study13 performed in the general population of Iceland found that 26 psychiatric CNVs reduce, in aggregate, IQ by 15 points or 1 SD. With the use of the cognitive tests available in the UK Biobank, 54 loci were associated with decreased scores, ranging from 0.1 to 0.5 SD.14
However, most pathogenic CNVs reported back to patients are undocumented because they are ultrarare or even private to the patient or family.11,15 They cannot be investigated using individual association studies. Their associations with cognition and mechanisms by which they lead to neurodevelopmental symptoms remain unknown. These nonrecurrent CNVs have been studied in aggregate by size categories in a general population sample of 6819 individuals from Estonia.16 In this cohort, rare, large, and intermediate (>250 kilobases [kb]) deletions and large, rare duplications (>1 megabase [Mb]) were found in 10% of the population. In aggregate, these CNVs were associated with IDs and adversely affected educational achievement; cognitive measures were unavailable.
The aim of this study was to calibrate and validate models to measure and estimate effect sizes of nonrecurrent pathogenic CNVs on general intelligence measured by IQ. To achieve this, we estimated effect sizes of rare recurrent and nonrecurrent CNVs on IQ using 2 general population cohorts. We then scored CNVs using 10 functional annotations to identify variables that contribute the most to variation in IQ. This model, which was subsequently validated, will help clinicians and researchers estimate the association of pathogenic CNVs with IQ.
We used 2 cohorts recruited from the general population: IMAGEN,17 including 2090 adolescents from Europe, and the Saguenay Youth Study (SYS),18 including 1983 individuals (1032 children, 951 parents, 486 families) from Quebec, Canada. All children completed tests of verbal IQ (VIQ) and performance IQ (PIQ) using the Wechsler Intelligence Scale for Children, Fourth Edition19 (subset) for IMAGEN and Wechsler Intelligence Scale for Children, Third Edition20 for SYS. Distributions of IQ scores are available in eTable 1 in the Supplement. The IMAGEN project obtained ethical approval from the local ethics committees and written informed consent from all participants and their legal guardians. For SYS, the institutional review boards of all participating institutions approved all studies reported herein. For SYS and IMAGEN, the parents and adolescents provided written informed consent and assent, respectively. All data were deidentified.
We used the chromosomal microarray database from the cytogenetic laboratory of the pediatric hospital of Center Hospitalier Universitaire Sainte-Justine (CHU-SJ; Montreal, Canada), including 16 586 individuals referred for NDDs, and the Simons Simplex Collection (SSC),12,21 including 2591 children with autism spectrum disorders and their family members.
Genotyping technologies are detailed in the eMethods in the Supplement. A total of 1804 individuals from IMAGEN and 977 adolescents, 445 mothers, and 448 fathers (484 families) from SYS (Figure 1A) met stringent quality control criteria (call rate ≥99%, log R ratio SD <0.35, B allele frequency SD <0.08, and wave factor <0.05). We computed relatedness separately in IMAGEN and SYS based on the identity by state using PLINK.22 The CNV detections from PennCNV23 and QuantiSNP24 were combined to minimize the number of potential false discoveries. We used standard filtering strategies detailed in the eMethods in the Supplement.
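The quality control thresholds listed above can be read as a simple per-array filter. The following is a minimal sketch, not the authors' pipeline: the `ArrayQC` record, its field names, and the `passes_qc` function are hypothetical, and the wave factor criterion is applied literally as written in the text, which is an assumption about how that threshold is meant.

```python
from dataclasses import dataclass


@dataclass
class ArrayQC:
    """Hypothetical per-array quality metrics (field names are illustrative)."""
    sample_id: str
    call_rate: float    # fraction of markers called, between 0 and 1
    lrr_sd: float       # SD of the log R ratio
    baf_sd: float       # SD of the B allele frequency
    wave_factor: float  # genomic wave factor


def passes_qc(array: ArrayQC) -> bool:
    """Apply the four thresholds quoted in the text to one array."""
    return (
        array.call_rate >= 0.99
        and array.lrr_sd < 0.35
        and array.baf_sd < 0.08
        and array.wave_factor < 0.05  # assumption: threshold taken as written
    )


# Illustrative values only; these are not data from the study.
arrays = [
    ArrayQC("sample_1", 0.995, 0.20, 0.04, 0.01),
    ArrayQC("sample_2", 0.980, 0.20, 0.04, 0.01),  # fails on call rate
]
retained = [a.sample_id for a in arrays if passes_qc(a)]
```

Relatedness checks and the combination of PennCNV and QuantiSNP calls are separate steps and are not covered by this sketch.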
We annotated CNVs for size and number of genes using RefSeq genes (https://genome.ucsc.edu/), and genes were annotated using the probability of being loss-of-function intolerant (pLI),25 the residual variation intolerance score,26 the score rate for intolerance for deletions and duplications,27 the number of protein-protein interactions,28 and the differential stability score29 of regional patterns of gene expression in the brain. These 5 scores were transformed, and the score associated with a CNV is the sum of scores of genes with all isoforms fully contained in the CNV (complete genes) (eMethods in the Supplement). The CNVs were also annotated with 2 lists of genes, including postsynaptic density of the human cortex,30 genes regulated by the Fragile-X mental retardation protein,31 and the number of expression quantitative trait loci regulating genes expressed in the brain32 (eMethods and eTable 2 in the Supplement).
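The scoring rule described above (the score of a CNV is the sum of the scores of the genes whose isoforms are all fully contained in it) can be sketched as follows. This is an illustration under stated assumptions: the `Gene` record, the coordinates, and the gene names are hypothetical, each gene's `start` and `end` are assumed to span all of its isoforms, and the score transformation mentioned in the text is not applied because it is only described in the eMethods.

```python
from typing import Iterable, NamedTuple


class Gene(NamedTuple):
    """Hypothetical gene record; start/end are assumed to span all isoforms."""
    name: str
    chrom: str
    start: int
    end: int
    score: float  # eg, pLI; the transformation from the eMethods is omitted


def cnv_score(chrom: str, cnv_start: int, cnv_end: int, genes: Iterable[Gene]) -> float:
    """Sum the scores of genes fully contained in the CNV (complete genes)."""
    return sum(
        g.score
        for g in genes
        if g.chrom == chrom and cnv_start <= g.start and g.end <= cnv_end
    )


# Illustrative coordinates and scores only.
genes = [
    Gene("GENE_A", "chr1", 100, 200, 0.9),  # fully inside the CNV below
    Gene("GENE_B", "chr1", 250, 400, 0.5),  # only partially overlapped
    Gene("GENE_C", "chr2", 120, 180, 1.0),  # different chromosome
]
score = cnv_score("chr1", 50, 300, genes)
```

Partially overlapped genes contribute nothing in this sketch, which mirrors the "complete genes" wording of the text.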
Only autosomal CNVs were analyzed, and 3 outliers were excluded (eMethods in the Supplement). P < .05 indicates statistical significance, and all tests were 2-sided.
We performed 3 multiple linear regressions to quantify the effect on PIQ and VIQ of CNV size (model 1), number of genes (model 2), and number of exons (model 3). For each model, the variable of interest was measured in 4 categories of CNVs according to frequency (rare or common) and type (deletion or duplication). Models 1 through 3 included adjustment for ancestry, sex, age, microarray technology, and intrafamilial relatedness (eMethods and eFigure 1 in the Supplement).
We performed a stepwise variables selection procedure based on the Bayesian information criterion33 to investigate 10 variables that would best explain the association of deletions with PIQ and VIQ (eMethods in the Supplement). The best model is denoted as model 4 in the remainder of this article. Sensitivity analyses were performed for models 1 through 4 (eMethods in the Supplement).
We then examined whether model 4 could predict the association of IQ with 15 known recurrent CNVs by calculating the concordance between model prediction and empirically measured loss of IQ obtained from previous publications (eMethods, eFigures 2 and 3, and eTable 3 in the Supplement). The concordance was computed using the intraclass correlation coefficient (3,1) (ICC3,1).34
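For readers unfamiliar with ICC3,1, the sketch below computes it from the standard two-way analysis of variance decomposition (consistency form, single measurement). It assumes each row is one CNV and each column is one measurement method (for example, model estimate and published estimate); the function name `icc_3_1` and the input layout are illustrative and are not taken from the study's code.

```python
from typing import Sequence


def icc_3_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(3,1): rows are targets (eg, CNVs), columns are methods (eg, model vs literature)."""
    n = len(ratings)     # number of targets
    k = len(ratings[0])  # number of methods
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)               # between-target mean square
    ems = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems)
```

Because ICC3,1 measures consistency, a constant offset between the two methods does not lower the coefficient; only disagreement in the ranking and spacing of the CNVs does.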
Using data on inheritance from the CHU-SJ and SSC cohorts, we performed a logistic regression model (model 5) to establish the association between the probability at which CNVs occur de novo and their association with IQ predicted by model 4. We computed the ICC3,1 to evaluate the concordance between the probability for a CNV to be de novo predicted by model 5 and de novo frequency for the same 15 recurrent CNVs using data from the DECIPHER database (http://decipher.sanger.ac.uk) (eTable 3 in the Supplement).
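To make the logistic model concrete, the sketch below shows how an odds ratio per IQ point translates into a change in de novo odds and probability. It uses the odds ratio of 0.86 reported in the Results and assumes that the predictor is the signed PIQ effect predicted by model 4 (losses are negative), so that a larger loss raises the odds. The function names and the reference probability passed by the caller are hypothetical, and the intercept of model 5 is not given in the text.

```python
# Odds ratio per point of predicted PIQ effect, as reported for model 5.
OR_PER_POINT = 0.86


def odds_fold_change(delta_piq: float, or_per_point: float = OR_PER_POINT) -> float:
    """Factor by which de novo odds change when the predicted PIQ effect changes by delta_piq.

    Assumption: losses are negative, so delta_piq = -5 means 5 more points of loss.
    """
    return or_per_point ** delta_piq


def shifted_probability(p_ref: float, delta_piq: float,
                        or_per_point: float = OR_PER_POINT) -> float:
    """De novo probability after shifting the predicted effect from a reference point."""
    odds = p_ref / (1.0 - p_ref) * odds_fold_change(delta_piq, or_per_point)
    return odds / (1.0 + odds)


# Five additional points of predicted loss multiply the odds by 0.86 ** -5.
fold = odds_fold_change(-5)
```

Under these assumptions, 5 additional points of predicted loss correspond to roughly a 2.1-fold increase in the odds of a deletion being de novo.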
We observed 4928 autosomal CNVs larger than 50 kb across both cohorts (Figure 1B and Table 1). Rare CNVs of 250 kb or larger (n = 308) are mostly nonrecurrent (92.8%), and their frequencies, similar across both cohorts, are identical to a previously published study16 (eResults, eFigure 4, and eTables 4-6 in the Supplement). We examined variables recurrently associated with NDDs and psychiatric disorders,7,8,16 namely, CNV size (model 1), number of genes (model 2), and number of exons (model 3), and estimated their association with IQ for 4 CNV categories, namely, common and rare deletions and duplications. In all 3 models, only rare deletions had significant effects on IQ. The effect of size (model 1) can be illustrated by a decrease of PIQ (mean [SE], 5.7 [2.0] points; P = 6 × 10−4) and VIQ (mean [SE], 3.6 [2.0] points; P = .03) for each deleted Mb (Table 2). These results are concordant with comparisons between carriers and noncarriers of rare CNVs stratified by size (eTable 7 in the Supplement). In model 2, each gene deleted by a rare CNV decreases PIQ by a mean (SE) of 0.67 (0.19) points (P = 6 × 10−4) and VIQ by 0.72 (0.19) points (P = 2 × 10−4). In model 3, each exon deleted by a rare CNV decreases PIQ by a mean (SE) of 0.07 (0.02) points (P = 2 × 10−5) and VIQ by 0.06 (0.02) points. For models 1 through 3, effects are similar in both cohorts separately. We found no measurable associations of common deletions or duplications with IQ (Table 2). The distributions of Akaike information criterion and Bayesian information criterion, obtained by fitting the models 1 through 3 on 1000 bootstrap samples of the pooled data set each, show that gene and exon contents provide a better fit than size (eTable 8 in the Supplement). Applying models 1 through 3 on individuals with European ancestry shows similar results (eTable 9 in the Supplement).
To understand factors that potentially drive the associations of deletions with IQ, we investigated 10 functional annotations in the subset of 1713 individuals carrying at least 1 autosomal deletion in the pooled data set. The stepwise variable selection procedure converges on model 4, including pLI alone (PIQ: effect = −2.69, bias-corrected effect = −2.74, SE = 0.68, P = 8 × 10−5; VIQ: effect = −2.41, bias-corrected effect = −2.52, SE = 0.71, P = 7 × 10−4). The associations of pLI estimated in IMAGEN and SYS separately are the same, and no differences are observed between the association with PIQ and VIQ (eTable 10 in the Supplement). In the bootstrap procedure, pLI is the most frequently selected covariate for PIQ (37.8%) and the second most frequently selected covariate for VIQ (23.5%) behind the residual variation intolerance score (28.1%) and is always preferred to size or number of genes. Model 4 relies on pLI score, and the distribution of associations with PIQ of 17 102 individual genes shows that 33% of coding genes are predicted to affect PIQ by 1 point or more and 23% by 2 points or more. More than 968 genes (6%) have a maximum pLI of 1, with a corresponding effect size of −2.7 for PIQ and −2.5 for VIQ (Figure 2A and eFigure 5A in the Supplement), demonstrating that the model cannot estimate the association of 93 causal genes for IDs with very large or extreme associations with IQ35 (eTable 3 and eFigure 6 in the Supplement). The variable selection procedure performed after a principal component analysis does not provide a better fit than model 4 (eMethods, eResults, eFigure 7, and eTables 12 and 13 in the Supplement). Of note, there is no interaction between sex and any of the variables tested in models 1 through 4.
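The additive logic of model 4 can be illustrated with the bias-corrected coefficients reported above (−2.74 points of PIQ and −2.52 points of VIQ per unit of summed pLI). This is a minimal sketch of the deletion's contribution only: the function name and the example pLI values are hypothetical, and the intercept and covariates of the fitted model are not reproduced.

```python
from typing import Dict, Iterable

# Bias-corrected effects per unit of summed pLI, as reported in the text.
PIQ_PER_PLI = -2.74
VIQ_PER_PLI = -2.52


def predicted_iq_effect(pli_scores: Iterable[float]) -> Dict[str, float]:
    """Predicted change in PIQ and VIQ for a deletion, given the pLI of its complete genes."""
    total_pli = sum(pli_scores)
    return {"PIQ": PIQ_PER_PLI * total_pli, "VIQ": VIQ_PER_PLI * total_pli}


# Illustrative deletion containing 3 complete genes (pLI values are made up).
effect = predicted_iq_effect([1.0, 0.5, 0.0])
```

Because the per-gene contribution is capped by a pLI of 1, a single gene can never account for more than 2.74 points of PIQ in this sketch, which is the limitation the text notes for causal ID genes.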
We examined whether a subgroup of CNVs biased or overly influenced the results. Sensitivity analyses show that effect sizes of rare deletions and pLI on IQ are unchanged even after removing carriers with CNVs of 1 Mb or greater as well as recurrent CNVs previously associated with psychiatric NDDs. Transformed variables did not improve any of the models (eTables 14-16 in the Supplement). Additional sensitivity test results are detailed in eResults in the Supplement.
We compared IQ loss predicted by the model to IQ loss empirically measured in previous studies20,21 of 15 known recurrent CNVs without causal genes for IDs (eTable 3 and eFigure 6 in the Supplement). The concordance is 0.75 for PIQ (95% CI, 0.39-0.91; P = 5 × 10−4) and 0.72 for VIQ (95% CI, 0.35-0.90, P = 8 × 10−4) (Figure 2B and eFigure 5B in the Supplement). Widths of CIs are correlated with the effect size of the CNV, reflecting that CNVs with high pLI are rarely observed and are different from the distribution of CNVs observed in our general population cohorts. Of note, these results are similar whether we include or exclude, from the training data set, the 3 recurrent CNVs observed in 6 individuals in the pooled cohort (16p11.2 proximal BP4-BP5: 1 adolescent from IMAGEN; 16p12.1: 1 adolescent from IMAGEN and 2 sisters from SYS; 16p13.11: 2 adolescents from IMAGEN).
The widespread but small effect size of haploinsufficiency implies that pathogenic deletions could be found throughout the genome if the aggregate haploinsufficiency score is high enough to affect IQ. This finding is consistent with the fact that more than one-third (7429) of the coding genome is deleted by 1217 pathogenic autosomal deletions reported back to patients by the CHU-SJ (Figure 2C). Of note, genes included in pathogenic variants (n = 6799) or variants of unknown significance deletions (n = 1396) have a higher pLI than those included in benign deletions (n = 928) (Wilcoxon P = 4.8 × 10−14 for genes included in pathogenic versus benign deletions and Wilcoxon P = 9.7 × 10−6 for genes included in variant of unknown significance vs benign deletions) (Figure 2D and eFigure 5C in the Supplement).
As an illustration, we estimated the association with IQ of the aforementioned 1217 pathogenic deletions: the top quartile (25% of CNVs) is estimated to decrease IQ by more than 28 points, whereas the 2 middle quartiles decrease IQ between 28 and 4 points (eTable 17 in the Supplement). Of note, estimates for the lower quartile are smaller than a 4-point decrease in IQ, but most of the latter CNVs cannot be estimated properly because they disrupt a causal gene with large effects.
In the neurodevelopmental clinic, de novo events are regarded as strong arguments in favor of pathogenicity, and CNV size was previously associated with de novo frequency.36 However, to our knowledge, the exact association between effect size on IQ and de novo frequency has not been studied. We examined inheritance of 2161 deletions 50 kb or larger from the CHU-SJ and SSC cohorts. The logistic regression model (model 5) suggests a tight association between effect size on IQ (estimated by model 4) and probability of being a de novo CNV (odds ratio, 0.86; 95% CI, 0.84-0.87; P = 2.7 × 10−88) (Figure 3A). Results are similar when recurrent CNVs are excluded. The concordance between the probability of occurring de novo estimated by model 5 and de novo frequency calculated using empirical data from the DECIPHER database on 15 recurrent deletions is 0.77 (95% CI, 0.43-0.91; P = 2.7 × 10−4) (https://decipher.sanger.ac.uk/) (Figure 3B). We also examined 1147 CNVs 100 kb or larger in 837 adolescents and their parents from the general population (SYS). Seventeen occurred de novo (1.5%; 6 deletions and 11 duplications), which is similar to frequencies previously reported in the general population (eFigure 8 and eTables 18 and 19 in the Supplement). Among the 6 de novo deletions, 3 have never been referenced in the database of genomic variants, suggesting class 1 de novo events37 (eFigure 9B, cases 1, 5, and 6, in the Supplement). Although model 4 predicts a large effect of −28.7 points for the deletion in case 5, the predicted effect for cases 1 and 6 is less than 5 points, suggesting that among class 1 de novo events37 effect sizes can be modest. The 3 other CNVs (cases 2, 3, and 4) have general population frequencies greater than 0.01%, suggesting class 2 de novo events37 consistent with the small predicted associations with IQ.
This study quantifies and predicts the effect size of deletions on IQ using data from the general population and clinical cohorts. Deletions are associated with a decrease in general intelligence, and our models suggest that the effect of haploinsufficiency can be reliably predicted for most pathogenic deletions. This approach provides a framework for studying the effect of CNVs that are too rare to study in individual association studies.
Our study suggests that haploinsufficiency of most of the coding genome potentially influences general intelligence, and one-third of the coding genome affects IQ by 0.67 points or more (mean effect of genes included in rare deletions). This finding is consistent with the omnigenic model37 of complex traits based on the observation that genome-wide association study association signals are spread across most of the genome, including variants near many genes without any obvious connection to disease.
This finding has important implications for the clinical interpretation and functional studies of CNVs. A dominant hypothesis that guides many studies is that a major gene(s) contributes to most of the neurodevelopmental symptoms observed in CNV carriers. Our study suggests an alternative hypothesis that the large effect size observed in pathogenic deletions may be polygenic in nature and attributable to the sum of small individual effects of each gene included in the deletion. This hypothesis could explain why causal genes or major drivers have been difficult to identify in most recurrent CNVs.38,39
Intriguingly, our model predicts reasonably the effect size of the Smith-Magenis deletion without attributing a large effect size to the RAI1 gene (OMIM 607642) on IQ. Although RAI1 causes most of the dysmorphic and disruptive behavioral features of Smith-Magenis syndrome,40 its association with IQ may be smaller than expected. Of note, a recent study41 did not identify an excess of de novo mutations in RAI1 in more than 7000 individuals with ID.
Large discordances between model-estimated and empirically measured effects on IQ are of particular interest. For example, the model underestimates the effects of 15q13.3 and 3q29 deletions. This underestimation could be attributable to genes with large effect sizes, although none have been clearly identified in these CNVs by previous studies.35,41 Alternatively, the association of these 2 CNVs with IQ might be overestimated in the literature because carriers of these deletions are mostly referred to the clinic for behavioral or neurologic symptoms (eg, epilepsy). Although enrichment of the 17p12/hereditary neuropathy with liability to pressure palsies deletion was not previously reported in a neurodevelopmental cohort,7 our model predicts an IQ loss of 6 points. This finding is consistent with the enrichment observed in the CHU-SJ neurodevelopmental cohort (odds ratio, 3.25; 14 cases per 16 586; P = .002) and previous studies42-44 reporting association with schizophrenia (odds ratio ranging from 1 to 5).
Our study quantifies the association between the effect size of deletions on IQ and the frequency at which they occur de novo. The probability of occurring de novo increases rapidly for deletions with small effect sizes on IQ (a few points), reaching a frequency of 100% for effect sizes of 30 points or greater. The model’s prediction has a concordance of 0.75 with the de novo frequency of 15 recurrent CNVs calculated using empirical data. It is likely that many de novo deletions, which confer significant risk for NDDs, may lie on a continuum between class 1 and class 2 variants.37 In fact, most deletions that affect IQ have effect sizes of less than 30 points, are present in general population cohorts, and would be classified as class 2 de novo variants, which incorrectly reflects the risk they confer for NDDs.
The predictive models presented in the study have several limitations. In particular, they are unable to attribute large effects to ID causal genes. This limitation likely arises because calibration was performed in the general population (with too few cases of ID) and because the models rely on haploinsufficiency scores that were not intended to provide granularity among genes with large effects. Indeed, the model attributes a maximal effect of 2.74 points of PIQ loss, whereas causal genes for ID are associated with IQ loss between 40 and 60 points.35,41 On the other hand, it is likely that our model properly estimated small effect sizes because it was developed and calibrated in the general population based on a set of CNVs that contain genes with milder effects.
The association of deletions with IQ can be modeled using haploinsufficiency scores based on a linear and additive assumption. Observations in the general population can estimate the effect sizes of recurrent pathogenic CNVs identified in the clinic. Results suggest that the frequency of de novo events can reliably estimate the effect size of a deletion on IQ. This method represents a new framework to study variants too rare to perform individual association studies and can be useful to estimate the cognitive effect of undocumented deletions (http://www.minds-genes.org/Site_EN/CNVsPredictionTools.html) identified in the neurodevelopmental clinic. Larger sample sizes and more refined models in cohorts, including individuals with IDs, are likely required to model the effects of duplications.
Accepted for Publication: January 8, 2018.
Corresponding Author: Sébastien Jacquemont, MD, Center Hospitalier Universitaire Sainte-Justine, 3175 Chemin de la Côte-Sainte-Catherine, Montréal, QC H3T 1C5, Canada (email@example.com).
Published Online: March 21, 2018. doi:10.1001/jamapsychiatry.2018.0039
Author Contributions: Drs Huguet and Schramm share first authorship. Drs Bourgeron and Jacquemont share last authorship. Dr Jacquemont had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Huguet, Schramm, Douard, Loth, Conrod, Greenwood, Paus, Bourgeron, Jacquemont.
Acquisition, analysis, or interpretation of data: All authors.
Drafting of the manuscript: Huguet, Schramm, Douard, Mathieu, Poline, Greenwood, Paus, Bourgeron, Jacquemont.
Critical revision of the manuscript for important intellectual content: Huguet, Schramm, Jiang, Labbe, Tihy, Mathonnet, Nizard, Lemyre, Loth, Toro, Schumann, Pausova, Conrod, Greenwood, Paus, Bourgeron, Jacquemont.
Statistical analysis: Huguet, Schramm, Douard, Jiang, Labbe, Poline, Greenwood, Bourgeron.
Obtained funding: Huguet, Schumann, Pausova, Conrod, Paus, Bourgeron, Jacquemont.
Administrative, technical, or material support: Schramm, Tihy, Lemyre, Mathieu, Toro, Schumann, Conrod, Paus, Bourgeron, Jacquemont.
Study supervision: Labbe, Pausova, Conrod, Greenwood, Paus, Bourgeron, Jacquemont.
Conflict of Interest Disclosures: None reported.
Funding/Support: This research was enabled by support provided by Calcul Quebec (http://www.calculquebec.ca) and Compute Canada (http://www.computecanada.ca). Dr Bourgeron is supported by the Institut Pasteur, the University Paris Diderot, Centre National de la Recherche Scientifique, and the Bettencourt-Schueller Foundation. Dr Jacquemont is a recipient of a Bursary Professor fellowship of the Swiss National Science Foundation, a Canada Research Chair in neurodevelopmental disorders, and a chair from the Jeanne et Jean Louis Levesque Foundation. Dr Huguet is supported by the Sainte-Justine Foundation, the Merit scholarship program for foreign students, and the Network of Applied Genetic Medicine fellowships. Dr Schramm is supported by the Institute for Data Valorization fellowship. Dr Loth is supported by European Autism Interventions, which receives support from the Innovative Medicines Initiative Joint Undertaking under grant agreement 115300, the resources of which are composed of financial contributions from grant FP7/2007-2013 from the European Union's Seventh Framework Programme, the European Federation of Pharmaceutical Industries and Associations companies’ in-kind contributions, and Autism Speaks. This work is supported by a grant from the Brain Canada Multi Investigator initiative (Dr Jacquemont). The Canadian Institutes of Health Research and the Heart and Stroke Foundation of Canada fund the Saguenay Youth Study. Funding for the project was provided by the Wellcome Trust.
Role of the Funder/Sponsor: The funder had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Group Members: The members of the IMAGEN Consortium are as follows: Tobias Banaschewski, MD, PhD, Department of Child and Adolescent Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany; Gareth Barker, PhD, Centre for Neuroimaging Sciences, Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London, England; Arun L. W. Bokde, PhD, Discipline of Psychiatry, School of Medicine and Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, Ireland; Uli Bromberg, Dipl-Psych, University Medical Centre Hamburg-Eppendorf, Hamburg, Germany; Christian Büchel, MD, University Medical Centre Hamburg-Eppendorf, Hamburg, Germany; Erin Burke Quinlan, PhD, Medical Research Council, Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology & Neuroscience, King’s College London, England; Sylvane Desrivières, PhD, Medical Research Council, Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology & Neuroscience, King’s College London, England; Herta Flor, PhD, Department of Cognitive and Clinical Neuroscience, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, and Department of Psychology, School of Social Sciences, University of Mannheim, Mannheim, Germany; Vincent Frouin, PhD, NeuroSpin, CEA, Université Paris-Saclay, Gif-sur-Yvette, France; Hugh Garavan, PhD, Departments of Psychiatry and Psychology, University of Vermont, Burlington; Penny Gowland, PhD, Sir Peter Mansfield Imaging Centre School of Physics and Astronomy, University of Nottingham, University Park, Nottingham, United Kingdom; Andreas Heinz, MD, PhD, Department of Psychiatry and Psychotherapy, Charité, Universitätsmedizin Berlin, Berlin, Germany; Bernd Ittermann, PhD, Physikalisch-Technische Bundesanstalt, Braunschweig and Berlin, Germany; Jean-Luc Martinot, MD, PhD, Institut National de la Santé et de la Recherche Médicale, INSERM Unit 1000 Neuroimaging & Psychiatry, University Paris Sud, University Paris Descartes - Sorbonne Paris Cité and Maison de Solenn, Paris, France; Marie-Laure Paillère Martinot, MD, PhD, Maison de Solenn, Cochin Hospital, Paris, France; Eric Artiges, MD, PhD, Institut National de la Santé et de la Recherche Médicale, INSERM Unit 1000 Neuroimaging & Psychiatry, University Paris Sud, University Paris Descartes-Sorbonne Paris Cité and Psychiatry Department, Orsay Hospital, Orsay, France; Herve Lemaitre, PhD, Institut National de la Santé et de la Recherche Médicale, INSERM Unit 1000 Neuroimaging & Psychiatry, Faculté de Médecine, Université Paris-Sud, Le Kremlin-Bicêtre, and Université Paris Descartes, Sorbonne Paris Cité, Paris, France; Frauke Nees, PhD, Department of Child and Adolescent Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, and Department of Cognitive and Clinical Neuroscience, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany; Dimitri Papadopoulos Orfanos, PhD, NeuroSpin, CEA, Université Paris-Saclay, Gif-sur-Yvette, France; Tomáš Paus, MD, PhD, Rotman Research Institute, Baycrest and Departments of Psychology and Psychiatry, University of Toronto, Toronto, Ontario, Canada; Luise Poustka, MD, Department of Child and Adolescent Psychiatry and Psychotherapy, University Medical Centre Göttingen, Göttingen, Germany, and Clinic for Child and Adolescent Psychiatry, Medical University of Vienna, Vienna, Austria; Sarah Hohmann, MD, Department of Child and Adolescent Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany; Sabina Millenet, Dipl-Psych, Department of Child and Adolescent Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany; Juliane H. Fröhner, Dipl-Psych, Department of Psychiatry and Neuroimaging Center, Technische Universität Dresden, Dresden, Germany; Michael N. Smolka, MD, Department of Psychiatry and Neuroimaging Center, Technische Universität Dresden, Dresden, Germany; Henrik Walter, MD, PhD, Department of Psychiatry and Psychotherapy, Charité, Universitätsmedizin Berlin, Berlin, Germany; Robert Whelan, PhD, School of Psychology and Global Brain Health Institute, Trinity College Dublin, Dublin, Ireland; and Gunter Schumann, MD, Medical Research Council–Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London, England.
Additional Contributions: Julien Buratti (Institut Pasteur) and Vincent Frouin, PhD (NeuroSpin), acquired data for IMAGEN. Manon Bernard, BSc (database architect, The Hospital for Sick Children), and Helene Simard, MA, and her team of research assistants (Cégep de Jonquière) acquired data for the Saguenay Youth Study. Maude Auger, PgD (Center Hospitalier Universitaire Sainte-Justine), provided website development. This study makes use of data generated by the DECIPHER Consortium. A full list of centers that contributed to the generation of the data is available from http://decipher.sanger.ac.uk and via email from firstname.lastname@example.org. Dr Paus is the Tanenbaum Chair in Population Neuroscience at the Rotman Research Institute, University of Toronto, and the Dr John and Consuela Phelan Scholar at Child Mind Institute, New York.