Figure 1. Promoter-associated trimethylated H3K4 levels in prefrontal neurons show very high genome-wide correlation between subjects. Heatmaps show Pearson correlation coefficients (r) for sample-to-sample comparison (controls C1-14 and autism cases A1-16) of raw tag counts within promoters. The ages of the samples (in years) are listed on the x-axis, with males in blue and females in pink. Note that the 3 youngest control brains are less strongly correlated with older brains, while brains older than 1 year, including all autism and noninfant control brains, mostly show very high between-sample correlations (r > 0.97).
Figure 2. Clustered trimethylated H3K4 (H3K4me3) profiles within a ±2-kilobase (kb) window around all RefSeq transcription start sites (TSSs) for control C7 (A) and autism case A5 (C). Each row is a ±2-kb window, with the TSS located in the center of the window and the direction of transcription toward the right. The strength of the H3K4me3 signal is shown on a color scale of red (strong signal) to white (intermediate signal) to blue (weak signal). There is a streak of blue immediately to the left of the TSSs for clusters 1 through 6, corresponding to the nucleosome-free region immediately upstream of the TSS. Note the extensive spreading of H3K4me3 signal in case A5 compared with control C7 for clusters 1 through 6 but not cluster 7. B, Line graphs show average H3K4me3 profiles for TSSs in each cluster in each sample. Controls are in blue (age <2 years) and gray (age >2 years), the 4 spreading autism samples are in red, and the remaining 12 autism samples are in black. Peaks in the line graphs indicate well-positioned nucleosomes. The H3K4me3 levels are normalized by the total number of reads in the ±2-kb window.
Figure 3. Trimethylated H3K4 (H3K4me3) alterations specific to autism neurons. The University of California, Santa Cruz, Genome Browser tracks illustrate 10 representative autism susceptibility genes with altered H3K4me3 profiles specifically in prefrontal cortex neurons from subsets of autism cases (blue arrows). Orange tracks correspond to neuronal (NeuN+) nuclei chromatin in autism cases; green tracks, NeuN+ chromatin of controls; and blue tracks, nonneuronal (NeuN−) nuclei chromatin of the 4 spreading autism cases and 2 age-matched controls. The vertical axis of each graph indicates the number of sequence tags from anti-H3K4me3 chromatin immunoprecipitation, with a scale bar for 10 ppm (with parts being sequence tags). Note that except for SEMA5A, H3K4me3 spreading is observed in NeuN+ chromatin but not NeuN− chromatin.
Figure 4. Early infancy is associated with transition of trimethylated H3K4 landscapes in prefrontal cortex neurons. The 30 neuronal trimethylated histone H3K4 epigenomes (green indicates control subjects aged 0.5-1.3 years; blue, control subjects aged 2.8-69 years; and red, autistic subjects aged 2-70 years) are positioned in the space defined by the first 3 principal components (PC1, PC2, and PC3) as defined in the text. Note that the 4 youngest control brains (from controls C1-C4) together segregate from older controls and all autism samples.
Figure 5. Gene-specific trimethylated H3K4 (H3K4me3) alterations in subsets of autistic subjects. The University of California, Santa Cruz, Genome Browser tracks show H3K4me3 profiles at VGF and surrounding genes (A) and IFI6 (B) in 16 autistic subjects and 10 controls. Note decreased (for VGF) and excess (for IFI6) H3K4me3 signals in different sets of autism cases (blue arrows). The vertical axis of each graph indicates the number of sequence tags from anti-H3K4me3 chromatin immunoprecipitation, with a scale bar for 10 ppm. Scatterplots show correlation between H3K4me3 levels and relative messenger RNA (mRNA) levels for VGF (C) and IFI6 (D) genes, normalized to the promoter H3K4me3 level and mRNA level of housekeeping gene RPLP0. A subset of autism cases shows robust loss of VGF expression and H3K4me3 methylation at the VGF promoter (C), and another set of autism cases shows robust increase of IFI6 expression together with higher levels of H3K4me3 at the IFI6 promoter (D). ChIP-Seq indicates chromatin immunoprecipitation followed by deep sequencing; qPCR, quantitative polymerase chain reaction.
Figure 6. Trimethylated H3K4 changes at the 1p36 susceptibility locus. A schematic presentation shows a 28–megabase (Mb) region in 1p36, a subtelomeric portion on the short arm of chromosome 1, at risk for a deletion syndrome carrying high risk for neurodevelopmental disease, including autism (see the text). A subset of annotated genes in this portion are associated with altered trimethylated H3K4 levels in subsets of autism cases in this study as indicated. A deletion in this 1p36 region was reported for cases A14 and A16 (in bold) (supplemental Table S4, http://zlab.umassmed.edu/zlab/publications/ShulhaAGP2011.html). No copy number variations in this region were detected for autism cases A1, A9, A10, A12, A13, or A15. For the remaining 8 cases, no copy number variation data are available.
Shulha HP, Cheung I, Whittle C, Wang J, Virgil D, Lin CL, Guo Y, Lessard A, Akbarian S, Weng Z. Epigenetic Signatures of Autism: Trimethylated H3K4 Landscapes in Prefrontal Neurons. Arch Gen Psychiatry. 2012;69(3):314-324. doi:10.1001/archgenpsychiatry.2011.151
Author Affiliations: Program in Bioinformatics and Integrative Biology (Drs Shulha, Wang, and Weng) and Brudnick Neuropsychiatric Research Institute (Drs Cheung, Guo, and Akbarian, Mss Whittle and Lin, and Mr Virgil), University of Massachusetts Medical School, Worcester; and Maryland Psychiatric Research Center, University of Maryland, Baltimore (Dr Lessard).
Context Neuronal dysfunction in cerebral cortex and other brain regions could contribute to the cognitive and behavioral defects in autism.
Objective To characterize epigenetic signatures of autism in prefrontal cortex neurons.
Design We performed fluorescence-activated sorting and separation of neuronal and nonneuronal nuclei from postmortem prefrontal cortex, digested the chromatin with micrococcal nuclease, and deeply sequenced the DNA from the mononucleosomes with trimethylated H3K4 (H3K4me3), a histone mark associated with transcriptional regulation. Approximately 15 billion base pairs of H3K4me3-enriched sequences were collected from 32 brains.
Setting Academic medical center.
Participants A total of 16 subjects diagnosed as having autism and 16 control subjects ranging in age from 0.5 to 70 years.
Main Outcome Measures Identification of genomic loci showing autism-associated H3K4me3 changes in prefrontal cortex neurons.
Results Subjects with autism showed no evidence for generalized disruption of the developmentally regulated remodeling of the H3K4me3 landscape that defines normal prefrontal cortex neurons in early infancy. However, excess spreading of H3K4me3 from the transcription start sites into downstream gene bodies and upstream promoters was observed specifically in neuronal chromatin from 4 of 16 autism cases but not in controls. Variable subsets of autism cases exhibit altered H3K4me3 peaks at numerous genes regulating neuronal connectivity, social behaviors, and cognition, often in conjunction with altered expression of the corresponding transcripts. Autism-associated H3K4me3 peaks were significantly enriched in genes and loci implicated in neurodevelopmental diseases.
Conclusions Prefrontal cortex neurons from subjects with autism show changes in chromatin structures at hundreds of loci genome-wide, revealing considerable overlap between genetic and epigenetic risk maps of developmental brain disorders.
Autism spectrum disorders comprise a group of complex and etiologically heterogeneous illnesses. The genetic risk architecture remains unknown for most patients, and fewer than 10% of subjects on the autism spectrum harbor rare structural DNA variations and mutations with strong penetrance. Neurons residing in the prefrontal cortex (PFC) and other cortical association areas in autistic subjects are affected by subtle defects in connectivity patterns, cytoarchitecture, and other structural alterations.1-3 While the role of such disordered neural circuitry in the context of neurodevelopmental disease is well understood, including impairment of higher cognition and social communication, very little is known about the molecular pathology of PFC neurons in the autistic brain.
Epigenetic dysregulation of DNA methylation and histone modifications could play a prominent role in the pathophysiology of autism and related disease.4-11 This hypothesis is in part based on the link between autism and deleterious mutations in genes considered of pivotal importance for regulation of chromatin structure and function, such as MECP27 and other methyl-CpG-binding proteins,12,13 the histone deacetylase HDAC4,14 the histone H3 lysine 9 (H3K9)–specific methyltransferase EHMT1/KMT1D/GLP,15 and the H3K4 demethylase JARID1C/KDM5C/SMCX.16 In particular, the trimethylated form of H3K4 (H3K4me3) is primarily located at transcription start sites (TSSs) and linked to the serine 5-phosphorylated, initiation form of RNA polymerase II, thereby providing a docking site at the 5′ end of genes for chromatin remodeling complexes that mostly facilitate (but in some cases repress) transcription.17,18 Epigenetic fine-tuning of H3K4me3 appears to be particularly important for neuronal health. For example, neuronal differentiation is dependent on H3K4 trimethylation mediated by the mixed-lineage leukemia (MLL) methyltransferase and transcriptional activation of RE-1 silencing transcription factor (REST)–sensitive genes.19,20 Furthermore, hippocampal learning and memory require MLL1-mediated H3K4 trimethylation at growth- and plasticity-regulating genes.21,22
Recently, we presented the first neuronal-specific epigenomes from human brain and provided evidence that nuclei of PFC neurons from normal infants (aged <1 year) showed an excess of H3K4me3 at hundreds of loci compared with older brains.23 This large-scale remodeling of the H3K4me3 landscape in chromatin of young neurons speaks for epigenetic vulnerability of the immature PFC, an intriguing hypothesis when viewed from the neurodevelopmental perspective of autism. Therefore, the aim of this study was to compare, for the first time to our knowledge, the H3K4me3 epigenomes of the PFC neurons of autistic individuals with a panel of controls across a wide age range from infancy to 70 years. Our findings indicate that a subset of autistic individuals are affected by loss or excess of H3K4me3 at hundreds of loci, in conjunction with dysregulated expression of transcripts implicated in neuronal communication and social and other higher-order behaviors. We propose a new disease model for PFC neurons that involves changes in the H3K4me3 landscape in hundreds of loci. These changes are highly variable between different patients and point to a complex interaction between the genetic and epigenetic risk architectures that affect autism spectrum disorders.
The eAppendix has a detailed description of the methods and techniques used. Sorted neuronal (NeuN+) and nonneuronal (NeuN−) nuclei were processed for anti-H3K4me3 chromatin immunoprecipitation (ChIP), and resulting DNA libraries were deep sequenced with an Illumina GAII platform (Illumina, Inc, San Diego, California). Quantification of uniquely mappable sequence tags was based on MACS (Model-Based Analysis of ChIP-Seq) version 1.3.5 software,24 and signals were expressed as parts (tags) per million. Additional bioinformatics approaches to analyze disease-associated changes are provided in the eAppendix.
We previously developed a technique of fluorescence-activated cell sorting of nuclei according to immunoreactivity for the NeuN antigen.25 This allows us to compare changes in the histone methylation landscapes of terminally differentiated neurons across the lifespan, including developmental periods and disease conditions such as autism,26,27 without being affected by shifts in the neuron-glia ratio, which could greatly confound interpretation of conventional chromatin studies from tissue homogenates. We then digested chromatin with micrococcal nuclease, extracted the mononucleosome fraction, and performed ChIP with an antibody against H3K4me3 followed by deep sequencing. We obtained genome-wide maps of the H3K4me3 mark for NeuN+ nuclei from the PFC of 16 subjects diagnosed as having autism spectrum disorders (aged 2-60 years; mean age, 17.6 years; 4 females; supplemental Table S1, http://zlab.umassmed.edu/zlab/publications/ShulhaAGP2011.html) and 10 age-matched control subjects (aged 2.8-69 years; mean age, 19.7 years; 3 females). In addition, we included 4 control subjects younger than 2 years (1 female) to investigate whether H3K4me3 profiles of autistic subjects were more similar to those of normal infants than to those of older control subjects. Nine of the 14 control samples were generated in our earlier study.23 We also sequenced an input library from the neuronal nuclei of 1 control subject, following all the steps of ChIP with deep sequencing but omitting the anti-H3K4me3 antibody. Altogether, 333 million 36-nucleotide reads were obtained for 31 neuronal samples, among which 283 million reads mapped to unique locations in the reference human genome and an additional 32 million reads mapped to multiple locations in the genome. Furthermore, we performed ChIP followed by deep sequencing on NeuN− chromatin of 4 of the autism cases and 2 additional control subjects (supplemental Table S2).
Consistent with the knowledge that H3K4me3 is a histone mark sharply enriched around the 5′ end of genes, a mean (SD) of 59% (9%) of reads in the 14 control samples mapped within RefSeq promoters (http://www.ncbi.nlm.nih.gov/RefSeq/). Similarly, a mean (SD) of 57% (11%) of reads in the autism samples mapped in proximal promoters (supplemental Table S1). There is no statistically significant difference in the percentage of tags mapping to promoters between autism and control samples, indicating that autistic patients do not show global displacement of H3K4me3 away from promoters. In sharp contrast, fewer than 4% of reads from the input library mapped to promoters. Postmortem confounds (tissue pH and postmortem interval) do not show significant correlations with the percentage of TSS-proximal tags (correlation coefficients, −0.12 and −0.19, respectively; t tests, 2-tailed P = .55 and .31, respectively).
To investigate the global H3K4me3 occupancy in promoters, we computed the correlation coefficient for the numbers of tags in the entire set of 21 084 annotated promoters between any 2 samples. Figure 1 shows that all 30 samples are highly similar, with correlations ranging from 0.93 to 0.99. The autism samples do not show appreciable difference from controls: the mean (SD) correlation between 16 autism samples and 10 controls of the same age range is 0.969 (0.011), while the mean (SD) correlations are 0.971 (0.012) within autism samples and 0.974 (0.008) within controls. The most distinct samples are from the 3 youngest infants C1-C3 (aged <1 year), indicated by the dark shades of the first 3 columns in Figure 1. They are equally distinct from autism samples and older control subjects; the mean (SD) correlation coefficients are 0.951 (0.009) and 0.954 (0.009), respectively. Our results suggest that autistic patients do not exhibit global difference in H3K4me3 occupancy at annotated promoters. Furthermore, all autism samples included in this study appear to have an age-appropriate promoter H3K4me3 landscape, more similar to control subjects older than 2 years than to control infants.
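The sample-to-sample comparison described above can be illustrated with a minimal sketch. It assumes that each sample is represented by a list of raw tag counts, one per promoter and in the same promoter order; the function names and the toy counts are hypothetical and are not taken from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length count vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(counts):
    """counts: dict mapping sample name -> per-promoter tag counts."""
    names = sorted(counts)
    return {(a, b): pearson(counts[a], counts[b]) for a in names for b in names}

# Hypothetical toy counts for 5 promoters (the study used 21 084 promoters).
toy_counts = {
    "C1": [10, 200, 35, 0, 80],
    "C8": [12, 180, 40, 1, 95],
    "A5": [11, 190, 38, 2, 90],
}
r = correlation_matrix(toy_counts)
for pair in [("C1", "C8"), ("C1", "A5"), ("C8", "A5")]:
    print(pair, round(r[pair], 3))
```

Group summaries such as the mean (SD) correlation within controls would then be computed over the off-diagonal entries for the relevant sample pairs.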
As reported previously, the nucleosomes around TSSs are highly enriched in H3K4me3, with the nucleosome immediately downstream from the TSS (the +1 nucleosome) showing the strongest signal.28 Indeed, we observed this phenomenon and further noticed that some TSSs have strong H3K4me3 signals in multiple downstream nucleosomes. To identify these TSSs systematically, we performed K-means clustering (K = 7) on all TSSs of a control sample (C7) according to the H3K4me3 profiles in a ±2-kilobase (kb) window centered on the TSS (clustering with other control samples led to highly similar results). Figure 2A shows the heatmap of the resulting clusters sorted in descending order of average H3K4me3 signal, so that cluster 1 has the strongest signal and cluster 7 the weakest.
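The clustering step can be sketched as follows. This is an illustration only: the text does not state how the K-means procedure was initialized or which distance was used, so the deterministic farthest-point initialization and squared Euclidean distance below are assumptions, as are the function names and the toy profiles (3 clusters of 5-bin windows rather than K = 7 on full ±2-kb windows).

```python
def sqdist(p, q):
    """Squared Euclidean distance between two binned profiles."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def farthest_point_init(profiles, k):
    """Assumed initialization: start from the strongest profile, then
    repeatedly add the profile farthest from its nearest chosen center."""
    centers = [list(max(profiles, key=sum))]
    while len(centers) < k:
        nxt = max(profiles, key=lambda p: min(sqdist(p, c) for c in centers))
        centers.append(list(nxt))
    return centers

def kmeans(profiles, k, max_iter=100):
    """Plain Lloyd iterations; returns (labels, centers)."""
    centers = farthest_point_init(profiles, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sqdist(p, centers[c]))
                      for p in profiles]
        for c in range(k):
            members = [p for p, lab in zip(profiles, new_labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centers

def rank_by_signal(labels, centers):
    """Renumber clusters so that cluster 1 has the highest average signal."""
    order = sorted(range(len(centers)),
                   key=lambda c: -sum(centers[c]) / len(centers[c]))
    rank = {c: i + 1 for i, c in enumerate(order)}
    return [rank[lab] for lab in labels]

# Hypothetical 5-bin profiles around 6 TSSs: strong, intermediate, weak.
toy_profiles = [
    [1, 8, 9, 8, 7], [1, 9, 9, 7, 6],
    [0, 4, 5, 2, 1], [0, 5, 4, 2, 1],
    [0, 0, 1, 0, 0], [0, 1, 0, 0, 0],
]
labels, centers = kmeans(toy_profiles, k=3)
ranked = rank_by_signal(labels, centers)
print(ranked)
```

Renumbering clusters by average signal mirrors the ordering used for the heatmap, where cluster 1 carries the strongest signal.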
We analyzed whether genes in each cluster were enriched in Gene Ontology terms. Significantly enriched Gene Ontology terms are shown in supplemental Table S3. Cluster 1 contains genes with the strongest average H3K4me3 signal. In addition, genes in cluster 1 exhibit the largest number of H3K4me3-enriched nucleosomes compared with genes in clusters 2 through 6. Six hundred two genes in cluster 1 were among the 1721 genes we previously defined by contrasting the levels of H3K4me3 in PFC neurons and in lymphocytes,23 and the overlap of the 2 sets of genes is highly significant (χ² test, P < 2.2 × 10^−16). Accordingly, genes in cluster 1 are highly enriched in neuronal functions. The most enriched Gene Ontology terms of these genes include nervous system development, generation of neurons, synapse, etc (false discovery rate < 3.8 × 10^−16). In contrast, clusters 2 through 6 are enriched in various housekeeping Gene Ontology terms, while cluster 7, enriched in immune response, etc, showed only very low H3K4me3 levels in PFC neurons.
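A χ² test of gene-set overlap of this kind can be sketched as a 2 × 2 contingency test. Only the overlap (602) and the size of the previously defined set (1721) come from the text; the size of cluster 1 and the size of the gene universe are not reported here, so the values used below are hypothetical placeholders and the output is not a reproduction of the published P value. The sketch also assumes a Pearson χ² statistic without continuity correction.

```python
import math

def overlap_table(overlap, size_a, size_b, universe):
    """2 x 2 table (a, b, c, d) for membership in set A versus set B."""
    a = overlap
    b = size_a - overlap          # in A only
    c = size_b - overlap          # in B only
    d = universe - size_a - size_b + overlap  # in neither
    return a, b, c, d

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square without continuity correction; returns (chi2, p)."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("degenerate table")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

CLUSTER1_SIZE = 3000   # hypothetical, not reported in the text
UNIVERSE_SIZE = 21084  # hypothetical; borrowed from the promoter count
table = overlap_table(602, CLUSTER1_SIZE, 1721, UNIVERSE_SIZE)
chi2, p = chi_square_2x2(*table)
print(table, round(chi2, 1), p)
```

Because a χ² variable with 1 degree of freedom is the square of a standard normal variable, the tail probability has a closed form and needs no statistics library.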
We plotted the H3K4me3 heatmaps of the other 29 samples with TSSs in the same order as in Figure 2A. To compare H3K4me3 profiles across individuals, we generated line graphs, 1 line per individual, representing average H3K4me3 profiles of all genes in each cluster (Figure 2B). Four autism samples (A5, A8, A9, and A13) stood out as showing H3K4me3 profiles spreading away from the TSSs and encroaching into gene bodies and upstream sequences. The heatmap of sample A5 is shown in Figure 2C, and the remaining 28 samples are shown in supplemental Figure S1. No clinical variables appeared to be common to these 4 cases (supplemental Table S1). Figure 2B (red lines) clearly shows that the average H3K4me3 profiles from these 4 samples spread and plateau beneath maximal peak heights of all controls and the remaining autism samples. These spreading profiles are seen for genes in clusters 1 through 6 but not for those in cluster 7 (Figure 2C) and include multiple susceptibility genes previously implicated in autism (Figure 3 and supplemental Table S4).
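The line graphs can be thought of as cluster-level average profiles scaled by the total signal in the window. The sketch below shows one possible reading of that normalization (sum the tags per bin across all genes in a cluster, then divide by the total number of tags in those windows); the exact procedure is described in the eAppendix and may differ. The function name and the toy counts are hypothetical.

```python
def average_normalized_profile(windows):
    """windows: per-gene tag counts in identical bins around the TSS.
    Returns the per-bin share of all tags in the cluster's windows."""
    if not windows:
        raise ValueError("cluster has no genes")
    summed = [sum(col) for col in zip(*windows)]
    total = sum(summed)
    if total == 0:
        raise ValueError("no tags in any window")
    return [s / total for s in summed]

# Hypothetical 5-bin windows for two genes in one cluster.
control_like = [[0, 1, 8, 1, 0], [0, 2, 7, 1, 0]]    # sharp peak at the TSS
spreading_like = [[2, 2, 3, 2, 1], [1, 3, 3, 2, 1]]  # signal spread out

control_profile = average_normalized_profile(control_like)
spreading_profile = average_normalized_profile(spreading_like)
print(max(control_profile), max(spreading_profile))
```

Under this normalization every profile sums to 1, so a sample whose signal spreads across the window necessarily has a lower peak, which is consistent with the spreading profiles plateauing beneath the peak heights of the other samples.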
To evaluate the significance of spreading, we used cluster 1 as the representative and tested the average H3K4me3 levels at 4 positions sampled around the H3K4me3 peak: −2 kb, 0 kb, +1 kb, and +3 kb from the TSS. There was a significant difference between the 4 spreading autism cases and the remaining 12 cases (t test, 2-tailed P = 4.0 × 10^−3, 9.5 × 10^−4, 9.5 × 10^−6, and .01) and all control cases (P = 4.3 × 10^−3, 5.0 × 10^−3, 8.4 × 10^−5, and .02) but not between the 12 nonspreading cases and controls (P > .12). Spreading of the H3K4me3 signal in cases A5, A8, A9, and A13 was not due to an unbalanced number of tags mapping to the 2 genomic strands, because a mean (SD) of 49.97% (0.02%) of genome mapping reads map to the Watson (+) strand in all cases and controls (supplemental Table S5).
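A group comparison of this kind can be sketched with a two-sample t test. The text does not specify whether variances were pooled, so the pooled-variance (Student) form is an assumption here, and the per-sample levels below are hypothetical values chosen only to show the mechanics for 4 versus 12 samples. The P value is obtained by numerically integrating the t density, which keeps the example free of external libraries.

```python
import math

def pooled_t(a, b):
    """Student's two-sample t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(a), len(b)
    m1, m2 = sum(a) / n1, sum(b) / n2
    ss1 = sum((x - m1) ** 2 for x in a)
    ss2 = sum((x - m2) ** 2 for x in b)
    df = n1 + n2 - 2
    se = math.sqrt((ss1 + ss2) / df * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("both groups are constant")
    return (m1 - m2) / se, df

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed P value via Simpson integration of the density on [0, |t|].
    Adequate for illustration; very small P values lose relative precision."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3)

# Hypothetical average cluster 1 levels at one position (arbitrary units).
spreading = [0.90, 1.10, 1.00, 1.20]
nonspreading = [0.40, 0.50, 0.45, 0.55, 0.50, 0.60,
                0.35, 0.50, 0.45, 0.50, 0.55, 0.40]
t_stat, df = pooled_t(spreading, nonspreading)
print(round(t_stat, 2), df, two_tailed_p(t_stat, df))
```

The same function would be applied separately at each of the 4 sampled positions and for each pair of groups.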
Spreading can lead to additional H3K4me3 peaks (genomic regions enriched in H3K4me3 are called H3K4me3 peaks). For example, case A5 shows an additional peak near the NRCAM gene, which is absent from all other samples (Figure 3). We performed ChIP–quantitative polymerase chain reaction on this peak for case A5 and 3 controls (C8, C10, and C12), using the housekeeping gene B2M as control. Indeed, we detected an elevated H3K4me3 signal in case A5: log10(NRCAM/B2M) = −3.67, compared with controls C8 (−4.54), C10 (−3.91), and C12 (−4.29). Excess spreading of H3K4me3 was further verified in cases A5 and A13 by probing AHCYL1 TSSs on chromosome 1 with ChIP–quantitative polymerase chain reaction (supplemental Figure S2).
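Because these values are reported on a log10 scale, the size of the difference is easier to read after conversion to a fold change. The short calculation below uses only the 4 values given above; the variable and function names are illustrative.

```python
# log10(NRCAM/B2M) values reported in the text.
case_a5 = -3.67
controls = {"C8": -4.54, "C10": -3.91, "C12": -4.29}

def fold_over(case_log10, control_log10):
    """Fold difference implied by two log10 ratios."""
    return 10 ** (case_log10 - control_log10)

folds = {name: fold_over(case_a5, value) for name, value in controls.items()}
for name, fold in folds.items():
    print(f"A5 versus {name}: {fold:.1f}-fold")
```

The NRCAM signal in case A5 is therefore between roughly 2-fold and 7-fold higher than in the 3 controls tested.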
To test whether the spreading was a general phenomenon affecting all PFC cells in the 4 autism cases, we conducted additional experiments of H3K4me3 ChIP with deep sequencing on NeuN− nuclei of cases A5, A8, A9, and A13 and 2 additional controls (supplemental Table S2) chosen to match the ages of the 4 spreading autism cases. The NeuN− nuclei are primarily from nonneuronal cells, including glia, microglia, and endothelium. The H3K4me3 heatmaps of these 6 samples with TSSs in the same order as in Figure 2A, as well as the line graphs of the 7 clusters as defined in Figure 2, are plotted in supplemental Figure S3. The H3K4me3 profiles of the NeuN− chromatin from cases A5, A8, A9, and A13 are clustered around those of the 2 control samples and do not show any spreading behavior (supplemental Figure S3). We conclude that neuronal chromatin from a subset of autism cases exhibits excessive spreading of the H3K4me3 mark into nucleosomes positioned farther away from the TSS.
Having analyzed H3K4me3 occupancies (Figure 1) and profiles (Figure 2) at annotated promoters, we next asked whether the autism samples differ from controls in nonpromoter regions. We performed unsupervised learning using H3K4me3-based fingerprints computed as follows. For each sample, we determined the total number of genomic regions (not restricted to promoters) enriched in H3K4me3 compared with another sample as the background. Such regions are called H3K4me3 peaks (details in the eAppendix). We used all 30 samples (16 autism samples and 14 control samples); thus, the fingerprint of each sample is a vector of 30 numbers, each being the number of H3K4me3 peaks in the sample using another sample as the background.
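The construction of these fingerprints can be sketched as a matrix in which row i holds the number of peaks called in sample i against each sample as background. Peak calling in the study is described in the eAppendix; the `toy_peak_count` function below is only a stand-in (a simple fold threshold on binned counts with a pseudocount) so that the example runs, and all names and values are hypothetical.

```python
def toy_peak_count(sample, background, fold=2.0, pseudo=1.0):
    """Stand-in for a real peak caller: counts bins where the sample
    exceeds the background by at least `fold` after adding a pseudocount."""
    return sum(1 for s, b in zip(sample, background)
               if (s + pseudo) / (b + pseudo) >= fold)

def fingerprints(signals, count_peaks=toy_peak_count):
    """signals: dict sample name -> binned genome-wide signal.
    Returns dict sample name -> vector of peak counts, one per background."""
    names = list(signals)
    return {a: [count_peaks(signals[a], signals[b]) for b in names]
            for a in names}

# Hypothetical binned signal for 3 samples (the study used 30).
toy_signals = {
    "C1": [10, 0, 5, 40],
    "C8": [10, 0, 5, 3],
    "A5": [9, 8, 5, 4],
}
fp = fingerprints(toy_signals)
for name, vector in fp.items():
    print(name, vector)
```

In this layout a sample compared with itself contributes 0 peaks, so each fingerprint has one entry per sample, matching the vector length of 30 described above.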
We performed principal components analysis on the fingerprints of the 30 samples. Principal components analysis is a mathematical method that reduces the dimensionality of data by identifying directions, called principal components, that maximize the variation in the data. Samples can then be plotted along these principal components to visually assess whether they form clusters. Principal components analysis is an unsupervised learning method, unaware of sample identities (autism or control). Figure 4 shows our 30 samples plotted against the first 3 principal components. Strikingly, the 4 youngest control subjects, C1 through C4, ranging in age from 0.5 to 1.3 years, are located at one end of the graph, far from the space densely occupied by the older control subjects and most autism cases. The next 3 youngest control subjects (C5-C7) and the youngest autistic subject (A1; aged 2 years) bridge the space that separates infants from the remaining subjects. We can draw 2 conclusions from these results. First, the H3K4me3 landscape in PFC neurons undergoes significant remodeling during at least the first 12 months after birth, as previously reported in a smaller cohort.23 Second, no autistic subjects of our study showed evidence for developmental arrest in this preprogrammed remodeling of PFC neuron chromatin.
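For readers unfamiliar with the method, a minimal principal components analysis can be sketched as follows. The sketch assumes that each row is one sample's fingerprint vector, and it finds components by power iteration with deflation on the covariance matrix, which is a simple choice made for illustration and not necessarily the numerical method used in the study. The toy data are hypothetical.

```python
import math
import random

def center_columns(rows):
    """Subtract the column mean from every entry."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of already centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_eigenpair(matrix, iterations=500, seed=0):
    """Leading eigenvalue and eigenvector by power iteration.
    Convergence is slow when the top eigenvalues are nearly equal."""
    d = len(matrix)
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # no variance left; direction is arbitrary
            return 0.0, v
        v = [x / norm for x in w]
    value = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(d))
                for i in range(d))
    return value, v

def pca(rows, n_components=3):
    """Returns (scores, components, variances) for the leading components."""
    centered = center_columns(rows)
    cov = covariance(centered)
    d = len(cov)
    components, variances = [], []
    for _ in range(n_components):
        value, vec = top_eigenpair(cov)
        components.append(vec)
        variances.append(value)
        # Deflation: remove the variance explained by this component.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(d)]
               for i in range(d)]
    scores = [[sum(a * b for a, b in zip(row, comp)) for comp in components]
              for row in centered]
    return scores, components, variances

# Hypothetical 2-dimensional data lying exactly on a line.
toy_rows = [[1, 2], [2, 4], [3, 6], [4, 8]]
scores, components, variances = pca(toy_rows, n_components=2)
print([round(v, 4) for v in variances])
```

Each sample's scores on the first 3 components give its coordinates in a plot such as Figure 4. The sign of a component is arbitrary, so only relative positions are meaningful.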
Autism is a brain disorder with considerable disease heterogeneity in terms of clinical presentation, molecular and cellular pathology, and genetics.29 Therefore, we asked whether H3K4me3 alterations, if present in our clinical cohort, show interindividual variability and affect multiple cases. To explore this, we screened the H3K4me3 data sets from the 16 autistic subjects for peaks that consistently showed greater than a 2-fold upregulation or downregulation in comparison with each of the 10 controls. We thus identified 503 loci as increased and 208 loci as decreased in a subset of autistic subjects (supplemental Table S4, Figure S4, and Figure S5). We call these autism-up and autism-down H3K4me3 loci, respectively, and 330 autism-up loci and 139 autism-down loci overlap with promoters. We then applied Poisson statistics; the significant peaks (P < .001 after Bonferroni multiple-testing correction) are included in supplemental Table S4. Most (but not all) of the autism-up loci were derived from the 4 cases with spreading H3K4me3 profiles (cases A5, A8, A9, and A13 described earlier); furthermore, cases A5 and A13 had the largest shares of the autism-down loci (supplemental Figure S4 and Figure S5). We further validated the autism-associated H3K4me3 changes of 2 more genes by quantitative polymerase chain reaction (supplemental Figure S2).
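The screening rule can be sketched as follows: a locus counts as altered in a given autism case only if its signal differs by more than 2-fold, in the same direction, from every one of the controls. The Poisson tail and Bonferroni functions are generic illustrations; how the expected count was defined in the study is described in the eAppendix, and the values below are hypothetical.

```python
import math

def classify_locus(case_signal, control_signals, fold=2.0):
    """Label a locus for one case relative to every control."""
    if all(case_signal > fold * c for c in control_signals):
        return "autism-up"
    if all(case_signal * fold < c for c in control_signals):
        return "autism-down"
    return None

def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), summing the lower tail directly."""
    term = math.exp(-lam)
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)
    return max(0.0, 1.0 - cdf)

def bonferroni_significant(p, n_tests, alpha=0.001):
    """True if p remains below alpha after Bonferroni correction."""
    return p * n_tests < alpha

# Hypothetical tag densities (ppm) at one locus in 10 controls.
toy_controls = [8, 9, 10, 7, 6, 9, 10, 8, 7, 9]
print(classify_locus(25, toy_controls))  # above 2x every control
print(classify_locus(2, toy_controls))   # below half of every control
print(classify_locus(15, toy_controls))  # not consistent across controls
```

Requiring the threshold to hold against each control individually, rather than against the control mean, keeps a single outlying control from producing a call.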
The H3K4me3 mark functions as a docking site for a diverse set of transcriptional regulators that could activate or repress gene expression. It has been shown that developmental or disease-associated changes in brain histone methylation are linked to alterations of the corresponding gene transcripts.30-32 Therefore, we asked whether autism cases with altered H3K4me3 levels also showed altered levels of the corresponding messenger RNA (mRNA). We chose 5 genes with decreased H3K4me3 over the TSSs in at least 2 autism cases from a larger set of genes proximal to autism-down H3K4me3 loci, which are implicated in neurodevelopment and cognition based on preclinical and/or clinical research (supplemental Table S4). The following 5 genes were included: (1) the synaptic protein ARC, which interacts with the ubiquitin ligase and autism/Angelman syndrome risk gene UBE3A to regulate synaptic AMPA glutamate receptor trafficking33; (2) complexin 1 (CPLX1), a protein involved in exocytosis and synaptic vesicle functions, which is essential for social behavior and motivational behaviors in mice and is expressed at altered levels in various neuropsychiatric diseases34; (3) the orphan nuclear receptor NR4A1 genetically linked to involuntary movement disorders35; (4) USPx, a dendritic protein implicated in neuronal and synaptic connectivity36; and (5) the neuropeptide VGF, which regulates neuronal plasticity and is expressed at decreased levels in postmortem brain of subjects diagnosed as having manic-depressive illness.37 Strikingly, all autistic subjects with greater than a 2-fold decrease in TSS-associated H3K4me3 peak density also exhibited some of the lowest transcript levels in comparison with controls and the remaining autism cases (Figure 5A and supplemental Figure S6). These changes were highly specific because the nearest neighboring peaks were not altered in the affected autism cases.
Furthermore, RNA quantifications were similar independent of the choice of the housekeeping or normalization gene(s) (supplemental Figure S6). In comparing demographic, clinical, and postmortem indices between cases with a decline in specific H3K4me3 peaks and levels of corresponding mRNAs, we did not observe any factor (including seizures, medication histories, or postmortem confounds) as being common to these cases other than the diagnosis of autism (supplemental Table S1).
Next, we probed RNA levels for 3 autism-up loci. For 2 of the 3 transcripts tested, IFI6, which encodes inducible interferon α6, and ALDH1A2, which is a schizophrenia susceptibility gene linked to retinoic acid signaling,38 cases with high H3K4me3 signal tended to show a robust increase in mRNA levels (Figure 5B and supplemental Figure S6). We conclude that in PFC neurons, aberrant H3K4 methylation at a specific TSS indeed is a strong predictor for transcriptional dysregulation, which affects variable subsets of subjects on the autism spectrum.
We then asked whether some of the 711 loci with altered H3K4me3 signal (the aforementioned autism-up and autism-down peaks) were in the vicinity of any of the established autism susceptibility loci or genes with strong penetrance of disease risk when mutated. We examined 3 databases: the Simons Foundation Autism Research Initiative database (SFARI) for autism risk genes,39 the Database of Chromosomal Imbalance and Phenotype in Humans Using Ensembl Resources (DECIPHER) for copy number variations (CNVs) associated with diseased neurodevelopment,40 and the Human Unidentified Gene-Encoded protein database (HuGE) for autism-associated polymorphisms.41 Indeed, 89 of 711 loci matched loci in these databases. The list included at least 10 autism risk genes (ASTN2, CACNA1C, CACNA1H, JMJD1C, MEF2C, NRCAM, RAI1, RIMS3, RYR2, and SEMA5A) with abnormal H3K4me3 levels in their promoters and additional disease genes showing H3K4me3 changes in other portions of the gene, such as AUTS2, PARK2, and VGF (supplemental Table S4). For the various subsets of autism cases affected by the epigenetic dysregulation at these genetic risk loci, the altered H3K4me3 profile was specific for neuronal chromatin from PFC, while NeuN− chromatin from PFC appeared unaffected (Figure 3). For the entire set of 16 autism cases, enrichment in the overlap between the autism-up and autism-down peaks and the loci in DECIPHER (58 loci in total) is 1.87-fold and 2.44-fold, respectively.40 Examples include portions of 1p36 associated with one of the most frequent genetic causes of non–X-linked mental retardation42 and a portion of 17p associated with deletion (Smith-Magenis) or duplication (Potocki-Lupski) syndromes conferring developmental disability and autism43 (supplemental Table S4). There is also a significant overlap between the autism-up H3K4me3 peaks and autism-implicated genes in SFARI (χ² test, P = .009) as well as between autism-down H3K4me3 peaks and autism-implicated genes in HuGE (P = .04).
We conclude that epigenetic alterations in PFC neurons of autistic subjects include a significant subset of previously identified disease-associated genes.
Because our mapping algorithm allowed for 1 mismatch per 36-nucleotide read, it is unlikely that the changes in H3K4me3 profiles at these autism risk genes are due to single-nucleotide polymorphisms of the H3K4me3-enriched DNA. To explore whether the observed H3K4me3 changes in our autism samples are the result of CNVs, we screened published data sets from 8 of our autism cases for microdeletions, microduplications, and other CNVs44 (cases A1, A9, A10, and A12-16). Remarkably, from 178 H3K4me3 peaks with a decrease or loss in various subsets of these 8 cases, only 4 peaks (2.2%) were located to a microdeletion. The deletion in case A16 involved 22q11.23, a region frequently affected by segmental duplication upstream of the 22q11 DiGeorge syndrome/velocardiofacial syndrome risk locus45 (supplemental Figure S7). Two cases (A14 and A16) were affected by a partial loss of 1p36.33, representing one form of a subtelomeric deletion syndrome (supplemental Table S4). In most of the affected cases, the syndrome (with a prevalence of approximately 1 in 10 000 individuals) involves several hundred kb or megabases (Mb) of gene-rich portions within the subtelomeric region of chromosome 1 and frequently results in developmental delay and neurological disease.42,46 In our cohort of 16 autistic subjects, we detected 14 autism-up and 8 autism-down peaks in a 28-Mb portion of 1p36.32-33, including multiple cases with no detectable CNV at these loci (Figure 6). Furthermore, none of the 342 peaks with excess H3K4me3 (supplemental Table S4) were associated with CNVs. We conclude that structural variations including CNVs are unlikely to play a role for most of the observed H3K4me3 changes in our cohort of autism cases.
To confirm that some of the H3K4me3 changes in these autism risk genes are associated with dysregulated gene expression, we measured the RNA for RIMS3, which encodes a synaptic vesicle protein associated with very rare, autism-associated structural DNA variations, including the 1p34.2 microdeletion and, in some families, changes in coding exon sequences.47 RIMS3 also shows altered expression in lymphoblastoid cell lines derived from autistic patients.48 Indeed, like the 5 previously described genes with a robust decrease in H3K4me3, autistic subjects with the lowest H3K4me3 signal at the RIMS3 TSS also expressed low levels of RIMS3 transcript (supplemental Figure S6).
To our knowledge, this study provides the first insights into chromatin structures of PFC neurons from subjects in the autism spectrum. Remarkably, none of our 16 disease cases showed evidence for a generalized disruption of the genome-wide redistribution process of H3K4me3 that defines the normal transition from early infancy (age <1 year after birth) to older ages. Instead, we identified 503 increased loci and 208 decreased loci in the genome, including but not limited to annotated TSSs, that were affected by at least a 2-fold increase or decrease in H3K4me3 peak densities in subsets of autism cases compared with controls. While each of these altered loci affected variable subsets of cases, the majority of them are from 4 diseased brains with an abnormal H3K4me3 profile around the TSS, defined by lower H3K4me3 intensity at the TSS and spreading away from the TSS. Because diagnosis of autism was the only identifiable factor common to these cases, it is likely that this abnormal H3K4me3 profile is related to autism. One spreading case and 1 nonspreading case were exposed to valproic acid, a drug that upregulates histone acetylation in the brain (supplemental Table S1).31,49,50 Nonetheless, additional work with larger cohorts will be necessary to clarify the biological significance of the H3K4me3 spreading and the implications for overall nucleosomal organization around the TSS.
Our observations add further evidence that epigenetic fine-tuning of H3K4 methylation is particularly important for orderly brain development and function19-22,31 and suggest maladaptive chromatin remodeling with imbalances in H3K4me3 distribution as a molecular phenotype of cortical neurons in autism. Indeed, several lines of evidence from this study suggest that these alterations in H3K4me3 profiles of PFC neurons are directly involved in the molecular pathology of the disease in the affected individuals. For example, for ARC, CPLX1, USPx, RIMS3, and VGF (genes with a key role in the neurobiology of cognition and higher-order behaviors), most of the subjects affected by a robust decrease in H3K4me3 at the specific TSSs also showed a deficit in levels of the corresponding transcripts. Likewise, increased TSS-bound H3K4me3 was in some cases accompanied by a dramatic increase in transcript levels (compare IFI6 RNA and H3K4me3 in subject A6, Figure 5 and supplemental Figure S6). Given that the focus of this study was on neuronal chromatin, it might have been expected that immune-related genes would not play a major role. Alterations in the immune system of autistic subjects51 could reflect activation of microglia and other nonneuronal constituents residing in diseased tissue.27,52
The finding that genes and loci previously shown to confer genetic risk for autism and related disease significantly overlap with the 711 differential H3K4me3 peaks in our disease cohort implies that there is significant overlap between the genetic and epigenetic risk architectures in autism. It is noteworthy that these epigenetic changes are not limited to the 4 brains with the aforementioned spreading H3K4me3 profile, because the remaining 12 autism samples exhibit altered H3K4me3 at many of the same loci. There are 3 possible explanations for the H3K4me3 changes observed in subsets of autism samples: (1) variation in the underlying genomic DNA, which as discussed earlier is unlikely to account for a significant portion of the disease-associated H3K4me3 peaks given that the genome mapping algorithm allowed only a single base mismatch per tag; (2) heritable transmission of histone modifications53; and (3) a molecular adaptation in a pathophysiological cascade ultimately leading to cortical dysfunction.
Genetic models for an increasing number of neuropsychiatric disorders, including autism, invoke a complex mixture of disease-associated common variants of a select number of genes, each contributing a small fraction of the disease risk, and rare variations such as microdeletions and duplications of key genomic loci and candidate genes that, when affected, carry a high penetrance of disease.29,54,55 Heterogeneity of disease was also apparent in this study because the 711 epigenetic risk loci were affected in variable subsets of autistic individuals. Of note, the list of H3K4me3 peaks in our small cohort of 16 autistic subjects included the synaptic vesicle gene RIMS3, the retinoic acid signaling–regulated gene RAI1,56 the histone demethylase JMJD1C,57 the astrotactin ASTN2,58 the adhesion molecules NRCAM59-61 and SEMA5,62,63 the ubiquitin ligase PARKIN2 (PARK2),58,64 and other genes for which rare structural DNA variations could carry high disease penetrance. Intriguingly, for most of these high-risk genes, we observed that H3K4me3 changes occurred selectively in PFC neurons but not in their surrounding nonneuronal cells (Figure 3). Our findings imply a new disease model in which a subset of susceptibility genes carrying strong penetrance for disease could be epigenetically dysregulated in a cell (type)–specific manner, independent of the occurrence of any rare structural DNA variations so far implicated in neurological disease. Identification of the precise combinations of these aberrantly regulated gene sets in a particular disease case, including a potential association with the severity of specific disease symptoms, holds great promise for a better understanding of the underlying neurobiology and, perhaps more importantly, could set the foundation for novel therapies specifically tailored toward individual patients.
None of the 16 autism cases in our cohort showed evidence for a generalized disruption of developmentally regulated chromatin remodeling as observed in (normal) infant PFC neurons. Hundreds of loci across the genome show disease-associated alterations in H3K4me3 levels. These defects occur in variable subsets of autistic individuals, are often highly specific for PFC neurons, and are frequently accompanied by dysregulated expression of the corresponding gene transcripts in the same subjects or brain tissues. Autism-associated H3K4me3 alterations are enriched for susceptibility loci conferring genetic risk for neurodevelopmental disease. Fewer than 5% of autism-associated H3K4me3 changes are directly related to CNVs. A subset of subjects with autism showed evidence for a more generalized alteration of their neuronal H3K4me3 landscapes as evidenced by excess spreading around many TSSs. The biological significance of this phenomenon requires further investigation.
Correspondence: Zhiping Weng, PhD, Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 364 Plantation St, LRB 1010, Worcester, MA 01605 (firstname.lastname@example.org).
Submitted for Publication: June 10, 2011; final revision received August 19, 2011; accepted September 11, 2011.
Published Online: November 7, 2011. doi:10.1001/archgenpsychiatry.2011.151
Author Contributions: Drs Shulha and Cheung contributed equally. Drs Akbarian and Weng are co–senior authors, had full access to all of the data in the study, and take responsibility for the integrity of the data and the accuracy of the data analysis.
Financial Disclosure: None reported.
Funding/Support: This work was supported by Autism Speaks, the International Mental Health Research Organization, the National Alliance for Research on Schizophrenia and Depression, and grants 5 R01MH071476 and RC1MH088047 from the National Institute of Mental Health (Dr Akbarian). Dr Weng was supported in part by grant DBI-0850008 from the National Science Foundation.
Role of the Sponsors: The sponsors had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; or preparation, review, or approval of the manuscript.
Additional Information: This article is dedicated to the memory of Edward G. Jones, MD, PhD.
Additional Contributions: Ron Zielke, MD, and staff from the Brain and Tissue Bank of the University of Maryland, Francine M. Benes, MD, PhD, and staff from the Harvard Brain Tissue Resource Center and the Autism Tissue Program (under director Jane Pickett, MD), William E. Bunney Jr, MD, from the University of California, Irvine, and Edward G. Jones, MD, PhD, from the University of California, Davis, supplied some of the postmortem brain tissue used in this study, and Ellen Kittler, PhD, and Maria Zapp, PhD, from the University of Massachusetts Medical School Deep Sequencing Core, Richard Konz, PhD, and staff from the University of Massachusetts Medical School Flow Cytometry Core, and Eric Morrow, PhD, MD, provided helpful discussions.