Association of Periconception Paternal Body Mass Index With Persistent Changes in DNA Methylation of Offspring in Childhood

Key Points Question Is paternal obesity associated with epigenetic marks and weight status in offspring? Findings This cohort study of 429 father-mother-infant triads found that paternal body mass index at the time of conception was associated with both offspring birth weight and epigenome-wide DNA methylation patterns in offspring at birth, age 3 years, and age 7 years. Meaning These findings suggest that paternal obesity may be an underappreciated contributor to childhood health outcomes.

manufacturer's protocols. The HumanMethylation450 BeadChip measures DNA methylation at >485,000 CpG sites simultaneously at a single nucleotide resolution; the chip includes one or more CpG sites for 99% of RefSeq genes.
We carried out pre-processing, filtering and quality control procedures for DNA methylation data as described previously for other analyses using Project Viva DNA methylation datasets. [1][2][3] The same procedures were followed for cord blood, early childhood, and midchildhood DNA methylation datasets. We processed raw DNA methylation files using the minfi package in R from Bioconductor. Samples were excluded as potentially mis-labelled if they were mismatched on sex or genotype (based on rs probes, as described 4 ), or deemed to be low in quality. Technical replicates (n=40) were also excluded from the analysis. Correlation coefficients for individual probes among all technical replicates ranged from 0.98 to 1. We excluded individual probes if they had non-significant detection (P>0.05) for more than 1% of the samples. Additionally, we excluded non-CpG probes (i.e. rs and ch) and probes in X and Y chromosomes. SNP-associated probes at either the single base extension or within the target region were removed for SNPs that have a minor-allele frequency of >5%. Previously identified non-specific and cross-reactive probes within the array along with polymorphic CpG loci were also excluded from the analysis. 5 For background correction and dye-bias equalization, we performed the normalexponential out-of-band (noob) correction method. 6 Finally, a β-mixture quantile intra sample normalization procedure (BMIQ) was applied to the resulting data to reduce the potential bias that can arise from type 2 probes 5 . For each CpG site, methylation is reported as average βvalue = M/(M + U + ε), where M and U represent the average fluorescence intensity from the probe corresponding to the methylated and unmethylated target CpG and ε = 100 is a small quantity to protect against division by zero. Thus, the average β-value is an interval scaled quantity between zero and one interpreted as the fraction of DNA molecules whose target CpG is methylated.
We used ComBat 7 to correct for batch effects from plate and other potential sources of technical variability in methylation measurements, including paternal BMI as the variable of interest. We visually inspected the effectiveness of adjustment for batch using the four main principal components before and after batch adjustment. Strip plots of control probes were visually examined for bisulfite conversion and specificity. Density plots for the β-values were examined across samples at each normalization step. Methylation values on the β-scale were logit transformed to M-values, as previously described, for differential analysis of DNA methylation.

CD4+, Natural Killer cells, monocytes, B-cells, granulocytes, and nucleated red blood cells)
(Model 1). Cell type proportions in cord blood were estimated from DNA methylation data using a reference panel of nucleated cells isolated from cord blood (leukocytes and nucleated red blood cells). 9,10 An adult leukocyte reference panel was used to estimate cell type proportions for blood samples collected in early-and mid-childhood as implemented in minfi 11 and as described previously. [1][2][3] In additional analyses to determine whether there is an interaction of maternal and paternal BMI on the offspring methylome, data were stratified by maternal BMI <25 and >=25 kg/m 2 (Model 2a and 2b) and paternal BMI was modeled as continuous variable. After inclusion of cell-type estimation and other adjustment covariates, the genomic inflation factor (λ) for the Epigenome-Wide Analysis was 1.23 for Model 1, 0.98 for Model 2a, and 1.34 for Model 2b, with values close to 1 indicating that results were unlikely driven by population stratification or cryptic relatedness (eFigure 1).
Statistical significance for the CpG-by-CpG analysis was corrected for multiple testing by controlling the false discovery rate; we considered associations with FDR q value <0.05 as statistically significant. We also report CpGs reaching a Bonferroni-adjusted level of significance (P<1.3 E-07) for cord blood analyses. Model fit and assumptions were examined using scatterplots of the standardized residuals vs. the fitted values, residuals vs. the leverage, and quantile-quantile plots of the standardized residuals. Further, multivariable robust linear regression models were used to examine the association between statistically significant cord blood CpG sites and infant phenotypes of birth weight, and birth weight for gestational age zscore. A P-value of <0.05 was taken as statistically significant in the analysis of DNA methylation versus birth weight only. To evaluate the persistence of association, we utilized multivariate robust linear regression models adjusting for covariates, and considered nominal P< 0.05 as indicating the persistence of epigenetic alterations in early or mid-childhood, as we tested only the individual CpG site that we already had found in cord blood reaching our FDR threshold.
We used gene annotations from Illumina HumanMethylation450 v1.2 Manifest File; for sites where annotations were missing we report the nearest gene symbol. Tissue expression annotations were manually searched in the NCBI Gene database based on an RNA-seq study representing 27 different tissues in 95 human individuals. 12 To evaluate potential regulatory activity of epigenetic associations a transcription factor binding site analysis and motif search were performed using the R packages MotifDb and motifRG, respectively. For transcription factor binding and motif search analyses, 10,000 random sequences of 2000 bp length surrounding CpG sites included on the Illumina HumanMethylation450 platform were used for background. All analyses were carried out using R/Bioconductor. eFigure 1. Q-Q plot and P-value distribution for CpG-by-CpG analysis in cord blood. e1A. Model 1, paternal BMI (exposure) versus DNA methylation (outcome), fully adjusted model including adjustment for pre-pregnancy maternal BMI. e1B. Model 2a, paternal BMI (exposure) versus DNA methylation (outcome), stratified by prepregnancy maternal BMI < 25 kg/m 2 . e1C. Model 2b, paternal BMI (exposure) versus DNA methylation (outcome), stratified by prepregnancy maternal BMI  25 kg/m 2 . The Q-Q plot is shown on the left, and the P-value histogram is shown on the right.