Association of Habitual Alcohol Intake With Risk of Cardiovascular Disease

Key Points
Question: What is the risk of cardiovascular disease associated with different amounts of habitual alcohol consumption?
Findings: In this cohort study of 371 463 individuals, genetic evidence supported a nonlinear, consistently risk-increasing association between all amounts of alcohol consumption and both hypertension and coronary artery disease, with modest increases in risk with light alcohol intake and exponentially greater risk increases at higher levels of consumption.
Meaning: In this study, alcohol consumption at all levels was associated with increased risk of cardiovascular disease, but clinical and public health guidance around habitual alcohol use should account for the considerable differences in cardiovascular risk across different levels of alcohol consumption, even those within current guideline-recommended limits.

This supplemental material has been provided by the authors to give readers additional information about their work.

eMethods 4. Continuous Variables
For each trait, means and statistical differences were calculated after removing unreported values for the individual category. The variables were collected as follows. BMI was calculated from measured height and weight at assessment. Smoking was a self-reported categorical variable, defined as never smoked (0), previously smoked (1), or current smoker (2). Physical activity was a self-reported variable representing the average number of days per week during which the participant spent more than 10 minutes doing moderate physical activity. Vegetable intake was a self-reported variable representing the average number of heaped tablespoons of cooked vegetables that a participant would eat per day. Red meat was another self-reported variable, calculated as the sum of the participant's frequency of eating beef, lamb, or pork. Eating frequency was coded as (0) never, (1) less than once a week, (2) once a week, (3) 2-4 times per week, (4) 5-6 times per week, or (5) once or more per day. Overall health was a self-reported categorical variable of the participant's rating of their own overall health, defined as (1) excellent, (2) good, (3) fair, or (4) poor. BMI, biomarkers, and blood pressure were normalized to follow a normal distribution in traditional genetic analyses (analyses in which the focus was testing for significant association), but not for nonlinear genetic analyses (in which the focus was on clinically interpretable findings).
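The text above does not say which transformation was used to normalize BMI, biomarkers, and blood pressure. As an illustration only, the sketch below assumes a rank-based inverse normal transformation, one common choice in genetic association analyses; the function name, the `offset` parameter, and the example values are hypothetical.

```python
from statistics import NormalDist


def inverse_normal_transform(values, offset=0.5):
    """Rank-based inverse normal transform; unreported values (None) stay None.

    Assumption: the source does not name its normalization method. This is
    one common option, shown for illustration. Ties share their average rank.
    """
    observed = sorted((v, i) for i, v in enumerate(values) if v is not None)
    n = len(observed)
    transformed = [None] * len(values)
    std_normal = NormalDist()
    start = 0
    while start < n:
        end = start
        # Extend the block while the next sorted value ties with this one.
        while end + 1 < n and observed[end + 1][0] == observed[start][0]:
            end += 1
        average_rank = (start + end) / 2 + 1  # 1-based rank shared by ties
        z = std_normal.inv_cdf((average_rank - offset) / n)
        for _, index in observed[start:end + 1]:
            transformed[index] = z
        start = end + 1
    return transformed


# Hypothetical measurements with one unreported value.
example = inverse_normal_transform([3.0, None, 1.0, 2.0])
print(example)
```

Unreported values are skipped rather than imputed, matching the statement that they were removed before means and differences were calculated.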

eMethods 5. Mass General Brigham Biobank
The Mass General Brigham Biobank, a patient-based cohort in the United States, was used as a secondary cohort to replicate select genetic analyses primarily conducted in the UK Biobank [7]. The biobank consisted of 30,716 individuals with genetic data, 14,412 of whom had self-reported alcohol consumption information and 28,179 of whom had blood pressure measurements. If a participant had more than one blood pressure measurement, the maximum value was used in analyses. Participants were asked to group their alcohol consumption patterns into categories ranging from "None, or less than 1 drink per month" to "More than 6 drinks per day". To maximize power for analyses conducted in Mass General Brigham, only the primary genetic score (AUD-R) was used as the genetic instrument, and only continuous blood pressure measurements were used as the outcome.

eMethods 6. Genetic Instruments
Genetic instruments were constructed using externally derived summary statistics from a genome-wide association study (GWAS) of alcohol consumption in 274 424 subjects from the Million Veteran Program, focusing on analyses conducted in individuals of European ancestry to reflect the UK Biobank study population [8]. This study assessed genetic associations with Alcohol Use Disorder (AUD) and Alcohol Use Disorder Identification Test-Consumption (AUDIT-C), each used as a proxy for alcohol use. AUD is a medical diagnosis of severe overdrinking. AUDIT-C is a screening questionnaire meant to identify hazardous alcohol use; it is a three-question instrument scored from 0 to 12 that asks (a) "how often do you have a drink containing alcohol?", (b) "how many standard drinks do you have on a typical day?", and (c) "how often do you have six or more drinks on one occasion?" [9]. From all significant loci for AUD (n=13), only independent SNPs after conditional analyses were chosen, with insertion/deletion polymorphisms removed (n=9 total SNPs remaining).
From all significant loci in the AUDIT-C GWAS (n=19), only independent SNPs after conditional analyses were chosen; no insertion/deletion polymorphisms were among the lead associations (n=13 total SNPs remaining).
Mendelian randomization assumes that a genetic instrument influences an exposure, and that the corresponding change in the exposure is the only way by which the instrument affects the outcome. Therefore, refined SNP lists were created by removing SNPs that had any significant associations with tested lifestyle or risk factors in lifelong abstainers. Because no one in this population had ever consumed alcohol, associations could not be mediated by differential alcohol intake, and any association observed would be due to pleiotropy in the instrument. Accordingly, nonsignificant associations with lifestyle/risk factors in lifelong abstainers were taken to demonstrate a lack of pleiotropy. The tested potential confounders were smoking, BMI, physical activity, vegetable intake, red meat intake, overall health rating, C-reactive protein, and total cholesterol. For the AUD SNP list (n=9), the Bonferroni-corrected threshold was p = 0.05/(9 SNPs * 8 confounders) = 6.94E-04. For the AUDIT-C SNP list (n=13), the Bonferroni-corrected threshold was p = 0.05/(13 SNPs * 8 confounders) = 4.81E-04. Four SNPs from the AUD and three SNPs from the AUDIT-C instruments were significantly associated with confounders and consequently removed for the refined SNP lists. From the AUD instrument, rs1260326 (GCKR), rs570436 (SIX3), rs13107325 (SLC39A8), and rs11075992 (FTO) were removed to create the AUD-R instrument. From the AUDIT-C instrument, rs1260326 (GCKR), rs13107325 (SLC39A8), and rs9937709 (FTO) were removed to create the AUDIT-C-R instrument. The remaining AUD-R SNP list contained 5 SNPs, and the remaining AUDIT-C-R SNP list contained 10 SNPs.
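The Bonferroni thresholds for the pleiotropy screen follow directly from the number of SNP-by-confounder tests. A minimal sketch of that arithmetic, using a hypothetical helper name:

```python
def bonferroni_threshold(n_snps, n_confounders, alpha=0.05):
    """Per-test threshold when every SNP is tested against every confounder."""
    return alpha / (n_snps * n_confounders)


aud_threshold = bonferroni_threshold(9, 8)       # 0.05 / 72 tests
audit_c_threshold = bonferroni_threshold(13, 8)  # 0.05 / 104 tests

print(f"AUD: {aud_threshold:.2E}")          # AUD: 6.94E-04
print(f"AUDIT-C: {audit_c_threshold:.2E}")  # AUDIT-C: 4.81E-04
```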
Aggregated allele scores were constructed using PLINK software v2.0. Allele scores were normalized and then standardized to a 1-drink/day (7 standard drinks/week) increase using R software. Associations between allele scores and confounders, calculated using linear regression models in lifelong abstainers, were used to assess for pleiotropic effects, and results are available in eTables 5-6. Genetic risk scores were validated for association with the exposure by regressing several alcohol phenotypes (both log-transformed and untransformed weekly alcohol intake, drinking group, current drinking status, and over-limits drinking status) onto the score. Individual drinks, such as beer intake or wine intake, were also regressed onto the score for validation. For all continuous variables, linear regression was used. For all dichotomous variables, logistic regression was used. In this study, all regression models for the gene scores were adjusted for age at assessment, sex, genotyping array, and principal components 1-10. Genotyping array was a dummy variable accounting for the specific genotyping platform used for each subject (Affymetrix UK BiLEVE Axiom array or Affymetrix UK Biobank Axiom® array). Adjusting for principal components accounts for population stratification according to ancestral background [10].
Although neither the AUD-R nor the AUDIT-C-R score was associated with confounders in lifelong abstainers, 2SMR analyses between the AUDIT-C-R score and CVD phenotypes revealed significant MR-Egger intercepts for multiple associations (eTable S9), whereas these intercepts were nonsignificant when the AUD-R score was used (Table 2); MR-PRESSO tests for horizontal pleiotropy were also inflated for the AUDIT-C-R, but not the AUD-R, instrument (eTable S10). As noted below (Two-Sample MR Analyses section), MR-Egger intercepts are used to assess for pleiotropic effects in two-sample MR analyses. Two-sample MR analyses determine the association between exposure and outcome by assessing, across SNPs, for a linear relationship between each SNP's per-allele association with the exposure and its per-allele association with the outcome. A significant y-intercept, assessed via MR-Egger analyses, implies that there is an association with the outcome that is independent of the exposure; in other words, some of the association is not mediated by alcohol intake. Nominally significant MR-Egger intercepts using AUDIT-C-R suggested a potential bias in the instrument [11]. One source of potential residual pleiotropy in the AUDIT-C-R instrument could have been the SNPs rs4794018 and rs35572189, which both reached nominal, but not Bonferroni-level, significance for associations with BMI in lifelong abstainers (both p = 0.002). Given these potential biases that were present in the AUDIT-C-R instrument but not in the AUD-R instrument, the AUD-R instrument was selected as the primary instrument for this study, with the AUDIT-C-R used for secondary analyses.
Using the AUD-R primary instrument, a test for trend indicated that the genetic association with alcohol did not differ significantly across categories of alcohol consumption (p=0.311), and genetically predicted alcohol values ranged from 1.65 to 14.08 drinks/week.

eMethods 7. Observational Analysis
We focused on six cardiovascular disease phenotypes: hypertension, coronary artery disease, myocardial infarction, stroke, heart failure, and atrial fibrillation. We assessed the prevalence and hazards of cardiovascular diseases within each drinking group, the latter estimated by Cox proportional hazards models using abstainers as the reference and incident disease as the outcome. Analyses were conducted using the Cox proportional hazards function from the R 'survival' package. Abstainers who reported formerly drinking (n=12,977) were removed from analyses as they may exhibit the residual health effects of alcohol. We then evaluated other behavioral and lifestyle factors by drinking category to assess whether light to moderate alcohol consumption correlates with a healthier overall lifestyle. Six specific lifestyle measures were assessed: smoking frequency, normalized BMI, self-reported physical activity, cooked vegetable intake, red meat consumption, and self-reported health. Though BMI can be influenced by alcohol, this study followed the precedent set by others that have labeled BMI as a lifestyle factor rather than an outcome [12,13]. Adjusting for these six lifestyle factors, we re-estimated hazards of CVD by drinking group to assess for a change in the shape of observational associations due to possible confounding. In secondary analyses, we reran adjusted models without self-reported health, and also trialed sex-stratified models. Secondary analyses were performed for hypertension only, the most prevalent of all cardiovascular phenotypes assessed.

eMethods 8. Two-Sample MR Analyses
Two-sample Mendelian randomization was conducted using the R packages 'MendelianRandomization' and 'TwoSampleMR'. Summary statistics from an alcohol consumption GWAS in MVP (Kranzler et al.) provided beta coefficients and standard errors between each SNP and the exposure [8]. Regression models of each SNP in the gene score with the outcome of interest were used to calculate the beta coefficient and standard error between each SNP and the outcome. These datasets were combined and used for the two-sample Mendelian randomization.
Inverse variance weighted (IVW) meta-analyses of the SNP-specific associations with the exposure and outcome provide a weighted average of the slope estimates from the origin to each point. For each SNP, the association with the exposure (alcohol) and the association with the outcome (CVD) are compared. If the exposure is causally associated with a greater risk of the outcome, SNPs associated with a greater change in exposure should be associated with a greater risk of the outcome (proportionality/linearity of exposure associations and outcome associations across SNPs). IVW tests the significance of the relationship between the two associations and provides an estimate of the effect of the exposure on the outcome. Weighted median analyses, which use similar calculations with only the median SNPs, provide an estimate that is less likely to be biased by outlier SNPs. IVW and weighted median methods rely on the assumption that there is no y-intercept; in other words, that the relationship with the outcome of each SNP in the dataset is directly proportional to its association with the exposure, and that there is no inflation due to associations with confounders or the outcome itself. MR-Egger analyses are similar to the analyses above but allow for a non-zero intercept, and were used to check whether this no-pleiotropy condition is violated. MR-PRESSO analyses, conducted using the MRPRESSO package in R, were employed to assess and correct for horizontal pleiotropy by removing outliers. Associations using these methods were primarily tested using outcome statistics in current drinkers and checked for pleiotropy by calculating outcome statistics in nondrinkers.
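To make the difference between the IVW and MR-Egger estimators concrete, the sketch below computes both from per-SNP summary statistics. It is a simplified illustration, not the implementation in the R packages used in the study: it uses the fixed-effect IVW form with first-order weights (1/SE² of the outcome association), omits standard errors and p-values, and all function names and numbers are hypothetical.

```python
def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Inverse-variance weighted slope through the origin (fixed-effect form)."""
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    return numerator / denominator


def mr_egger(beta_exposure, beta_outcome, se_outcome):
    """Weighted regression with a free intercept; returns (intercept, slope)."""
    # Orient every SNP so that its exposure association is positive.
    oriented = [(abs(bx), by if bx >= 0 else -by)
                for bx, by in zip(beta_exposure, beta_outcome)]
    weights = [1.0 / se ** 2 for se in se_outcome]
    sw = sum(weights)
    swx = sum(w * bx for w, (bx, _) in zip(weights, oriented))
    swy = sum(w * by for w, (_, by) in zip(weights, oriented))
    swxx = sum(w * bx * bx for w, (bx, _) in zip(weights, oriented))
    swxy = sum(w * bx * by for w, (bx, by) in zip(weights, oriented))
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    intercept = (swy - slope * swx) / sw
    return intercept, slope


# Hypothetical per-SNP summary statistics (not taken from the study):
# a true slope of 0.5 plus a constant pleiotropic offset of 0.01.
bx = [0.02, 0.05, 0.08, 0.11, 0.15]
by = [0.5 * x + 0.01 for x in bx]
se = [0.010, 0.012, 0.009, 0.011, 0.010]

egger_intercept, egger_slope = mr_egger(bx, by, se)
print("IVW slope:", round(ivw_estimate(bx, by, se), 3))
print("MR-Egger intercept and slope:",
      round(egger_intercept, 3), round(egger_slope, 3))
```

In this toy example the constant offset pulls the IVW slope above 0.5 because the line is forced through the origin, while MR-Egger absorbs the offset into its intercept. That is the reason a non-zero intercept is read as a sign of directional pleiotropy.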
Using traditional two-sample MR methods to assess for potential causal associations, we investigated genetic associations of alcohol consumption with the six aforementioned cardiovascular diseases, and also with ten continuous traits: systolic and diastolic blood pressure, LDL cholesterol, HDL cholesterol, total cholesterol, triglycerides, apolipoproteins A and B, gamma-glutamyl transferase, and C-reactive protein. For each instrument-outcome association, we pursued an IVW random-effects meta-analysis of the effect of each SNP on the outcomes divided by the effect of the same SNP on alcohol consumption. We also pursued weighted median and MR-Egger regression analyses to address potential invalid instruments and directional pleiotropy. For continuous traits, we considered significant any association surpassing a Bonferroni-corrected threshold of p < 0.005 [0.05/10 traits], and for cardiovascular diseases, a threshold of p < 0.008 [0.05/6 diseases].

eMethods 9. Allele Score Analyses
Given individual-level data in the UK Biobank, we also constructed externally weighted allele scores for each participant by multiplying the dosage of the allele for increased alcohol consumption by the variant's reported beta coefficient from the Million Veteran Program discovery GWAS and summing across all variants. We then used logistic and linear regression to test for associations with the six aforementioned cardiovascular diseases and a full complement of 31 continuous traits (including those from 2SMR). Regression models were run in current alcohol users and adjusted for age, sex, the top 10 principal components of ancestry, and genotyping array. Regression models were also run in all subjects to verify associations, and in lifelong abstainers to check for pleiotropy. Statistical significance for the disease and continuous phenotypes was defined by Bonferroni-corrected thresholds of p < 0.008 [0.05/6 disease phenotypes] and p < 0.002 [0.05/31 traits], respectively.

eMethods 10. Nonlinear MR
Whereas Mendelian randomization has been employed to assess for a potential causal association in the absence of a randomized controlled trial, traditional approaches often presume linearity [14,15]. A non-linear model may be fit to the data, but for a non-linear association to be detected, instrumental variables must capture an appreciable portion of the exposure, which genetic instruments often fail to do; consequently, recent studies have employed unique approaches to increase the variance explained by an instrument. For example, to mitigate these limitations, one study utilized a combination of genetics and study area (in conjunction with sex-specific differences in alcohol intake) as proxies for alcohol intake, but still arrived at linear relationships between alcohol intake and cardiovascular disease risk [16]. Here, we applied methods to formally assess for non-linear associations between alcohol intake and cardiovascular disease. To statistically test for non-linear associations, and to assess for differential risks at different levels of intake, genetic associations with the outcome may be tested in quantiles of the exposure. However, the genetic instrument itself influences the exposure, subjecting the study to potential collider bias; by extension, stratified allele score analyses should be taken as secondary to more comprehensive non-linear Mendelian randomization. The methods outlined below test genetic associations in conditioned quantiles of residualized exposure (that is, reported alcohol intake minus the influence of the genetic instrument) in order to overcome these limitations and comprehensively assess potential non-linear effects [14,15,17]. To test the shape of each potential causal association, genetic associations were tested in deciles of residual alcohol intake, a measurement of the exposure devoid of the genetic instrumental variable (IV-free exposure).
Specifically, residual alcohol intake was calculated by subtracting genetically predicted alcohol intake from reported alcohol intake (as done previously), and associations were tested in deciles of the residual alcohol intake variable [17]. Residualization was required before partitioning the cohort by amount of alcohol intake in order to avoid overadjustment and collider biases, as the genetic score affects reported alcohol intake. Stratifying instead on residual alcohol intake, which represents the alcohol intake a participant would have if all participants had the same genotype, introduces no such biases, as this variable is unaffected by the genetic score.
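The residualization step can be sketched as follows. This is a simplified illustration under stated assumptions: genetically predicted intake is taken to be the fitted value from an unadjusted linear regression of reported intake on the allele score (covariates such as age, sex, array, and principal components are omitted for brevity), the allele score is the dosage-times-beta sum described in eMethods 9, and all function names and toy data are hypothetical.

```python
def weighted_allele_score(dosages, betas):
    """Sum of alcohol-increasing allele dosages times their GWAS betas."""
    return sum(d * b for d, b in zip(dosages, betas))


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residual_intake(scores, reported_intake):
    """Reported intake minus the intake predicted from the genetic score."""
    intercept, slope = fit_line(scores, reported_intake)
    return [y - (intercept + slope * s) for s, y in zip(scores, reported_intake)]


def assign_strata(values, n_strata=10):
    """Rank-based stratum labels 0..n_strata-1 of (near-)equal size."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    strata = [0] * len(values)
    for rank, index in enumerate(order):
        strata[index] = rank * n_strata // len(values)
    return strata


# Hypothetical toy data: 20 participants, 3 SNPs (not study data).
betas = [0.10, 0.05, 0.20]
genotypes = [[i % 3, (i // 3) % 3, (i * 7) % 3] for i in range(20)]
scores = [weighted_allele_score(g, betas) for g in genotypes]
intake = [2.0 + 4.0 * s + (i * 5 % 11) for i, s in enumerate(scores)]

residuals = residual_intake(scores, intake)
deciles = assign_strata(residuals)
```

By construction the residuals are uncorrelated with the score, so stratifying on them does not condition on the instrument.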
Residual alcohol intake was split into deciles, in order to maximize power for stratified analyses, and any outlying values >Q3+1.5*IQR were removed. The association between the genetic score and each cardiovascular outcome was tested within deciles of residual alcohol intake using linear or logistic regression adjusting for genotyping array, sex, age at assessment, and principal components 1-10. In order to standardize effects to a 1 drink per day increase, the decile-specific regression coefficient for the genetic score and the outcome was divided by the regression coefficient for the genetic score and the exposure (alcohol). The resultant association estimate, based on this standardized ratio of coefficients, was referred to as a localized average causal effect (LACE) estimate, as per Staley et al. [14]. Interval-specific LACE values were determined and then used to reconstruct the overall association between alcohol and each tested cardiovascular phenotype using either the "piecewise linear" or "fractional polynomial" method, as described below. In both cases, 0 standard drinks per week was used as the reference value, and abstainers who were former drinkers were excluded as they would be grouped in low categories of intake but could still demonstrate effects of increased alcohol intake. Because all non-linear comparisons are conducted within separate strata of residualized alcohol intake, it is advisable to focus on the relative slopes of the association rather than absolute risks across the range of the exposure [13].
For piecewise non-linear MR, the association between alcohol and each outcome, reported as the relative beta coefficient (for continuous traits) or relative odds ratio (for dichotomous traits), was reconstructed using LACE estimates as the gradient for the corresponding value of reported alcohol intake.
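The ratio-of-coefficients step and the piecewise linear reconstruction can be sketched as below. This is an illustration under assumptions, not the adapted nlmr code: the breakpoints, coefficients, and function names are hypothetical, the LACE values and breakpoints are assumed to share the same exposure unit, and confidence intervals are omitted.

```python
import math


def lace_estimate(beta_score_outcome, beta_score_exposure):
    """Stratum-specific score-outcome coefficient over score-exposure coefficient."""
    return beta_score_outcome / beta_score_exposure


def piecewise_curve(breakpoints, lace_values):
    """Cumulative effect at each breakpoint, relative to the first breakpoint.

    Each LACE value is used as the slope over its own exposure interval.
    """
    if len(breakpoints) != len(lace_values) + 1:
        raise ValueError("need exactly one more breakpoint than LACE values")
    curve = [0.0]
    for left, right, slope in zip(breakpoints, breakpoints[1:], lace_values):
        curve.append(curve[-1] + slope * (right - left))
    return curve


# Hypothetical inputs (not study results): three strata, log-odds scale.
beta_exposure = 0.5                      # constant score-exposure association
stratum_betas = [0.005, 0.010, 0.020]    # score-outcome association per stratum
breakpoints = [0.0, 5.0, 10.0, 20.0]     # exposure values; reference is 0

laces = [lace_estimate(b, beta_exposure) for b in stratum_betas]
log_odds_curve = piecewise_curve(breakpoints, laces)
relative_odds = [math.exp(value) for value in log_odds_curve]
```

For a continuous trait the cumulative curve is already the relative beta coefficient; for a dichotomous trait, exponentiating gives the odds ratio relative to the reference exposure.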
LACE values reflect the change in relative beta coefficient or odds ratio for every 1 drink per week increase within each stratum of residual alcohol intake. A trend test indicated that exposure associations did not significantly vary across strata of alcohol intake, except for the top and bottom two quantiles, and so a constant exposure association estimate was used in all strata, as in previous studies [17].
For fractional polynomial non-linear MR, the stratum-specific relationships between mean alcohol intake and LACE values were meta-analyzed. Using the standard powers for fitting fractional polynomial models (−2, −1, −0.5, 0 (log function), 0.5, 1, 2, and 3), the best fit models of both degree 1 and degree 2 were chosen [14,18]. The cutoff for choosing the best fitting model of degree 2 over that of degree 1 was p=0.05, as judged by the maximum likelihood test; however, even best fitting polynomials of degree 2 demonstrated consistently risk-increasing estimates for all primary analyses. The association between the exposure (alcohol) and the outcome was then mathematically reconstructed as the derivative of the fractional polynomial model. Code was derived from https://github.com/jrs95/nlmr/tree/master/R, as validated and published previously [14,17]. This adapted code was also used to calculate several tests to assess fit for disease phenotypes.
Three tests of nonlinearity were reported: fractional polynomial non-linearity tests, quadratic tests, and Cochran Q tests [14]. The fractional polynomial test involves testing the best fit model against a linear model to determine whether a nonlinear model better fits the data. The quadratic test between exposure and outcome is the same as a trend test between exposure and LACE values, i.e., determining whether genetic associations vary in different strata of the exposure by meta-regressing LACE values against mean exposure in each stratum. A heterogeneity test using Cochran's Q statistic determines whether the difference in LACE values is more than expected by chance. Powers of best-fit disease models (p1 for models of degree 1, and p1 and p2 for models of degree 2) were also recorded with corresponding p-values of fit.
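Two of the three tests reduce to simple weighted calculations on the stratum-specific LACE values. The sketch below shows the Cochran Q statistic and the slope of the trend (quadratic) meta-regression; it assumes inverse-variance weights, leaves out p-values (Q would be compared against a chi-square distribution with k−1 degrees of freedom), and uses hypothetical function names and inputs rather than the adapted nlmr code.

```python
def cochran_q(lace_values, standard_errors):
    """Cochran's Q for heterogeneity of LACE values; returns (Q, degrees of freedom)."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * b for w, b in zip(weights, lace_values)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, lace_values))
    return q, len(lace_values) - 1


def trend_slope(mean_exposure, lace_values, standard_errors):
    """Inverse-variance weighted slope of LACE values on stratum mean exposure."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    sw = sum(weights)
    swx = sum(w * x for w, x in zip(weights, mean_exposure))
    swy = sum(w * y for w, y in zip(weights, lace_values))
    swxx = sum(w * x * x for w, x in zip(weights, mean_exposure))
    swxy = sum(w * x * y for w, x, y in zip(weights, mean_exposure, lace_values))
    return (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)


# Hypothetical stratum-level inputs (not study results).
stratum_means = [1.0, 2.0, 3.0]
stratum_laces = [0.1, 0.2, 0.3]
stratum_ses = [0.05, 0.05, 0.05]

q_statistic, q_df = cochran_q(stratum_laces, stratum_ses)
slope = trend_slope(stratum_means, stratum_laces, stratum_ses)
print(q_statistic, q_df, slope)
```

A slope near zero with a small Q is what a linear exposure-outcome association would produce; LACE values that rise with mean exposure indicate a curve that steepens at higher intake.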
Non-linear MR analyses were performed using the AUD-R allele score as the primary instrument to determine the shape of genetic associations with disease. Secondary analyses were conducted using the AUDIT-C-R allele score as well as a single-SNP instrument comprising the number of alcohol-increasing alleles of rs1229984 at ADH1B (a well-established gene for which association with alcohol is known to be mediated by alcohol dehydrogenase). Non-linear MR methods were also used to assess potential causal associations with the ten continuous traits tested with 2SMR techniques, calculating relative beta coefficients rather than odds ratios. Non-linear MR analyses were focused on those disease phenotypes and continuous traits with strong evidence of association with alcohol intake from 2SMR analyses (specifically, Bonferroni-significant IVW association, strong weighted median and MR-Egger associations, and no evidence of pleiotropy from MR-Egger intercept or IVW association in lifelong abstainers): hypertension, CAD, systolic and diastolic blood pressure, LDL cholesterol, and gamma-glutamyl transferase. Furthermore, as an important sensitivity analysis, we reapplied primary analyses excluding abstainers, as done in previous epidemiological studies [19]. As previously noted, the genetic instrument is not associated with alcohol consumption in this population, because these individuals do not drink alcohol; nonetheless, abstainers have been previously shown to be an overall healthier population than light to moderate drinkers. Thus, in non-linear Mendelian randomization analyses, the genetic instrument may not as accurately reflect the differential alcohol consumption of abstainers vs. light drinkers, and non-linear Mendelian randomization results at low levels of intake could therefore be biased by the inclusion of this abstainer population.
We also pursued, as a sensitivity analysis, non-linear Mendelian randomization of medication-corrected blood pressure readings, following previously outlined methods [20]. Lastly, to further assess for residual pleiotropy, and given prior evidence of genetic correlation between alcohol use and smoking initiation, BMI, and depression, we conducted multivariable non-linear Mendelian randomization as outlined previously [21-23]. Genetic instruments for BMI and smoking were both derived from GWAS external to the study populations; the genetic instrument for depression was constructed from a meta-analysis that included the UK Biobank, but we included only SNPs that also reached genome-wide significance in external replication [24-26]. In calculating strata-specific estimates for alcohol use, we adjusted for genetically proxied BMI and smoking; these estimates were then integrated into non-linearity assessments as outlined above.

(I) Genetic associations between AUD-R genetic risk score and heart disease phenotypes in (A) light drinkers, (B) moderate drinkers, (C) heavy drinkers, and (D) abusive drinkers. Associations were determined using logistic regression models adjusting for age at assessment, sex, genotyping array, and principal components 1-10.
(II) Genetic associations between AUDIT-C-R genetic risk score and heart disease phenotypes in (A) light drinkers, (B) moderate drinkers, (C) heavy drinkers, and (D) abusive drinkers. Associations were determined using logistic regression models adjusting for age at assessment, sex, genotyping array, and principal components 1-10.

eFigure 6. Fractional Polynomial Nonlinear MR Analyses, Using AUD-R Genetic Instruments, of Alcohol and Secondary Cardiovascular Disease Phenotypes
LACE values were meta-regressed against mean consumption in each stratum of alcohol intake, and these plots were reconstructed as the derivative of the best fit model. Shaded areas denote 95% confidence intervals for the model.
(I) In all individuals