Educational Attainment and Lifestyle Risk Factors Associated With All-Cause Mortality in the US

Key Points
Question: To what extent can the association between socioeconomic status (SES) and mortality be explained by differential exposure to lifestyle factors (such that unhealthy lifestyle factors are more prevalent in groups with lower SES) and differential vulnerability to lifestyle factors (such that the same exposure to unhealthy lifestyle factors is associated with more deleterious outcomes in groups with lower SES)?
Findings: In this nationwide cohort study of 415 764 US adults, a mediation analysis showed that lifestyle factors explained 66% (men) and 80% (women) of the association between educational attainment and all-cause mortality. Inequalities in mortality were primarily a result of greater exposure and clustering of unhealthy lifestyle factors among groups with lower educational attainment; with some exceptions, there was little evidence for differential vulnerability to lifestyle factors.
Meaning: Public health interventions to create equality in the socioenvironmental contexts that shape lifestyle factors and to reduce exposure to lifestyle risk factors among groups with low SES have the potential to significantly increase life expectancy and reduce socioeconomic inequalities in mortality.

Assuming 14 grams of pure alcohol per standard drink, these categories of alcohol use are equivalent to: category I (up to 10 (women) or 20 (men) drinks per week), category II (10-20 (women) or 20-30 (men) drinks per week), and category III (>20 (women) or >30 (men) drinks per week). Category I drinking behavior was used as the reference category, given that this group comprised the majority of the population and because never drinkers may have poorer health outcomes [1]. Lastly, as a sensitivity analysis, alcohol use was indexed using heavy episodic drinking (HED) based on the number of heavy drinking days (≥5 drinks a day) in the past 12 months. This indicator is not a measure of "binge drinking," which is generally defined as 5+ drinks (sometimes 4+ for women) on a single drinking occasion. HED was grouped into four categories based on the number of heavy drinking days in the past 12 months: no HED (0 heavy drinking days), HED less than once a month (1-11 heavy drinking days), HED at least once a month but less than once a week (12-51 heavy drinking days), and HED once a week or more (52-365 heavy drinking days).
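The categorization rules above can be sketched in code. This is an illustrative sketch only, not the authors' implementation: the function names, the "women"/"men" labels, and the handling of the shared boundaries (10 and 20 drinks for women, 20 and 30 for men, assigned here to the lower category) are assumptions. It applies to current drinkers only.

```python
GRAMS_PER_DRINK = 14  # grams of pure alcohol per standard drink, as stated in the text


def drinks_per_week(grams_per_day: float) -> float:
    """Convert average grams of pure alcohol per day to standard drinks per week."""
    return grams_per_day * 7 / GRAMS_PER_DRINK


def alcohol_category(drinks_week: float, sex: str) -> str:
    """Assign a current drinker to category I, II, or III.

    Assumption: a value exactly on a boundary falls in the lower category.
    """
    if sex == "women":
        upper_i, upper_ii = 10, 20
    elif sex == "men":
        upper_i, upper_ii = 20, 30
    else:
        raise ValueError("sex must be 'women' or 'men'")
    if drinks_week <= upper_i:
        return "I"
    if drinks_week <= upper_ii:
        return "II"
    return "III"


def hed_category(heavy_days: int) -> str:
    """Group the number of heavy drinking days in the past 12 months."""
    if not 0 <= heavy_days <= 365:
        raise ValueError("heavy_days must be between 0 and 365")
    if heavy_days == 0:
        return "no HED"
    if heavy_days <= 11:
        return "HED less than once a month"
    if heavy_days <= 51:
        return "HED at least monthly, less than weekly"
    return "HED once a week or more"
```

For example, an average of 20 grams per day corresponds to 10 standard drinks per week, which this sketch places in category I for both women and men.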
With respect to smoking, participants were asked to report 1) whether they have smoked at least 100 cigarettes over their entire life, and 2) whether they currently smoke.
Smoking was categorized as never smokers (reference category), former smokers, current some-day smokers, and current everyday smokers.
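A minimal sketch of how the two smoking questions could map onto the four categories follows. The response codes for the second question ("every day", "some days", "not at all") and the function name are assumptions made for illustration; the source does not list the response options.

```python
def smoking_category(smoked_100_cigarettes: bool, current_use: str) -> str:
    """Combine the two survey items into one four-level smoking variable.

    current_use is assumed to be one of 'every day', 'some days', 'not at all'
    (hypothetical response codes).
    """
    if not smoked_100_cigarettes:
        return "never smoker"  # reference category
    if current_use == "every day":
        return "current everyday smoker"
    if current_use == "some days":
        return "current some-day smoker"
    if current_use == "not at all":
        return "former smoker"
    raise ValueError("unexpected current_use response")
```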
With respect to physical activity, participants were asked to report 1) how often they performed a) vigorous or b) light-moderate leisure-time physical activities of at least 10 minutes that caused a) heavy sweating or large increases in breathing or heart rate or b) only light sweating or a slight to moderate increase in breathing or heart rate, and 2) how long they performed those a) vigorous or b) light-moderate leisure-time physical activities each time. No timeframe (e.g., over the past year, or past month) was specified for either question. The weekly duration of physical activity was calculated and combined, assuming that 1 minute of vigorous physical activity is equivalent to 2 minutes of moderate physical activity [2]. Physical activity was categorized using WHO recommendations of 150-300 minutes of moderate-intensity physical activity per week [3].
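The weekly total can be illustrated with a short sketch. The 2:1 weighting of vigorous minutes comes from the text; the function names and the three category labels are assumptions, since the source states only that the WHO range of 150-300 minutes was used and does not give the exact cut points or labels.

```python
def weekly_moderate_equivalent(vig_sessions: float, vig_minutes: float,
                               mod_sessions: float, mod_minutes: float) -> float:
    """Weekly minutes of activity in moderate-intensity equivalents.

    One vigorous minute counts as two moderate minutes.
    """
    return 2 * vig_sessions * vig_minutes + mod_sessions * mod_minutes


def activity_category(minutes_per_week: float) -> str:
    """Compare weekly minutes against the WHO range (labels are hypothetical)."""
    if minutes_per_week < 150:
        return "below recommendation"
    if minutes_per_week <= 300:
        return "within recommendation"
    return "above recommendation"
```

For instance, two 30-minute vigorous sessions plus three 20-minute light-moderate sessions per week give 2 x 60 + 60 = 180 moderate-equivalent minutes.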

Statistical Analyses
To evaluate the interaction (joint effects) of education and lifestyle risk factors on mortality (Objective 1), Aalen's additive hazard models were used to directly estimate additive interaction [4,5]. Additive interaction was the focus given that it is of greater importance for public health [4]. An additional benefit is that the coefficients obtained from Aalen models are collapsible, unlike those obtained from Cox models [6]; that is, one can meaningfully compare parameter estimates from Aalen models with different covariates, but should not do so for Cox models, since the addition of a covariate could shift the baseline hazard rather than simply altering the slope of the hazard function. The hazard of all-cause mortality for person $i$ at age $t$ was modeled as a linear function of the exposure ($E$; education), the lifestyle risk factor ($M$), their interaction ($E \times M$), covariates ($C$), and an unspecified baseline hazard ($\lambda_0$):

$$\lambda_i(t) = \lambda_0(t) + \alpha_1 E_i + \alpha_2 M_i + \alpha_3 (E_i \times M_i) + \alpha_4^{\top} C_i$$

With regard to interpretation, $\alpha_3$ directly estimates the number of additional events per person-year at risk due to the additive interaction of education and the lifestyle risk factor. This semiparametric model is flexible and can incorporate time-varying covariate effects (i.e., $\beta(t)$, where the effect of the covariate is not constant over time). Each lifestyle risk factor was evaluated one at a time, and models were adjusted for age (used as the time scale) and the following categorical variables: race/ethnicity, marital status, and survey year. Separate models were estimated for men and women given that sex has been suggested to be an effect modifier of socioeconomic inequalities in all-cause mortality [7]. The graphical techniques and tests described by Scheike and Martinussen [5] suggested that race/ethnicity, smoking, and physical activity should be modeled as age-varying effects; this is equivalent to violating the proportional hazards assumption in Cox models.
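To make the interpretation of the interaction coefficient concrete: when education and the lifestyle factor are both coded as binary, the additive interaction equals the difference between two hazard differences. The sketch below uses made-up hazards (deaths per 10,000 person-years) for the four exposure groups; the numbers and the function name are hypothetical and are not results from this study.

```python
def additive_interaction(h00: float, h10: float, h01: float, h11: float) -> dict:
    """Decompose four group-specific hazards on the additive scale.

    h00: high education, no risk factor (reference group)
    h10: low education, no risk factor
    h01: high education, risk factor present
    h11: low education, risk factor present
    """
    alpha1 = h10 - h00  # excess hazard from low education alone
    alpha2 = h01 - h00  # excess hazard from the risk factor alone
    # Interaction: excess hazard in the doubly exposed group beyond the
    # sum of the two separate effects.
    alpha3 = (h11 - h00) - alpha1 - alpha2
    return {"alpha1": alpha1, "alpha2": alpha2, "alpha3": alpha3}


# Hypothetical hazards per 10,000 person-years.
result = additive_interaction(h00=50, h10=80, h01=90, h11=150)
```

With these illustrative values the separate effects are 30 and 40 additional deaths per 10,000 person-years, and the doubly exposed group has 30 more deaths than the two separate effects would predict; a positive value of this kind is what differential vulnerability would look like on the additive scale.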
Because Aalen models can accommodate such effects, the effects of race/ethnicity were included in the model as age-varying. Sensitivity analyses by age subgroups (where the age-invariant assumption was met) were used to examine the impact of modeling smoking and physical activity as age-invariant.
The marginal structural approach described by Lange et al. [8-10] was used to evaluate the extent to which lifestyle risk factors mediated the relationship between education and mortality (Objective 2). Briefly, this flexible approach uses a counterfactual framework and allows for the direct parameterization of natural direct and indirect effects, multiple mediators, and exposure-mediator interactions. The total effect of education on mortality was decomposed into three components: the average pure direct effect, the average pure indirect effect through each mediator (indicating differential exposure), and the average effect of the mediated interaction between education and each mediator (indicating differential vulnerability). The proportion of the total effect mediated by each lifestyle risk factor was also calculated. We fit an additive hazard model including all lifestyle risk factors (alcohol use, smoking, BMI, physical activity) and covariates (age [used as the time scale], race/ethnicity, marital status, and survey year), and fit separate models for men and women. Robust standard errors were not used given the size of the sample and computational limitations, even though the analyses were conducted on a specialized computing cluster [5]. All analyses were completed in R 3.6.3, using the timereg package (version 1.9.8) [5]. The timereg package does not accommodate complex sampling designs, so survey weights were not used.
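The three-part decomposition and the proportion mediated can be sketched as simple arithmetic on the additive scale. The effect sizes below are invented for illustration and are not estimates from this study; the function name and dictionary layout are likewise assumptions. The sketch assumes the components sum to the total effect, as the decomposition described above implies.

```python
def proportions_mediated(direct: float, indirect: dict, interaction: dict) -> dict:
    """Express each component as a percentage of the total effect.

    direct:      pure direct effect of education
    indirect:    mediator -> pure indirect effect (differential exposure)
    interaction: mediator -> mediated interaction (differential vulnerability)
    """
    total = direct + sum(indirect.values()) + sum(interaction.values())
    out = {"total effect": total, "direct": 100 * direct / total}
    for name, effect in indirect.items():
        out[f"{name}: differential exposure"] = 100 * effect / total
    for name, effect in interaction.items():
        out[f"{name}: differential vulnerability"] = 100 * effect / total
    return out


# Hypothetical effects (additional deaths per 10,000 person-years).
pm = proportions_mediated(
    direct=20.0,
    indirect={"smoking": 30.0, "physical activity": 15.0},
    interaction={"smoking": 5.0, "physical activity": 10.0},
)
```

In this made-up example the total effect is 80, of which smoking accounts for 37.5% through differential exposure and 6.25% through differential vulnerability.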

Statistical Code
The statistical code for this manuscript is publicly available at https://github.com/kpuka/SIMAH_clean/tree/main/Puka_2022_SES_x_Lifestyles

Sensitivity Analyses
As sensitivity analyses, the analyses described above were repeated with small modifications.
First, alcohol use was indexed using heavy episodic drinking (HED) based on the number of heavy drinking days (≥5 drinks a day) in the past 12 months (eTables 2 to 4). Second, analyses were stratified by age group, to evaluate the impact of modeling smoking and physical activity as age-invariant.

eTable 1. Characteristics at Baseline among participants with complete and missing data.

Table fragment (table number and preceding rows not recovered):
Component | Effect (95% CI) | Proportion mediated, % (95% CI) a
[row label and effect not recovered] | | 2 (0, 4)
BMI: differential vulnerability | -0.5 (-2.0, 0.9) | -1 (-3, 1)
Physical activity: differential exposure | 14.7 (13.5, 15.9) | 22 (20, 24)
Physical activity: differential vulnerability | 4.4 (2.9, 5.9) | 7 (4, 9)

The model adjusted for age (as timescale), race/ethnicity, marital status, and survey year; for simplicity, only the effect of low education (relative to high education) is presented. CI: confidence interval.
a Proportion mediated is the ratio of the effect to the total effect, x 100.