Association of Healthful Plant-based Diet Adherence With Risk of Mortality and Major Chronic Diseases Among Adults in the UK

Key Points
Question: Is adherence to a healthful plant-based diet associated with a lower risk of mortality and chronic disease among UK adults?
Findings: In this cohort study with 126 394 UK Biobank participants, greater adherence to a healthful plant-based diet was associated with a lower risk of mortality, cancer, and particularly cardiovascular disease. Opposing associations with higher risk were observed for individuals who adhered to an unhealthy plant-based diet.
Meaning: The findings of this study suggest that a healthful plant-based diet that is low in animal foods, sugary drinks, snacks and desserts, refined grains, potatoes, and fruit juices was associated with a lower risk of mortality and major chronic diseases among adults in the UK.

Between 2009 and 2012, the Oxford WebQ dietary questionnaire was issued on five separate occasions, and at least once to each of 210,965 UK Biobank study participants 4. For this study, participants who completed at least 2 diet recalls were considered eligible. The Oxford WebQ asked about the frequency of consumption of up to 206 types of food and 32 drinks over the past 24 hours 5. The Oxford WebQ has previously been validated as an approximation of true dietary intake 6,7.

Sex-specific factors
Data on female reproductive factors were collected at baseline via touchscreen questionnaire. Women were asked about age at menarche, parity, age at menopause, menopausal status, use of exogenous hormones and pregnancy history.

Age at menarche
To establish age at menarche, female participants were asked "How old were you when your periods started?". Following this, participants were provided with the option to answer with their age at menarche, or "do not know" or "prefer not to answer". Women were then categorised into age groups: ≤12 years, 13 years old, or ≥14 years old. Those who answered with "do not know" or "prefer not to answer" were categorised as 'unknown'.
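The grouping described above can be sketched in a few lines of Python. This is a minimal illustration rather than the study's own code: the function name, the category labels, and the exact wording of the non-numeric responses are assumptions, and ages are assumed to be reported in whole years.

```python
from typing import Union

# Assumed spellings of the two non-numeric touchscreen responses.
NON_RESPONSES = {"do not know", "prefer not to answer"}


def categorise_age_at_menarche(response: Union[int, float, str, None]) -> str:
    """Map a raw response to '<=12', '13', '>=14' or 'unknown'."""
    if response is None:
        return "unknown"
    if isinstance(response, str):
        text = response.strip().lower()
        if text == "" or text in NON_RESPONSES:
            return "unknown"
        try:
            response = float(text)
        except ValueError:
            # Anything that is not a number is treated as unknown.
            return "unknown"
    if response <= 12:
        return "<=12"
    if response < 14:
        return "13"
    return ">=14"
```

Keeping 'unknown' as its own category, instead of dropping those women, mirrors the missing-indicator approach that the adjustment sets in this supplement describe for other covariates.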

Parity and age at first live birth
To categorise women by parity, they were asked, "How many children have you given birth to?". Those who reported one or more children were then asked, "How old were you when you had your first child?". From these responses, women were categorised by number of children and by age at first live birth, or as "do not remember" or "prefer not to answer". For this analysis, age at first live birth was grouped as <25, 25-29.9 or ≥30 years. Those who responded "do not remember" or "prefer not to answer" were coded as 'unknown'.

eMethods 3. Outcome Ascertainment
Outcome -Mortality
All-cause, CVD and cancer mortality were defined by date and underlying cause of death using the International Classification of Diseases, 10th Revision (ICD-10). The ICD-10 codes used to define CVD and cancer mortality were: fatal CVD (I00-I25, I27-I88, I95-I99) and fatal cancer (C00-C97, excluding non-melanoma skin cancer: C44). Censoring dates for death data were provided by the NHS Information Centre for participants from England and Wales (until September 2021) and by the NHS Central Register Scotland for participants from Scotland (until October 2021) 8. For mortality analyses, follow-up was censored at the death registry censoring date or the date of death, whichever occurred first.
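As an illustration of how the ICD-10 ranges listed above could be applied to an underlying cause of death, the following Python sketch classifies a single code as fatal CVD, fatal cancer, or other. The function and constant names are hypothetical, and the sketch assumes codes are supplied as a letter followed by at least two digits (for example `I21.9` or `C509`); it is not the study's own implementation.

```python
import re
from typing import Optional, Tuple

# Inclusive (letter, first category, last category) ranges taken from the text.
FATAL_CVD_RANGES = [("I", 0, 25), ("I", 27, 88), ("I", 95, 99)]
FATAL_CANCER_RANGES = [("C", 0, 97)]
EXCLUDED_CANCER = {("C", 44)}  # non-melanoma skin cancer

_ICD10_PREFIX = re.compile(r"^([A-Z])(\d{2})")


def icd10_category(code: str) -> Optional[Tuple[str, int]]:
    """Return the three-character category as (letter, number), or None."""
    match = _ICD10_PREFIX.match(code.strip().upper())
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def _in_ranges(category: Tuple[str, int], ranges) -> bool:
    letter, number = category
    return any(letter == l and lo <= number <= hi for l, lo, hi in ranges)


def classify_underlying_cause(code: str) -> str:
    """Classify an underlying cause of death as 'cvd', 'cancer' or 'other'."""
    category = icd10_category(code)
    if category is None:
        return "other"
    if _in_ranges(category, FATAL_CVD_RANGES):
        return "cvd"
    if _in_ranges(category, FATAL_CANCER_RANGES) and category not in EXCLUDED_CANCER:
        return "cancer"
    return "other"
```

Matching on the three-character category means that subcodes such as `C44.1` are excluded along with `C44` itself, and that `I26` and `I89` fall outside the fatal CVD definition, as the ranges in the text imply.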

Outcome -CVD
Incident CVD was defined as a primary myocardial infarction (MI) or stroke event using the UK Biobank linked Hospital Inpatient and Death Registry data. For incident CVD analysis, follow-up time was censored at the date of hospitalisation, death, or end of follow-up, whichever occurred first. Hospital admission data was available until September 2021 from the Hospital Episode Statistics (HES) for England, until the end of July 2021 from the Scottish Morbidity Records (SMR), and until March 2016 from the Patient Episode Database for Wales (PEDW). Using ICD-10, any CVD was defined as ischaemic heart disease (IHD) (I20), MI (I21-I23, I24.1, I25.2), or stroke (I60, I61, I63, or I64). Stroke subtypes were defined as ischaemic (I63) and haemorrhagic (I61).
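The "whichever occurred first" censoring rule can be sketched as follows. This is a hedged illustration only: the function name is hypothetical, and because the text gives the end of hospital follow-up by month, the last day of each month is used here as an assumed placeholder date.

```python
from datetime import date
from typing import Optional, Tuple

# Assumed end-of-follow-up dates (last day of the month stated in the text).
HOSPITAL_CENSOR_DATES = {
    "England": date(2021, 9, 30),
    "Scotland": date(2021, 7, 31),
    "Wales": date(2016, 3, 31),
}


def cvd_follow_up(
    nation: str,
    event_date: Optional[date],
    death_date: Optional[date],
) -> Tuple[date, bool]:
    """Return (exit date, event indicator) for one participant.

    Follow-up ends at the first of: CVD event, death, or the
    nation-specific end of hospital follow-up.
    """
    candidates = [HOSPITAL_CENSOR_DATES[nation]]
    if event_date is not None:
        candidates.append(event_date)
    if death_date is not None:
        candidates.append(death_date)
    exit_date = min(candidates)
    # An event counts only if it is the reason follow-up ended.
    had_event = event_date is not None and event_date == exit_date
    return exit_date, had_event
```

For example, a participant in Wales with an MI recorded in 2017 would be censored at the assumed Welsh end date in March 2016 with no event, whereas the same MI for a participant in England would count as an incident event.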

Outcome -Cancer
Incident cancer was defined as a primary cancer diagnosis. Cancer diagnosis data was provided through record linkage to National Cancer Registries in England and Wales (follow-up data available from the National Health Service (NHS) Information Centre until February 2020) and Scotland (follow-up data available from the NHS Central Register Scotland until January 2021). Participants contributed follow-up time in person-years from the date of recruitment until the date of first cancer registration, death, or end of follow-up, whichever occurred first. Where Cancer Registry censoring dates did not extend past Hospital Inpatient dates (England and Scotland), participants were further followed using HES and SMR data. PEDW data did not extend past the Cancer Registry data for Wales, so only Cancer Registry dates were used for censoring the end of follow-up. Using ICD-10, cancers were defined as any cancer (C00-C97, excluding non-melanoma skin cancer: C44), breast cancer (C50), colorectal cancer (C18-C20), and prostate cancer (C61). Breast cancer analyses were coded and restricted to postmenopausal breast cancer.

Outcome -Fracture
Fractures of the face, skull, hands, and feet were excluded from this analysis due to their typically traumatic nature, whilst other traumatic fractures could not be excluded due to a lack of ICD-10 information on the cause of trauma 10. Hospital admission data was available until September 2021 from the HES for England, until the end of July 2021 from the SMR, and until March 2016 from the PEDW. For incident fracture analysis, follow-up time was censored at the date of hospitalisation or first reported occurrence, death, or end of follow-up, whichever occurred first.

eMethods 4. Multivariable Adjustment
Minimally adjusted models were stratified by region (London, North-West England, North-East England, Yorkshire, West Midlands, East Midlands, South-East England, South-West England, Scotland and Wales) and were adjusted for education (Low: CSEs or equivalent, O levels/GCSEs or equivalent; Medium: A levels/AS levels or equivalent, NVQ or HND or HNC or equivalent; High: College or University degree, other professional qualifications, e.g. nursing, teaching; or Missing/Prefer not to say/Unknown) and sex (male or female).
Cox regression analyses on total cancer risk were further adjusted for female-specific factors: menopausal hormone therapy (MHT) use (no, yes, or unknown/missing among women; or male) and menopausal status at recruitment (premenopausal, postmenopausal, or unknown/missing among women; or male). Analyses on postmenopausal breast cancer risk were further adjusted for age at menarche (≤12, 13, ≥14 years old, or unknown/missing), ever use of oral contraception (no, yes, or unknown/missing), age at first live birth (<25, 25-29.9, ≥30 years of age, or unknown/missing) and polygenic risk score (PRS) for breast cancer (tertiles from low to high PRS for breast cancer or missing), as provided by the UK Biobank 3. Analyses on prostate cancer were further adjusted for prostate cancer PRS (tertiles from low to high PRS for prostate cancer or missing), but not for female-specific factors 3. For total CVD, MI and stroke analyses, multivariable models were further adjusted for a CVD PRS, ischaemic stroke (IS) PRS, or coronary artery disease (CAD) PRS (tertiles from low to high PRS for CVD/IS/CAD or missing) 3. Finally, multivariable Cox regression analyses on fracture were further adjusted for an osteoporosis (OP) PRS (tertiles from low to high PRS for OP, or missing) 3.

Hazard Ratios with 95% Confidence Intervals (CI), adjusted for sex (excluding breast and prostate cancer analyses), BMI, ethnicity, physical activity, smoking status, alcohol intake, education, energy intake, polypharmacy index, multimorbidity index and aspirin use; stratified by region. For any cancer analyses, models were further adjusted for menopausal status and use of MHT. For breast cancer analyses, models were restricted to postmenopausal breast cancer cases and were further adjusted for use of MHT, use of oral contraception, PRS (BC), age at menarche and age at first live birth. For colorectal cancer analyses, models were further adjusted for menopausal status, PRS (CRC) and MHT. For prostate cancer analyses, models were further adjusted for PRS (PC).

Hazard Ratios with 95% Confidence Intervals (CI), adjusted for sex, BMI, ethnicity, physical activity, smoking status, alcohol intake, education, energy intake, polypharmacy index, multimorbidity index and aspirin use; stratified by region. For any CVD and haemorrhagic stroke, models were further adjusted for PRS (CVD). For ischaemic stroke analyses, models were further adjusted for PRS (IS). For MI, models were further adjusted for PRS (CAD).

Analyses used age as the underlying time variable and were adjusted for sex (excluding subgroup analysis), BMI (excluding subgroup analysis), ethnicity, physical activity, smoking status (excluding subgroup analysis), alcohol intake, education (excluding subgroup analysis), energy intake, polypharmacy index, multimorbidity index and aspirin use; stratified by region. For all-cause mortality analyses, models were further adjusted for prevalent CVD and prevalent cancer. For CVD mortality analyses, models were further adjusted for prevalent CVD. For cancer mortality analyses, models were further adjusted for prevalent cancer. Heterogeneity was tested by comparing a model without an interaction term between the subgroup of interest and hPDI (categorical) with a model that included the interaction term. The likelihood ratio test was used to produce p-interaction values. Abbreviations: CVD, cardiovascular disease; BMI, Body Mass Index; HR, hazard ratios; CI, confidence intervals.

eFigure 6. Hazard Ratios (95% CIs) of Cancer Among UK Biobank Subgroups, With Healthful Plant-based Diet Score Modeled as a Continuous Trend
Analyses used age as the underlying time variable and were adjusted for sex (excluding breast and prostate cancer analyses and in subgroup analysis), BMI (excluding subgroup analysis), ethnicity, physical activity, smoking status (excluding subgroup analysis), alcohol intake, education (excluding subgroup analysis), energy intake, polypharmacy index, multimorbidity index and aspirin use; stratified by region. For any cancer analyses, models were further adjusted for menopausal status and use of MHT. For breast cancer analyses, models were restricted to postmenopausal breast cancer cases and were further adjusted for use of MHT, use of oral contraception, PRS (BC), age at menarche and age at first live birth. For colorectal cancer analyses, models were further adjusted for menopausal status, PRS (CRC) and MHT. For prostate cancer analyses, models were further adjusted for PRS (PC). Heterogeneity was tested by comparing a model without an interaction term between the subgroup of interest and hPDI (categorical) with a model that included the interaction term. The likelihood ratio test was used to produce p-interaction values.

Analyses used age as the underlying time variable and were adjusted for sex (excluding subgroup analysis), BMI (excluding subgroup analysis), ethnicity, physical activity, smoking status (excluding subgroup analysis), alcohol intake, education (excluding subgroup analysis), energy intake, polypharmacy index, multimorbidity index and aspirin use; stratified by region. For any CVD and haemorrhagic stroke, models were further adjusted for PRS (CVD). For ischaemic stroke analyses, models were further adjusted for PRS (IS). For MI, models were further adjusted for PRS (CAD). Heterogeneity was tested by comparing a model without an interaction term between the subgroup of interest and hPDI (categorical) with a model that included the interaction term. The likelihood ratio test was used to produce p-interaction values. Abbreviations: BMI, Body Mass Index; CVD, cardiovascular disease; PRS, polygenic risk score; IS, ischaemic stroke; MI, myocardial infarction; CAD, coronary artery disease; HR, hazard ratios; CI, confidence intervals.

eFigure 10. Hazard Ratios (95% CIs) of Fracture Across Strata of Genetic Osteoporosis Risk, With Healthful Plant-based Diet Score Modeled as a Continuous Trend
PRS were obtained from the UK Biobank Showcase for OP. Analyses used age as the underlying time variable and were adjusted for sex, BMI, ethnicity, physical activity, smoking status, alcohol intake, education, energy intake, vitamin/mineral supplement use, polypharmacy index, multimorbidity index and aspirin use; stratified by region. Heterogeneity was tested by comparing a model without an interaction term between the subgroup of interest and hPDI (categorical) with a model that included the interaction term. The likelihood ratio test was used to produce p-interaction values. Abbreviations: PRS, polygenic risk score; OP, osteoporosis; BMI, Body Mass Index; HR, hazard ratios; CI, confidence intervals.

eFigure 11. Sensitivity Analyses Showing Hazard Ratios (95% CIs) Across Sex-Specific Healthful vs Unhealthful Plant-based Diet Index Quartiles, Removing the First 2 Years of Follow-up for Participants Who Completed 2 or More Dietary Assessments and the Associated Risk of All-Cause, Cancer, and Cardiovascular Disease Mortality
Analyses used age as the underlying time variable and were adjusted for sex, BMI, ethnicity, physical activity, smoking status, alcohol intake, education, energy intake, polypharmacy index, multimorbidity index and aspirin use; stratified by region. For all-cause mortality analyses, models were further adjusted for prevalent CVD and prevalent cancer. For CVD mortality analyses, models were further adjusted for prevalent CVD. For cancer mortality analyses, models were further adjusted for prevalent cancer. Abbreviations: hPDI, healthful plant-based diet index; uPDI, unhealthful plant-based diet index; CVD, cardiovascular disease; BMI, Body Mass Index; HR, hazard ratios; CI, confidence intervals.

eFigure 15. Multivariable-Adjusted Hazard Ratios (95% CIs) of All-Cause Mortality (n = 126 217), Cancer (n = 117 569), Cardiovascular Disease (n = 123 134), and Fracture (n = 112 208) Among Ethnic Subgroups Across Sex-Specific Healthful vs Unhealthful Plant-based Diet Index Quartiles
All models used age as the underlying time variable and were adjusted for sex, BMI, physical activity, smoking status, alcohol intake, education, energy intake, polypharmacy index, multimorbidity index and aspirin use; stratified by region. For all-cause mortality analyses, models were further adjusted for prevalent CVD and prevalent cancer. For any cancer analyses, models were further adjusted for menopausal status and use of MHT. For any CVD analyses, models were further adjusted for PRS (CVD). For any fracture analyses, models were further adjusted for vitamin/mineral supplement use and PRS (OP). P-trend is for linear trend. Heterogeneity was tested by comparing a model without an interaction term between the subgroup of interest and hPDI (categorical) with a model that included the interaction term. The likelihood ratio test was used to produce p-interaction values. Abbreviations: Q, quartile; hPDI, healthful plant-based diet index; uPDI, unhealthful plant-based diet index; CVD, cardiovascular disease; BMI, body mass index; PRS, polygenic risk score; MHT, menopausal hormone therapy; OP, osteoporosis; HR, hazard ratios; CI, confidence intervals.
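
The likelihood ratio test used for the p-interaction values compares the log-likelihoods of the two nested models. The Python sketch below shows the arithmetic under stated assumptions: the function names are hypothetical, the log-likelihoods would come from fitted Cox models (not fitted here), and the degrees of freedom are taken to equal the number of interaction terms added (for example, 3 for hPDI quartiles crossed with a two-level subgroup). The chi-square tail probability is computed with the standard library only.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum_{k=0}^{df/2 - 1} y^k / k!
        total = sum(y ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-y) * total
    # Odd df: erfc(sqrt(y)) + exp(-y) * sum_{k=1}^{(df-1)/2} y^(k - 1/2) / Gamma(k + 1/2)
    total = sum(
        y ** (k - 0.5) / math.gamma(k + 0.5)
        for k in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(y)) + math.exp(-y) * total


def p_interaction(
    loglik_without: float,
    loglik_with: float,
    n_interaction_terms: int,
) -> float:
    """Likelihood ratio test p-value for adding the interaction terms."""
    statistic = 2.0 * (loglik_with - loglik_without)
    # Guard against tiny negative values caused by numerical noise.
    return chi2_sf(max(statistic, 0.0), n_interaction_terms)
```

A small p-interaction value indicates that the model with the interaction term fits better than the model without it, which is how heterogeneity across subgroups is assessed in the footnotes above.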