Association of Body Mass Index and Age With Subsequent Breast Cancer Risk in Premenopausal Women

Key Points
Question: What is the association between body mass index and risk for breast cancer diagnosed before menopause?
Finding: In this large pooled analysis of data on 758 592 premenopausal women, an inverse association of breast cancer risk with body mass index at 18 through 54 years of age was found, most strongly for body mass index at ages 18 through 24 years. The inverse association was strongest for hormone receptor–positive breast cancer, was evident across the entire distribution of body mass index, and did not materially vary by attained age or other characteristics of women.
Meaning: Increased adiposity, in particular during early adulthood, may be associated with reductions in the risk of premenopausal breast cancer.


The collaboration
Full details of the Premenopausal Breast Cancer Collaborative Group have been published elsewhere. 1 Individual-level data were pooled from 19 prospective cohorts with at least 100 breast cancer cases diagnosed before age 55 years, with the collaboration facilitated by the National Cancer Institute Cohort Consortium. Data were harmonised to a common template for 1-16 questionnaire rounds per study; all studies had at least two rounds except for the European Prospective Investigation into Cancer and Nutrition (EPIC) study, the Canadian Study of Diet, Lifestyle and Health (CSDLH), and HUNT2, for which only the baseline questionnaire was available. One study (CSDLH) provided data for a case-cohort subset; all the others provided data for the full cohort. Seventeen studies provided information on incident invasive and in-situ breast cancer and two (HUNT2 and CSDLH) on invasive breast cancer only. An analysis dataset for the endpoint of premenopausal breast cancer was then constructed from the pooled data.

Derivation of age at menopause and premenopausal follow-up time
All cohorts collected information on menopausal status of participants at one or more questionnaire rounds. Participants were asked whether they had had any menstrual periods during the previous 6 or 12 months, depending on study, and/or whether they believed their periods had stopped permanently. Participants were also asked about the age at their last period and the reason their periods stopped. We used this information to construct premenopausal follow-up time for analysis. Age at menopause was computed for each participant based on (i) reported age at menopause or, if age was missing, (ii) age first known postmenopausal if under age 50, (iii) age last known premenopausal if over age 50 or (iv) age 50 if no information was provided. When a hysterectomy was reported as the reason for the menopause, follow-up was censored at the reported age of the procedure. Since women with breast cancer often become postmenopausal due to their treatment, and breast cancers diagnosed in the year of menopause could be considered aetiologically premenopausal, we lagged reported menopausal ages (subjects under i) for all women by +1 year, i.e. the year during which they reported that they had become postmenopausal was analysed as premenopausal follow-up time. As a sensitivity analysis, we repeated the main analyses including only known premenopausal follow-up time up to the age at reported menopause (subjects under i) or, if age at menopause was missing, the age at the last questionnaire when the participant reported she was premenopausal (ii and iii).
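The derivation rules (i) to (iv) and the one-year lag can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function and argument names are hypothetical, ages are assumed to be in years, and hysterectomy censoring is left out. The text does not say what happens when a woman is first known postmenopausal at 50 or older and last known premenopausal at 50 or younger; the sketch assumes she falls back to age 50.

```python
from typing import Optional

ASSUMED_MENOPAUSE_AGE = 50.0  # rule (iv): default when nothing else applies
REPORTED_AGE_LAG = 1.0        # +1 year lag applied to reported ages, rule (i)


def derive_menopause_age(
    reported_age: Optional[float] = None,
    first_known_post: Optional[float] = None,
    last_known_pre: Optional[float] = None,
    lag_reported: bool = True,
) -> float:
    """Return the age at which premenopausal follow-up ends (hypothetical helper)."""
    # (i) reported age at menopause, lagged by one year in the main analysis
    if reported_age is not None:
        return reported_age + (REPORTED_AGE_LAG if lag_reported else 0.0)
    # (ii) age first known postmenopausal, used only if under age 50
    if first_known_post is not None and first_known_post < ASSUMED_MENOPAUSE_AGE:
        return first_known_post
    # (iii) age last known premenopausal, used only if over age 50
    if last_known_pre is not None and last_known_pre > ASSUMED_MENOPAUSE_AGE:
        return last_known_pre
    # (iv) no usable information (assumption: also covers uninformative ages)
    return ASSUMED_MENOPAUSE_AGE
```

Passing `lag_reported=False` gives the unlagged reported age, which corresponds to the end of known premenopausal time for subjects under (i) in the sensitivity analysis.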

Computation of BMI at various ages
We used data on current weight at the time of questionnaire completion and on recalled weight at ages before questionnaire completion to construct variables for weight within the age ranges 18-24, 25-34, 35-44 and 45-54 years. None of the studies had information to calculate BMI at ages younger than 18 years. Most weights at ages 18-24 years were retrospectively reported (most often for ages 18-21) but a minority were concurrently reported by subjects who were recruited at ages 18-24 years. At ages 25-34, 35-44 and 45-54, the majority of weights were concurrently reported. When weights were assessed on multiple occasions within an age category, we used the earliest concurrent weight or otherwise the retrospectively reported weight relative to the youngest age within the age group. We recoded the following extreme values to missing based on visual inspection of histograms and percentile distributions: height (<130 or >195 cm), weight (<30 or >200 kg), BMI (<15 or >49 kg/m²) and weights that arose from BMI values outside this range. In pooled analyses, BMI was categorised according to World Health Organization definitions 2 as severe/moderate thinness (<17 kg/m²), mild thinness (17-18.5 kg/m²), normal range (18.5-22.9 and 23.0-24.9 kg/m²), overweight (25-27.4 and 27.5-29.9 kg/m²), obese Class I (30-32.4 and 32.5-34.9 kg/m²) and obese Class II/III (≥35 kg/m²). Where numbers in the extreme categories were small, or to obtain study-specific, stratum-specific or tumour type-specific estimates, we combined categories to obtain stable estimates. For each age-specific BMI investigated, studies with fewer than 10 cases among subjects with known BMI were excluded from the model to improve convergence.
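The extreme-value recoding and the BMI categories described above can be sketched in Python as below. The function names and category labels are hypothetical. Treating each category as a half-open interval (lower bound included, upper bound excluded) is an assumption, since the text gives bounds to one decimal place only.

```python
from bisect import bisect_right
from typing import Optional

# Lower bounds of categories 2..9 in kg/m^2 (assumed half-open intervals)
BMI_CUTS = [17.0, 18.5, 23.0, 25.0, 27.5, 30.0, 32.5, 35.0]
BMI_LABELS = [
    "severe/moderate thinness (<17)",
    "mild thinness (17-18.5)",
    "normal range (18.5-22.9)",
    "normal range (23.0-24.9)",
    "overweight (25-27.4)",
    "overweight (27.5-29.9)",
    "obese Class I (30-32.4)",
    "obese Class I (32.5-34.9)",
    "obese Class II/III (>=35)",
]


def compute_bmi(weight_kg: Optional[float], height_cm: Optional[float]) -> Optional[float]:
    """Return BMI in kg/m^2, or None when an input or the result is extreme."""
    if weight_kg is None or height_cm is None:
        return None
    if not 130 <= height_cm <= 195:   # height <130 or >195 cm set to missing
        return None
    if not 30 <= weight_kg <= 200:    # weight <30 or >200 kg set to missing
        return None
    bmi = weight_kg / (height_cm / 100.0) ** 2
    if not 15 <= bmi <= 49:           # BMI <15 or >49 kg/m^2 set to missing
        return None
    return bmi


def bmi_category(bmi: Optional[float]) -> Optional[str]:
    """Map a BMI value to one of the nine categories listed in the text."""
    if bmi is None:
        return None
    return BMI_LABELS[bisect_right(BMI_CUTS, bmi)]
```

For example, `bmi_category(compute_bmi(60, 165))` falls in the 18.5-22.9 normal-range category, while `compute_bmi(60, 120)` returns `None` because the height is outside the accepted range.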

Clinicopathological surrogate definition of breast cancer intrinsic subtypes
Immunohistochemistry data on estrogen receptor (ER) and progesterone receptor (PR) status, as well as data on human epidermal growth factor receptor-2 (HER2) oncogene expression, were collected from the centres. Given the absence of data on the proliferation marker KI-67, we adapted clinicopathological surrogate definitions of luminal A-like and luminal B-like intrinsic breast cancer subtypes proposed by the St Gallen Expert Consensus. 3 We classified all ER+PR+HER2- breast cancer as luminal A-like; ER+PR-HER2- and ER-PR+HER2- as luminal B-like HER2 negative; [ER+/PR+]HER2+ as luminal B-like HER2 positive; ER+PR- and ER-PR+ with HER2 status unknown as luminal B unclassified; ER+ or PR+ with other markers unknown as unclassified luminal; ER-PR-HER2- as triple-negative; ER-PR-HER2+ as HER2 enriched; and ER- with PR unknown or PR- with ER unknown, regardless of HER2 status, as unclassified.
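The surrogate classification can be written as a small decision function. The Python sketch below is an illustrative reading of the rules, not the authors' implementation. Marker status is assumed to be coded "+", "-" or None (unknown), and the function name is hypothetical. Two points are assumptions: a tumour with one positive receptor and the other receptor unknown is treated as unclassified luminal unless HER2 is positive, and combinations the text does not cover (ER-PR- with HER2 unknown, or ER and PR both unknown) return None.

```python
from typing import Optional


def classify_subtype(er: Optional[str], pr: Optional[str], her2: Optional[str]) -> Optional[str]:
    """Assign a surrogate intrinsic subtype from ER, PR and HER2 status."""
    if er == "+" or pr == "+":
        # [ER+/PR+]HER2+: any hormone receptor positive with HER2 positive
        if her2 == "+":
            return "luminal B-like HER2 positive"
        if er == "+" and pr == "+":
            # ER+PR+HER2-; with HER2 unknown the tumour stays unclassified luminal
            return "luminal A-like" if her2 == "-" else "unclassified luminal"
        if er is None or pr is None:
            # one receptor positive, the other unknown (assumption, see lead-in)
            return "unclassified luminal"
        # discordant ER/PR: ER+PR- or ER-PR+
        return "luminal B-like HER2 negative" if her2 == "-" else "luminal B unclassified"
    if er == "-" and pr == "-":
        if her2 == "-":
            return "triple-negative"
        if her2 == "+":
            return "HER2 enriched"
        return None  # ER-PR- with HER2 unknown: not covered by the text
    if (er == "-" and pr is None) or (pr == "-" and er is None):
        return "unclassified"  # regardless of HER2 status
    return None  # ER and PR both unknown: not covered by the text
```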

Statistical methods
Analyses were conducted using Stata 14.2 software 4. BMI was analysed separately as a categorical and as a continuous variable (per 5 kg/m²), assuming a log-linear dose-response relationship, the validity of which was checked using 5-knot restricted cubic spline models. 5 Hazard ratios (HR) as estimates of relative risk of breast cancer were obtained from Cox proportional hazards models 6 with attained age as the underlying time-scale. Pooled analyses were adjusted for attained age (implicit in the Cox model) and cohort (including country within EPIC). In multivariable-adjusted models we additionally adjusted for year of birth (<1930, 1930-9, 1940-9, 1950-9, 1960-9, 1970-9, ≥1980), age at menarche (7-11, 12-13, ≥14 years, not known), age at first birth (<25, 25-34, ≥35 years, not known or not applicable), time since last birth (<5, 5-9, 10-14, 15-19, 20-24, 25-29, ≥30 years, not known or not applicable), parity (0, 1, 2, ≥3, parous but not known) and family history of breast cancer (yes, no, not known). Hazard ratios for breast cancer with respect to BMI were near-identical in age- and cohort-adjusted models compared with models additionally (fully) adjusted for other breast cancer risk factors. Fully adjusted models are therefore presented in the main paper. In models additionally adjusted for BMI at age 18-24, BMI at this age was coded as <18.5, 18.5-22.9, 23.0-24.9, 25.0-27.4, 27.5-29.9 and ≥30 kg/m². Height was included as a continuous variable in models additionally adjusted for adult height. Covariate information was time-updated, where possible, with information from follow-up questionnaires for all pregnancy-related variables and family history of breast cancer. Subjects with missing covariate values were included in the analyses by fitting a category for the missing value.
To include the case-cohort study (CSDLH) in the pooled dataset, we included Barlow weights 7 for CSDLH corresponding to a sampling fraction of 5.0 percent as an offset in the model. We also applied Barlow weights with a sampling fraction of (effectively) 1.0 to all other cohorts using the stcasecoh command in Stata 14.2 software 4, which did not affect the results for those cohorts but made it easier to obtain results from a single pooled dataset.
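Under the log-linear dose-response assumption used for continuous BMI, a hazard ratio reported per 5 kg/m² can be rescaled to any other BMI difference. The Python sketch below illustrates this arithmetic only. The function name is hypothetical and the value 0.8 in the usage note is an invented example, not a result from this study.

```python
import math


def hr_for_bmi_difference(hr_per_5: float, delta_bmi: float) -> float:
    """Rescale a per-5 kg/m^2 hazard ratio to a BMI difference of delta_bmi.

    Assumes log(HR) is linear in BMI, so HR(delta) = exp(beta * delta)
    with beta = log(hr_per_5) / 5.
    """
    if hr_per_5 <= 0:
        raise ValueError("hazard ratio must be positive")
    beta_per_unit = math.log(hr_per_5) / 5.0  # log-hazard change per 1 kg/m^2
    return math.exp(beta_per_unit * delta_bmi)
```

With a hypothetical hazard ratio of 0.8 per 5 kg/m², a difference of 10 kg/m² corresponds to 0.8 × 0.8 = 0.64, and a difference of -5 kg/m² to 1/0.8 = 1.25.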

Sensitivity analyses
In sensitivity analyses, we repeated the analyses (i) for BMI at ages 25 onwards, adjusting for BMI at age 18-24 years (Figure 1); (ii) for BMI at ages 25 onwards, restricted to individuals for whom BMI at ages 18-24 was also available (eTable 2); (iii) excluding subjects whose weight was recalled or reported less than three years postpartum (eTable 7); (iv) restricting follow-up to person-time that was known to be, rather than assumed to be, premenopausal (eTable 7); (v) excluding the first two years of follow-up (eTable 7); (vi) restricting the endpoint to breast cancer with information on all of ER, PR and HER2 status (eTable 7); (vii) excluding one cohort at a time (eTable 8); (viii) with additional adjustment for adult height (not shown); and (ix) with and without adjustment for polycystic ovary syndrome (PCOS), for centres with data on PCOS (not shown).

Abbreviations: BMI, body mass index; CI, confidence interval; HR, hazard ratio; for cohort abbreviations see page 3. (a) HRs adjusted for attained age, cohort, year of birth, age at menarche, age at first birth, number of births, time since last birth and family history of breast cancer. (b) Linear trend per 5 unit difference fitted across BMI values from 18.5 to 49.9 kg/m². (c) BMI data at age 18-24 years not collected for these cohorts. (d) Insufficient number of cases with BMI at age 18-24 years for these cohorts.

eTable 9: Relative risk of premenopausal breast cancer in relation to BMI category at age 45-54 years, excluding subjects contributing to each successive cohort.

eTable 7: Relative risk of premenopausal breast cancer in relation to BMI category, by age at BMI. For (1) all subjects included in main analysis (2) breast cancer with known ER, PR and HER2 status as endpoint (3) excluding subjects with weight assessed less than three years postpartum (4) strictly known premenopausal time only (5) excluding first two years of follow-up.
Abbreviations: BMI, body mass index; CI, confidence interval; HR, hazard ratio; for cohort abbreviations see page 3.
HRs obtained from 5-knot restricted cubic spline model adjusted for attained age, cohort, year of birth, age at menarche, age at first birth, number of births, time since last birth and family history of breast cancer. Knot locations are based on Harrell's recommended percentiles 5 as specified in Stata 4, corresponding to the 5th, 25th, 50th, 75th and 95th percentiles of the distribution. Solid line represents the hazard ratio relative to the reference value of 20 kg/m²; dashed lines represent the 95% confidence interval of the hazard ratio.