The points show the value by decile with 95% confidence intervals extending from each point horizontally; the line of best fit from a logistic regression is shown in comparison with the theoretical line for perfect calibration. The data and cut points for this plot are detailed in eTable 6 in the Supplement. SNP18 indicates a panel of 18 single-nucleotide polymorphisms.
eFigure 1. Flow-chart study population
eTable 1. 18 SNPs used in this study
eTable 2. Comparison of risk factors in study with the wider cohort.
eTable 3. Discrimination and calibration of SNP18.
eTable 4. Re-classification table for 10-year risk groups.
eTable 5. 10y risk groups for the Tyrer-Cuzick model alone and in combination with SNP18 and mammographic density.
eTable 6. Results for SNP18 by decile.
van Veen EM, Brentnall AR, Byers H, et al. Use of Single-Nucleotide Polymorphisms and Mammographic Density Plus Classic Risk Factors for Breast Cancer Risk Prediction. JAMA Oncol. 2018;4(4):476–482. doi:10.1001/jamaoncol.2017.4881
Can panels of single-nucleotide polymorphisms (SNPs) be combined with measurement of mammographic density and classic risk factors to improve breast cancer risk assessment?
In a general screening cohort of 9363 women, a panel of 18 SNPs was similarly predictive whether unadjusted or adjusted for both mammographic density and classic risk factors.
SNP risk panels substantially improve the ability of breast cancer risk prediction models to accurately identify women who may benefit most from preventive therapy or additional screening modalities.
Single-nucleotide polymorphisms (SNPs) have demonstrated an association with breast cancer susceptibility, but there is limited evidence on how to incorporate them into current breast cancer risk prediction models.
To determine whether a panel of 18 SNPs (SNP18) may be used to predict breast cancer in combination with classic risk factors and mammographic density.
Design, Setting, and Participants
This cohort study enrolled a subcohort of 9363 women, aged 46 to 73 years, without a previous breast cancer diagnosis from the larger prospective cohort of the PROCAS study (Predicting Risk of Cancer at Screening) specifically to evaluate breast cancer risk-assessment methods. Enrollment took place from October 2009 through June 2015 from multiple population-based screening centers in Greater Manchester, England. Follow-up continued through January 5, 2017.
Genotyping of 18 SNPs, visual-assessment percentage mammographic density, and classic risk assessed by the Tyrer-Cuzick risk model from a self-completed questionnaire at cohort entry.
Main Outcomes and Measures
The predictive ability of SNP18 for breast cancer diagnosis (invasive and ductal carcinoma in situ) was assessed using logistic regression odds ratios per interquartile range of the predicted risk.
A total of 9363 women were enrolled in this study (mean [range] age, 59 [46-73] years). Of these, 466 were found to have breast cancer (271 prevalent; 195 incident). SNP18 was similarly predictive when unadjusted or adjusted for mammographic density and classic factors (odds ratios per interquartile range, respectively, 1.56; 95% CI, 1.38-1.77 and 1.53; 95% CI, 1.35-1.74), with observed risks being very close to expected (adjusted observed-to-expected odds ratio, 0.98; 95% CI, 0.69-1.28). A combined risk assessment indicated 18% of the subcohort to be at 5% or greater 10-year risk, compared with 30% of all cancers, 35% of interval-detected cancers, and 42% of stage 2+ cancers. In contrast, 33% of the subcohort were at less than 2% risk but accounted for only 18%, 17%, and 15% of the total, interval, and stage 2+ breast cancers, respectively.
Conclusions and Relevance
SNP18 added substantial information to risk assessment based on the Tyrer-Cuzick model and mammographic density. A combined risk is likely to aid risk-stratified screening and prevention strategies.
Breast cancer is the most commonly diagnosed cancer among women worldwide. Approximately half of all breast cancers in women with a family history of the disease are explained by a known genetic component.1,2 Pathogenic variants in BRCA1/BRCA2 and single-nucleotide polymorphisms (SNPs) explain a large proportion of the risk in women with a strong family history. SNPs also contribute to the development of nonfamilial breast cancer in women, accounting for about 16% of genetic risk.2 On a population basis, the polygenic risk conferred by these susceptibility SNPs is greater than the risk from single pathogenic variants in a single high- or moderate-risk gene,1 especially for women without any family history of breast cancer.3,4 Dependent on genotyping of susceptibility SNPs (ie, 0 risk alleles, 1 risk allele, or 2 risk alleles), a risk estimate can be derived, which may be used for risk prediction by combining the risk estimates for each SNP in a polygenic risk score.
Breast cancer risk models rely mainly on classic risk factors, including family history, younger age at menarche, older age at first full-term pregnancy, later menopause, age, body mass index (BMI), benign breast disease, and current use of hormone replacement therapy.5,6 High mammographic density is also a well-delineated risk factor for breast cancer, and several studies have found that mammographic density improves the accuracy of risk-prediction models.7,8 Recent studies have considered the value of including SNP data in risk-prediction algorithms, with promising results,9-11 but data are very scarce on including both mammographic density and SNP characteristics in risk-prediction models.12
We have collected data on classic risk factors and mammographic density from 57 902 women participating in the Predicting Risk of Cancer at Screening (PROCAS) study,13 which recruited women attending a national breast cancer screening program. A subcohort of these women volunteered to provide saliva samples that were genotyped for 18 breast cancer susceptibility SNPs (SNP18).14 Herein, we report on the predictive ability of SNP18 alone and when adjusted for classic risk factors (annotated by the Tyrer-Cuzick model5) and mammographic density in both a case-cohort study and in a secondary analysis that only included women in the subcohort without breast cancer when they provided the saliva sample.
A total of 57 902 women aged between 46 and 73 years from the Greater Manchester area were recruited between October 2009 and June 2015. Women were recruited at the time of attendance for mammographic screening in the National Health Service Breast Screening Programme. Breast cancer risk factors were self-reported by the women via completion of a 2-page paper questionnaire. Women were excluded from this study if they had been diagnosed with breast cancer before completing the questionnaire; cancers detected as a result of the screening test were included.
The PROCAS study13 was approved by the North Manchester Research Ethics Committee, and written informed consent was obtained from each participant.
Saliva samples for the present study were collected from 9956 of the PROCAS participants after their initial study mammogram at drop-in days in Greater Manchester, using Oragene saliva lysate tubes (DNA Genotek Inc), and DNA extraction was performed using Gen-Probe extraction.
Women who lived within the smaller defined Withington area (South Manchester) were invited to participate for subsequent risk assessment including a saliva sample. All women with breast cancer diagnosed after completion of the questionnaire were invited to provide saliva samples and participate as case patients. Breast cancer diagnosis, invasive or ductal carcinoma in situ (DCIS), occurred at the entry screen or subsequently but before January 5, 2017, and was ascertained through monthly updates from (National Health Service) Breast Screening Systems. Saliva samples were collected between October 2009 and December 2013, close to but after the time of the woman’s screening visit.
A total of 17 SNPs (eTable 1 in the Supplement) were genotyped by a custom-designed Sequenom MassARRAY iPLEX assay (Agena Bioscience GmbH), as previously described,15 and 1 SNP (rs10931936) was genotyped using a TaqMan SNP Genotyping Assay (Fisher Scientific-UK Ltd). Two duplicate positive and 2 negative controls for quality assurance were genotyped for each 96-sample plate.
A polygenic risk score for SNP18 was computed using published per-allele odds ratios (ORs) obtained from the iCOGS database and allele frequencies, as described earlier.10 Briefly, SNP18 was calculated by multiplying the per-allele OR for each SNP and normalizing the risk by the average risk expected in the population using published minor allele frequencies.10
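The score construction described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes a multiplicative (log-additive) per-allele model and Hardy-Weinberg genotype frequencies, so that the population-average relative risk for one SNP is (1 - p + p·OR)², and the three-SNP panel with its ORs and allele frequencies is hypothetical.

```python
from math import prod


def snp_relative_risk(n_risk_alleles, per_allele_or, risk_allele_freq):
    """Relative risk for one SNP, scaled so the population average is 1.

    Assumes Hardy-Weinberg proportions: the mean of OR**g over genotypes
    g = 0, 1, 2 is (1-p)^2 + 2p(1-p)*OR + p^2*OR^2 = (1 - p + p*OR)^2.
    """
    p, r = risk_allele_freq, per_allele_or
    population_mean = (1 - p + p * r) ** 2
    return r ** n_risk_alleles / population_mean


def polygenic_risk_score(genotypes, panel):
    """Multiply the normalized per-SNP relative risks across the panel."""
    return prod(
        snp_relative_risk(g, per_allele_or, freq)
        for g, (per_allele_or, freq) in zip(genotypes, panel)
    )


# Hypothetical panel: (per-allele OR, risk-allele frequency) for each SNP.
panel = [(1.26, 0.38), (1.10, 0.25), (0.95, 0.30)]

# A woman carrying 2, 1, and 0 copies of the respective alleles.
print(round(polygenic_risk_score([2, 1, 0], panel), 3))
```

Because each factor is normalized, a score of 1.0 corresponds to average population risk, which is what allows the score to be multiplied into an absolute risk estimate later.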
Mammographic density at entry to the cohort was estimated independently by 2 readers using a visual analog scale (VAS), as previously described.8 Briefly, each mammogram was scored on a linear scale ranging from 0 to 100 for the density of the breast. The derived percentage density was adjusted for BMI and age and reported as a density residual (DR) and was also expressed as an OR by calibrating and standardizing it to the wider cohort.8 Women with bilateral cancer on prevalent study screen or with breast implants had no assessable VAS score and were given a pro rata DR of 1.0. The Tyrer-Cuzick 10-year risk (v6) was based on classic risk factors from the questionnaire self-reported at entry.
Baseline characteristics were compared between case patients and controls, and between those included in the case-cohort study and those not. Differences between the DR after adjustment for parity were assessed by a Wald test from a linear model. The predictive ability (discrimination) of SNP18 was assessed using logistic regression with Wald confidence intervals (CIs) and expressed as the OR per interquartile range (IQR) in controls. Calibration of the observed-to-expected (O/E) SNP18 OR was estimated using the log score regression coefficient and 95% Wald CI, so that O/E = 1 would indicate perfect calibration, and further inspected by SNP18 decile, with CIs following the Wilson method for the binomial parameter. Adjusted analyses were used to assess SNP18 beyond mammographic density and the Tyrer-Cuzick model. Subgroups were considered for (1) absence of breast cancer at the time the saliva sample was provided (prospective substudy); (2) estrogen-receptor (ER) status using a Wilcoxon test; and (3) presence of DCIS and/or invasive cancer.
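For readers unfamiliar with the Wilson method mentioned for the decile plot, the following is a minimal sketch of the standard Wilson score interval for a binomial proportion. The function name and the example counts (30 cases among 936 women in one decile) are illustrative and are not taken from the study data.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.959963984540054):
    """Wilson score interval for a binomial proportion (default: 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Hypothetical decile: 30 cases among 936 women.
low, high = wilson_interval(30, 936)
print(f"proportion {30 / 936:.4f}, 95% CI {low:.4f} to {high:.4f}")
```

Unlike the simple Wald interval, the Wilson interval stays within 0 and 1 and behaves sensibly when the observed proportion is small, which is the usual reason for preferring it with per-decile case counts.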
A combined 10-year risk was calculated assuming independence by multiplying the Tyrer-Cuzick 10-year absolute risk by DR and SNP18. It was stratified into 10-year risk groups as follows: less than 2%, 2.00% to 3.49%, 3.50% to 4.99%, 5.00% to 7.99%, and 8.00% or higher,8 for which the frequency of cases and percentage of controls were determined. The groups were also examined by cancer stage, by time of diagnosis, and for only those cases of diagnosed breast cancer that occurred after the saliva sample collection. A sensitivity analysis using computer simulation assessed the predicted percentage of the wider PROCAS cohort with VAS measurements (50 588 participants) in each risk category, based on the results from the case-cohort study (eMethods in the Supplement). Area under the receiver operating characteristic curve (AUC) statistics with 95% DeLong CIs were calculated to assess discrimination.
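The combination and stratification step can be illustrated with a short sketch. It takes the description above literally (absolute risk multiplied by two relative-risk factors, under the independence assumption); the function names, group labels, and example values are hypothetical.

```python
from bisect import bisect_right

# Lower bounds of the 2nd to 5th risk groups, in percent 10-year risk.
CUTPOINTS = [2.0, 3.5, 5.0, 8.0]
LABELS = ["<2%", "2.00%-3.49%", "3.50%-4.99%", "5.00%-7.99%", ">=8.00%"]


def combined_risk(tc_10y_risk_pct, density_rr, snp18_rr):
    """Tyrer-Cuzick 10-year risk (%) scaled by the density and SNP18 factors."""
    return tc_10y_risk_pct * density_rr * snp18_rr


def risk_group(risk_pct):
    """Map a 10-year risk (%) to its risk-group label."""
    return LABELS[bisect_right(CUTPOINTS, risk_pct)]


# Hypothetical woman: Tyrer-Cuzick risk 3.0%, density factor 1.3, SNP18 1.4.
risk = combined_risk(3.0, 1.3, 1.4)
print(f"{risk:.2f}% -> {risk_group(risk)}")
```

In this example a woman classified at 2.00% to 3.49% by the Tyrer-Cuzick model alone moves into the 5.00% to 7.99% group once density and SNP18 are applied, which is the kind of reclassification the analysis counts.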
The number of cancers expected was estimated from Tyrer-Cuzick 10-year risks censored at the time of breast cancer diagnosis, death, or January 5, 2017, whichever was earliest. Exact Poisson CIs were given for rates. A 2-sided P < .05 was considered significant.
A total of 57 902 participants were recruited to the PROCAS cohort, of whom 907 were diagnosed with breast cancer before entry. SNP18 was available for 9899 women. After 536 women diagnosed with breast cancer before entering PROCAS were excluded, 9363 women were included in the cohort, of whom 466 were diagnosed with breast cancer (including 89 with DCIS) at the baseline mammogram or during follow-up (eFigure in the Supplement).
The quality of the Sequenom MassARRAY iPLEX and TaqMan assays was assured, as there was 100% concordance of genotyping between duplicate samples for all SNPs.
The baseline characteristics reported in eTable 2 in the Supplement indicate that most women were overweight (BMI, >25) and older than 56 years. Compared with those in the cohort who were not included, controls were older, less overweight, more likely to have had children when older or not at all, had a family history of breast cancer, and had a previous breast biopsy (eTable 2 in the Supplement). Cases included were also slightly older, less overweight, and less likely to have children. Because some selection bias was reflected in questionnaire risk factor differences between women who volunteered to donate saliva and those who did not, we only adjusted for the Tyrer-Cuzick model in the main case-cohort analysis and did not directly assess its predictive ability. There was a significant difference in mammographic density between the controls included and excluded (mean breast density, 25.8% vs 23.9%; P < .001), with a higher average density for those included. However, the difference was mostly explained by BMI and parity (DR adjusted for parity).
A nonsignificant correlation was observed between the SNP18 and DR (Spearman 0.019, P = .07), but a significant small correlation between SNP18 and the 10-year Tyrer-Cuzick risk was seen (Spearman 0.031, P = .003), indicating that these risk factors have very small correlations.
SNP18 polygenic risk score (OR) was higher in case patients (median, 1.12; IQR, 0.87-1.33) than controls (median, 1.01; IQR, 0.77-1.19). It was almost perfectly calibrated across the spectrum of predicted relative risk subgroups (unadjusted O/E OR, 1.03; 95% CI, 0.74-1.32) (Figure), indicating that SNP18 is a very good predictor across the continuum of risk and had an unadjusted interquartile OR of 1.56 (95% CI, 1.38-1.77). Results were very similar after adjustment for the Tyrer-Cuzick model (interquartile OR, 1.54; 95% CI, 1.36-1.75; O/E OR, 1.00; 95% CI, 0.71-1.30) and showed discrimination comparable to that of mammographic density (adjusted interquartile OR, 1.50; 95% CI, 1.33-1.70; see also Brentnall et al8). Furthermore, additional adjustment for mammographic density also did not substantially affect the predictive power of SNP18 (interquartile OR, 1.53; 95% CI, 1.35-1.74; O/E OR, 0.98; 95% CI, 0.69-1.28). Similar results were obtained for the subgroup of 169 prospective cancers (adjusted O/E OR, 1.08; 95% CI, 0.60-1.56; eTable 3 in the Supplement). As expected, there was little difference in the performance of SNP18 when ER-negative cancers were excluded (409 ER-positive cancers and 14 unknown; eTable 3 in the Supplement), and SNP18 was much less predictive for the 43 ER-negative cancers (P = .08 for heterogeneity). There was also very little difference between SNP18 as a predictor of invasive breast cancer or DCIS (eTable 3 in the Supplement).
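The link between the calibration slope and the interquartile OR can be checked with a short calculation. The sketch below assumes that each reported O/E value is the logistic regression coefficient on log(SNP18) and that the OR per IQR equals exp(slope × (log Q3 − log Q1)), using the rounded control quartiles quoted above (0.77 and 1.19). The function name is illustrative, and the authors' exact computation may differ.

```python
from math import exp, log


def or_per_iqr(slope, q1, q3):
    """OR for moving from the 25th to the 75th percentile of a score
    that enters the logistic model on the log scale."""
    return exp(slope * (log(q3) - log(q1)))


# Control quartiles of SNP18 as quoted in the text (rounded to 2 decimals).
Q1, Q3 = 0.77, 1.19

# (calibration slope, published interquartile OR) pairs quoted in the text.
reported = {
    "unadjusted": (1.03, 1.56),
    "adjusted for Tyrer-Cuzick": (1.00, 1.54),
    "adjusted for Tyrer-Cuzick and density": (0.98, 1.53),
}

for label, (slope, published) in reported.items():
    computed = or_per_iqr(slope, Q1, Q3)
    print(f"{label}: computed {computed:.3f}, published {published:.2f}")
```

Under these assumptions the computed values fall within about 0.01 of the published ones; the small gaps are consistent with rounding of the quartiles and slopes.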
When combining the risk from the Tyrer-Cuzick model, mammographic density, and SNP18, assuming independence, we observed that 16% (76/466) of cases and 9.5% (841/8897) of controls moved into the increased-risk category (≥5% 10-year risk) compared with using the Tyrer-Cuzick model alone, but only 5% (22/466) of cases and 4% (353/8897) of controls moved out of this category. Thus, the proportion of cases in this group increased by an absolute 11%, whereas the proportion of controls increased by only 5.5% (eTable 4 in the Supplement).
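The reclassification percentages follow directly from the counts quoted above. The sketch below recomputes them; the helper name is illustrative, and only the counts given in the text are used.

```python
def pct(count, total):
    """Percentage of `total` represented by `count`."""
    return 100 * count / total


CASES, CONTROLS = 466, 8897

cases_in, cases_out = pct(76, CASES), pct(22, CASES)
controls_in, controls_out = pct(841, CONTROLS), pct(353, CONTROLS)

# Net change in the share at >=5% 10-year risk, in percentage points.
net_cases = cases_in - cases_out
net_controls = controls_in - controls_out

print(f"cases:    +{cases_in:.1f} / -{cases_out:.1f} -> net {net_cases:.1f}")
print(f"controls: +{controls_in:.1f} / -{controls_out:.1f} -> net {net_controls:.1f}")
```

Computed from unrounded counts, the net shift for cases is closer to 11.6 percentage points; the 11% in the text appears to come from subtracting the rounded 16% and 5%.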
The ability of a combination of SNP18, mammographic density, and the Tyrer-Cuzick model 10-year risk to improve risk stratification is further illustrated in eTable 5 in the Supplement. As detailed there, individuals in the highest-risk (>8%) group were more than 4 times as likely to develop cancer, as measured by both the prevalent mammogram and prospectively, as the low-risk (<2%) group. Additionally, 14% of the cancers occurred in this group, which made up only 6% of the population (Table). Stage 2 or higher cancers were also more likely to develop in the moderate/high-risk population: 42% of high-stage cancers and 35% of all interval cancers were identified in the 19% of those who were at 5% or higher 10-year risk. The moderate/high-risk group were 5-fold more likely to develop a high-stage cancer than the low-risk group (P < .001). In the NICE-defined16 8% or higher 10-year risk group, 22% of the stage 2 or higher cancers and 16.7% of the interval cancers were identified in just 6% of the population, representing a 5-fold risk for interval and an 8-fold risk for high-stage disease (Table) compared with the low-risk group. A sensitivity analysis (eMethods in the Supplement) to model risk stratification in the wider PROCAS cohort showed a trend similar to that found in the study controls (eTable 5 in the Supplement), with 37% in the low-risk category, but proportionally fewer women were assigned to the higher-risk groups owing to the increased risk of study controls relative to the wider cohort (eTable 2 in the Supplement).
We finally considered a subset of those who were unaffected at the time of DNA sampling, consisting of 9064 women who had a total of 44 419 years of follow-up to diagnosis or censoring on January 5, 2017. In total, 167 breast cancers occurred, including 28 DCIS (16.7%) after a mean 4.9 years follow-up at an annual rate of 3.7 per 1000 women (3.2 for invasive cancer only), and 155 cancers were expected from Tyrer-Cuzick, density, and SNP18 combined (O/E OR, 1.07; 0.91 for invasive cancer). The combined assessment and that of the complete case-cohort study were similarly predictive (eTable 3 in the Supplement). Breast cancer rates in each group were within the predicted range from the combined assessment. The observed risk from Tyrer-Cuzick had an AUC of 0.58 (95% CI, 0.52-0.62); when mammographic density was also incorporated, the AUC was 0.64 (95% CI, 0.60-0.68); and when SNP18 was further incorporated, the AUC was 0.67 (95% CI, 0.62-0.71).
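The summary rates in the prospective substudy can be reproduced from the counts given above. This is a simple arithmetic sketch using only the figures quoted in the text; the variable names are illustrative.

```python
observed = 167          # breast cancers diagnosed after saliva sampling
expected = 155          # expected from Tyrer-Cuzick, density, and SNP18 combined
person_years = 44_419   # total follow-up
women = 9_064           # women unaffected at DNA sampling

mean_follow_up = person_years / women           # years per woman
rate_per_1000 = 1000 * observed / person_years  # cancers per 1000 women per year
observed_to_expected = observed / expected

print(f"mean follow-up: {mean_follow_up:.2f} years")
print(f"annual rate: {rate_per_1000:.2f} per 1000 women")
print(f"observed/expected: {observed_to_expected:.3f}")
```

These agree with the published 4.9 years, 3.7 per 1000, and 1.07 to the precision reported; any difference in the last digit may reflect truncation or rounding of the expected count.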
This study found that SNP18 stratified breast cancer risk beyond classic factors and mammographic density. The results underline the additional value of incorporating polygenic risk scores with mammographic density and classic risk factors to evaluate risk in women participating in a nonselective national screening program. The addition of SNP18 gave a better risk stratification, with more women in both the lower- and higher-risk groups.
Risk-adapted screening practices that might be evaluated include starting screening later in lower-risk women, increasing the screening interval for these women, or potentially not inviting them for screening at all. We found that relatively few interval or higher-stage cancers occurred in the lowest-risk group in our UK cohort undertaking screening every 3 years. In countries with programs of screening every 2 years, there might be scope to save health resources by reducing screening to every 3 years for this low-risk group, constituting 33% of the study control population and possibly up to 37% of the whole PROCAS cohort. In contrast, UK NICE guidelines16 would permit women in the high-risk group to receive annual screening until age 60 years and to be offered preventive therapy. This group (making up 6% of the total at ≥8% 10-year risk) could even be considered for risk-reducing surgery, and a further 11% at moderate risk might consider chemoprevention. The rationale for this is that the data suggest that approximately 1 in 6 women in the population are at moderate/high risk (≥5% 10-year risk) and are likely to develop approximately 35% of the total interval cancers and 42% of the stage 2 or higher cancers in an every-3-year screening program. There was also no excess of cases of DCIS in the moderate/high-risk groups (11% vs 16% for lower risks); indeed all 3 cases of DCIS in the high-risk group were high-grade disease. There is therefore great scope for down-staging of potentially lethal cancers with more frequent screening (perhaps using screening techniques such as tomosynthesis mammography, automated breast ultrasonography, and magnetic resonance imaging, particularly in those where masking from density is an issue). 
Breast cancer risk reduction with tamoxifen, raloxifene, or an aromatase inhibitor may prevent between 30% and 50% of these cancers.16,17 In addition, 44% of the stage 2 or higher breast cancers occurred in the average/intermediate-risk groups; one might consider moving to biannual screening to down-stage these cases.
The participant characteristics indicate that women who were aware of their elevated risk were more likely to agree to further investigation related to their risk. Although there might be a bias toward including women who are at higher risk based on classic risk factors, this is very unlikely to have influenced the SNP18 genotypic signature in controls because it was close to the expected risk (eTable 2 in the Supplement) and only slightly correlated with classic risk factors.
The weak correlation between SNP18 and the other risk factors (DR and Tyrer-Cuzick risk) justifies the use of all 3 measures in combination. Indeed, there was effectively no change in the OR when adjusted for Tyrer-Cuzick. This is consistent with the findings of Vachon et al,12 who also did not find an association between polygenic risk score and mammographic density.
The 18 SNPs used in this study have previously been found to be most strongly associated with ER-positive disease.18-20 Although there was no significant heterogeneity in performance of SNP18 by ER status, this was to be expected owing to lack of power, and the point estimate for the interquartile OR of SNP18 was substantially lower for ER-negative cancers (1.09 vs 1.61).
A limitation of the main case-cohort analysis is that saliva was not provided at recruitment of the wider cohort. However, this appeared to have minimal impact on the results because they were similar in our secondary analysis of cancers diagnosed after sample donation. Also, cancer notifications from the National Health Service would not record breast cancer in emigrants, but this is unlikely to alter the findings because yearly international emigration rates from Greater Manchester in women aged 47 to 73 years are likely to be around 0.06%.21,22
Of note, we used a different density measure (VAS instead of BI-RADS density23) than some previous studies,12,22 and a different model to combine classic risk factors (Tyrer-Cuzick model instead of the Breast Cancer Surveillance Consortium risk prediction model,22 which already incorporates mammographic density). Using a continuous scale measure like VAS may well provide better discrimination than forcing women into 1 of 4 BI-RADS categories. However, perhaps neither scale is ideal owing to interreader and intrareader variation in determining mammographic density.24
Although there are now more than 100 SNPs linked to breast cancer risk, it remains to be determined in prospective cohort studies whether additional SNPs will substantially improve risk prediction beyond classic factors and mammographic density.23,25 SNP18 explains a large proportion of the current known familial component derived from SNPs, and the ORs used are likely more robust than many of the more recently identified SNPs that have very small ORs. Most SNPs identified to date have been more strongly associated with ER-positive disease, as in this study. Further analysis within the breast cancer association consortium case-control studies may help to identify SNPs that are more predictive for other subtypes of breast cancer.
In summary, the increased sensitivity (proportion of cancers in a group identified over a given period) for a given positive predictive value (the risk in the group over that period) is an important aim for risk-stratified screening and prevention strategies. In this study, SNP18 substantially improved the accuracy of risk prediction when combined with Tyrer-Cuzick estimates and mammographic density. Routinely incorporating SNP18 into risk-prediction models will provide women attending routine screening more informative risk estimates that could be used in more personalized prevention and early-detection strategies.
Accepted for Publication: November 5, 2017.
Corresponding Author: D. Gareth R. Evans, MD, Manchester Centre for Genomic Medicine, St Mary’s Hospital, Oxford Road, Manchester, M13 9WL, England (email@example.com).
Published Online: January 18, 2018. doi:10.1001/jamaoncol.2017.4881
Author Contributions: Drs Evans and Brentnall had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Howell, Newman, Cuzick, Evans.
Acquisition, analysis, or interpretation of data: All authors.
Drafting of the manuscript: van Veen, Brentnall, Howell, Cuzick, Evans.
Critical revision of the manuscript for important intellectual content: van Veen, Byers, Harkness, Astley, Sampson, Howell, Newman, Cuzick, Evans.
Statistical analysis: Brentnall, Harkness, Cuzick.
Obtained funding: Howell, Newman, Evans.
Administrative, technical, or material support: van Veen, Byers, Sampson, Evans.
Supervision: Newman, Evans.
Conflict of Interest Disclosures: Drs Cuzick and Brentnall report royalty payments through Cancer Research UK for commercial use of the Tyrer-Cuzick algorithm. No other disclosures are reported.
Funding/Support: This work was supported by Prevent Breast Cancer (GA09-002 and GA11-002) and the National Institute for Health Research (NF-SI-0513-10076 to D.G.R.E.).
Role of the Funder/Sponsor: The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Additional Contributions: Many thanks to Paula Stavrinos, BSc, Jake Southworth, Lynne Fox, Jill Fox, Louise Donnelly, PhD, Sarah Sahin, Donna Watterson, BSc, Faiza Idries, BSc, and Helen Ruane (Manchester University Hospital Foundation Trust); and Sarah Ingham, PhD (University of Manchester), for administrative support and data management. We thank Antonis Antoniou, PhD (University of Cambridge), for providing the allele frequencies and univariate odds ratios associated with each SNP from the iCOGS database. These persons received no compensation for their contributions beyond that received in the normal course of their employment.