Most clinically apparent cancers represent states of partial immune control or escape. Immune cell–rich triple-negative breast cancers (TNBC) could represent the equilibrium phase, in which a strong immune response may result in pruning of cancer clonal and genomic heterogeneity by eliminating immunogenic sensitive cell clones. Cancers that have escaped immune surveillance have low lymphocytic infiltration and evolve toward greater clonal heterogeneity and genomic diversity. SCNA indicates somatic copy number alteration.
A, A total of 193 triple-negative breast cancer (TNBC) samples with follow-up data from The Cancer Genome Atlas were classified according to a previously described prognostic immune signature based on metagenes for high lymphocyte infiltration (major histocompatibility complex class II gene signature) in combination with low interleukin 8–vascular endothelial growth factor signature expression. Kaplan-Meier analysis of disease-free survival of the good (n = 25) and poor (n = 168) prognosis groups is shown (P values are from the log-rank test). B, Inverse association between T-cell metagene expression and mutant-allele tumor heterogeneity (MATH) score (ie, clonal heterogeneity) in TNBCs (solid line is the locally weighted scatterplot smoothing [LOWESS] fit, Spearman rank correlation P value). TNBCs in the good prognosis group had significantly lower MATH scores (mean rank, 59.7 vs 98.8; Mann-Whitney test P = .001) (eFigure 22 in the Supplement). C, Inverse association between T-cell metagene expression and somatic copy number alteration (SCNA) levels in TNBCs (solid line is the LOWESS fit, Spearman rank correlation P value). SCNA levels were significantly lower in the good prognosis group (mean rank, 45.8 vs 84.2; Mann-Whitney test P < .001) (eFigure 22 in the Supplement). D, Differences in mutational load and predicted neoantigen load in good and poor prognosis TNBC groups; error bars indicate 95% CIs. The y-axis is cropped at 170 mutated genes per sample, which excludes individual hypermutated samples with 300 to 1200 mutations (P values from Mann-Whitney test).
a P = .02 compared with good prognosis.
b P = .04 compared with good prognosis.
A and B, Association between T-cell metagene expression and mutant-allele tumor heterogeneity (MATH) score (ie, clonal heterogeneity) for triple-negative breast cancer (TNBC) in the poor (A) and good (B) prognosis groups (Figure 2A). C and D, Association between T-cell metagene expression and somatic copy number alteration (SCNA) levels for TNBC in the poor (C) and good (D) prognosis groups. Lines indicate 95% CIs. R² and P values are from linear regression.
eMethods. Supplementary Methods
eFigure 1. Strategy of RNA-Seq and Whole-Exome-Seq Analyses for TNBC Classification
eFigure 2. ER, PR, and HER2 expression assessed by RNA-Seq and Agilent arrays
eFigure 3. Dependency of platform correlation on gene expression level
eFigure 4. Correlation between RNA-Seq and Affymetrix for Metagene clusters
eFigure 5. Classification of TNBC (n = 208) based on RNA-Seq data
eFigure 6. Correlation of MHC2 metagene expression and histological quantification of TILs in TCGA samples
eFigure 7. Classification algorithm of the prognostic immune signature
eFigure 8. Validation of improved prognosis of TNBC patients with “Good prognosis” signature in RNA-Seq data
eFigure 9. Mutational count distribution in 186 TNBC
eFigure 10. Relationship between SCNA levels and MATH in TNBC from TCGA
eFigure 11. Validation of Inverse Relationships Between Measures of Genomic Complexity and Immune Cell Infiltration in TNBC Using Different Immune Metagenes
eFigure 12. Prognostic value of histologically quantified TILs in the TCGA TNBC data Set
eFigure 13. Validation of Inverse Relationship of genomic heterogeneity and Immune Cell Infiltration Using Histologically Quantified TILs in the TCGA TNBC Data Set
eFigure 14. Differences in Mutation Count by Immune Cell Infiltration Metagenes and IL8/VEGF metagene expression categories
eFigure 15. Correlation of the Number of Predicted Neoantigens and Mutational Load
eFigure 16. Independence of MATH Score and Total Mutation Counts
eFigure 17. Validation Analyses in METABRIC Data Set
eFigure 18. Differences in Mutation Count, Neoantigen Count, and CYT by Molecular Subtype in Breast Cancer
eFigure 19. Confounding of Molecular Breast Cancer Subtypes on Predicted Neoantigen Count and CYT
eFigure 20. High intercorrelation of TIL metagenes
eFigure 21. Association between clonal heterogeneity and immune metagene expression
eFigure 22. Association of MATH and SCNA with prognostic groups in TNBC
eTable 1. Annotated Cancer Genes Mutated in ≥3 samples
eTable 2. TCGA samples included in the study
eTable 3. “Cancer genes” curated by Vogelstein and colleagues
eTable 4. Individual Genes Constituting TNBC Metagenes and Their Correlation With Affymetrix Microarray
Karn T, Jiang T, Hatzis C, et al. Association Between Genomic Metrics and Immune Infiltration in Triple-Negative Breast Cancer. JAMA Oncol. 2017;3(12):1707–1711. doi:10.1001/jamaoncol.2017.2140
Question
What are the genomic differences between triple-negative breast cancers with high lymphocytic infiltration and good prognosis and triple-negative breast cancers with less immune infiltration and worse prognosis?
Findings
In this study of genomic data sets, triple-negative breast cancers with high immune gene expression had lower clonal heterogeneity, fewer copy number alterations, and lower somatic mutation and neoantigen loads.
Meaning
This study suggests that antitumor immune surveillance in immune-rich triple-negative breast cancers may lead to elimination of clones, lower clonal heterogeneity, and “simpler” genomes. The surviving neoplastic cell population exists at a near equilibrium with immune surveillance, which may explain the better prognosis, whereas immune-poor triple-negative breast cancers have greater genomic diversity attributable to weaker immune restraint.
Importance
Why some triple-negative breast cancers (TNBCs) have high and others have low immune cell infiltration is unknown. Understanding how immune surveillance shapes the cancer genome could help in the selection of patients and the development of more effective immunotherapy strategies.
Objective
To examine the association between genomic metrics and the extent of immune infiltration in TNBCs.
Design, Setting, and Participants
This study, performed from June 1, 2015, through January 31, 2017, used DNA and RNA sequencing data and messenger RNA expression results from The Cancer Genome Atlas (TCGA) breast cancer data set (n = 1215) to calculate previously described immune metagene expression values and histologic lymphocyte counts to quantify immune infiltration and assign prognostic categories to TNBCs. It used the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) data set as an independent validation cohort. The study compared clonal heterogeneity, somatic total mutational load, neoantigen load, and somatic copy number alteration levels between immune-rich TNBC cohorts with good prognosis and immune-poor TNBC cohorts with poor prognosis. The study also compared the distribution of mutations in 119 canonical cancer genes.
Main Outcomes and Measures
Correlation between immune prognostic category and genomic metrics of the cancer.
Results
This study of 193 TNBC samples with patient survival information found an inverse association between clonal heterogeneity and immune metagene expression (ρ = −0.395, P = 2 × 10⁻⁸). The study also found an inverse association between immune metagene expression and somatic copy number alteration levels (ρ = −0.484, P = 2 × 10⁻¹⁰). Lymphocyte-rich TNBCs with good prognosis had significantly lower mutation and neoantigen counts than did lymphocyte-poor TNBCs with poor prognosis. The robustness of the study results was confirmed by using various immune metagenes in the same TCGA data set and in the independent METABRIC data set.
Conclusions and Relevance
This study suggests that immune-rich TNBCs may be under an immune surveillance that continuously eliminates many immunogenic clones, resulting in lower clonal heterogeneity. These cancers may also represent the subset of TNBCs that could derive benefit from immune checkpoint inhibitor therapy to tilt the balance in favor of the immune system.
The importance of immune surveillance in determining the prognosis of various types of cancers is increasingly recognized. Understanding how the immune microenvironment influences the biology of cancer is important because it could lead to better patient selection strategies and more effective immunotherapies.1 More than 70% of breast cancers contain at least some tumor-infiltrating lymphocytes (TILs), and preclinical studies, as described by Schreiber et al,2 have found that antitumor immunity can eliminate some neoplastic cells, resulting in a precarious near equilibrium between the surviving clones and immune surveillance. Consistent with these observations, clinical studies3,4 also found that breast cancers with high immune infiltration, particularly the triple-negative breast cancer (TNBC) and ERBB2 (formerly HER2 or HER2/neu)–positive subtypes, have better prognosis. According to the immunoediting hypothesis of cancer progression, some cancers may be eliminated by an antitumor immune response before diagnosis, whereas most clinically apparent cancers represent states of escape or partial control by immune surveillance. One hypothesis is that cancers with greater genomic instability will have higher mutational burden, greater clonal heterogeneity, and higher genomic diversity, resulting in more neoantigens and therefore greater immune infiltration. Indeed, a positive correlation between the overall mutation or neoantigen loads and immune infiltration has been observed across cancer types.5,6 Alternatively, another hypothesis is that extensive lymphocytic infiltration is a consequence of a strong antitumor immune response that results in pruning of the genomic heterogeneity of the cancer by eliminating many immunogenic cell clones, whereas cancers with low lymphocytic infiltration may represent immune escape that also allows tumor evolution toward greater clonal heterogeneity and genomic diversity (Figure 1). 
Several studies support an inverse association between immune cell infiltration and intratumor clonal heterogeneity7 and somatic copy number alterations (SCNAs).8 In some cancers, the neoantigen load is also lower than expected, suggesting selective elimination of immunogenic clones.5
The goal of the present analysis was to assess the association between lymphocytic infiltration and genomic diversity in TNBCs. Specifically, we examine the association among immune infiltration measured by immune gene expression signatures, genomic complexity reflected by clonal heterogeneity, SCNAs, mutation load, neoantigen load, and patient prognosis.
In this study, performed from June 1, 2015, through January 31, 2017, previously reported prognostic immune gene expression signatures that were initially derived from DNA microarray data were transferred to RNA sequencing data of The Cancer Genome Atlas (TCGA) breast cancer cohort (n = 1215), as described in the eMethods and eFigure 1 in the Supplement. The RNA sequencing–based immune metagenes were highly correlated with the DNA microarray versions, successfully reproducing our previous immune clustering of TNBCs (n = 208) (eFigures 1-5 in the Supplement)9,10 and correlating well with histologic TIL quantification (eFigure 6 in the Supplement). Next, we classified the TNBC samples with survival information (n = 193) in the TCGA into good (n = 25) and poor prognosis (n = 168) categories. Good prognosis was defined as high immune infiltration (ie, major histocompatibility complex class II metagene expression in the top quartile) and low inflammation markers (ie, interleukin 8–vascular endothelial growth factor metagene expression below the median) (eFigures 7 and 20 in the Supplement). This classification was originally developed from an independent Affymetrix data set and remained strongly prognostic in the TCGA TNBC data (Figure 2A and eFigure 8 in the Supplement). We compared clonal heterogeneity measured by the mutant-allele tumor heterogeneity (MATH) score,11 which quantifies the dispersion of variant allele frequencies in each tumor, SCNAs as reported previously,8 mutational load, neoantigen load, and the distribution of mutations in 119 canonical cancer genes12,13 between the good and poor prognosis TNBC cohorts (eMethods, eTables 1-4, and eFigure 9 in the Supplement). All reported P values are 2-sided, and P < .05 was considered significant.
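The two definitions used above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline. The MATH formula (100 times the scaled median absolute deviation of the variant allele frequencies divided by their median, with the usual 1.4826 scaling constant) follows the original description of the MATH score rather than anything stated in this article. The function names, the use of inclusive quantiles, and the choice to compute cutoffs within the supplied samples are assumptions made for illustration.

```python
from statistics import median, quantiles

# Assumed scaling constant: makes the MAD comparable to a standard
# deviation for normally distributed data.
MAD_SCALE = 1.4826


def math_score(vafs):
    """MATH = 100 * scaled MAD / median of the variant allele frequencies."""
    med = median(vafs)
    mad = median(abs(v - med) for v in vafs)
    return 100.0 * MAD_SCALE * mad / med


def classify(mhc2, il8_vegf):
    """Label a sample 'good' if its MHC class II metagene value is in the
    top quartile and its IL8-VEGF metagene value is below the median;
    every other sample is labeled 'poor'."""
    q3 = quantiles(mhc2, n=4, method="inclusive")[2]  # 75th percentile cutoff
    il8_median = median(il8_vegf)
    return [
        "good" if m >= q3 and v < il8_median else "poor"
        for m, v in zip(mhc2, il8_vegf)
    ]
```

A tumor whose mutations all share one allele frequency gets a MATH score of 0, and the score grows as the frequencies spread out, which is why it serves as a proxy for clonal heterogeneity.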
This study used only publicly available, open-access data from TCGA, which are not unique to an individual and are therefore deidentified. Institutional review board approval was not required according to Exemption 45 CFR 46.101(b)(4) from the US Department of Health and Human Services and the local institutional review board.
The immune-rich, good prognosis TNBC samples had significantly lower MATH scores, indicating lower clonal genomic heterogeneity (mean rank, 59.7 vs 98.8; Mann-Whitney test P = .001) (eFigure 22 in the Supplement). We observed a strong inverse association between MATH score and immune metagene expression across all TNBC samples (Figure 2B), which was particularly strong among the good prognosis samples (R² = 0.479, P < .001) (Figure 3B). Levels of SCNAs were also significantly lower in the good prognosis group (mean rank, 45.8 vs 84.2; Mann-Whitney test P < .001) (eFigure 22 in the Supplement), with a significant inverse association between SCNAs and immune metagene expression across all samples (Figure 2C), which was again strongest in the good prognosis group (R² = 0.417) (Figure 3D). The SCNA levels and MATH scores showed only a weak positive correlation (R² = 0.214) (eFigure 10 in the Supplement), suggesting that these metrics capture distinct genomic features, each separately associated with immune infiltration. The inverse association of immune infiltration with MATH score and SCNA levels was confirmed using different immune metagenes (the major histocompatibility complex class II metagene alone, the B-cell metagene, and the cytolytic activity immune gene signature CYT)5 (eFigures 11 and 21 in the Supplement), and both metrics were also inversely correlated with histologic TIL counts (eFigures 12 and 13 in the Supplement). Good prognosis TNBCs also had significantly lower mutational load (mean rank, 70.4 vs 97.1; Mann-Whitney test P = .02) and neoantigen load (mean rank, 50.7 vs 70.1; Mann-Whitney test P = .04) (Figure 2D) compared with the poor prognosis samples. Lower overall mutation and neoantigen counts were also associated with high immune infiltration (eFigure 14 in the Supplement). Mutation load and neoantigen counts were highly correlated with one another (R² = 0.68) (eFigure 15 in the Supplement) but not with MATH (R² = 0.001) (eFigure 16 in the Supplement).
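For readers unfamiliar with the rank-based statistics reported above, the following sketch shows how group mean ranks (the quantities quoted alongside the Mann-Whitney tests) and a Spearman rank correlation coefficient can be computed. It is an illustrative reimplementation using only the Python standard library, with hypothetical function names. It does not compute P values and is not the software used in the study.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mean_ranks(group_a, group_b):
    """Mean rank of each group after ranking the pooled observations."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    return sum(ranks[:n_a]) / n_a, sum(ranks[n_a:]) / len(group_b)


def spearman_rho(x, y):
    """Spearman coefficient: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

A lower mean rank for the good prognosis group means its values tend to sit toward the low end of the pooled sample, and a negative coefficient means one metric tends to fall as the other rises.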
We also validated our results in the TNBC cohort (n = 283) of the independent METABRIC data set (eMethods in the Supplement). The cytolytic activity immune gene signature (CYT)5 showed a highly significant negative association with MATH (ρ = −0.286, P = 2 × 10⁻⁶) and a statistically nonsignificant association (ρ = −0.104, P = .14) with chromosomal instability as a surrogate for SCNAs. The TIL-rich TNBC cluster also had a significantly lower MATH score compared with the TIL-poor cluster (eFigure 17 in the Supplement).
Our findings may appear to contradict an earlier publication5 that reported a weak positive association between neoantigen load and the cytolytic activity immune gene signature CYT when all breast cancer subtypes were examined together. We also observed this overall association but noticed that it may be in part attributable to the higher somatic mutation burden and higher immune infiltration in TNBCs compared with luminal cancers (eFigure 18 in the Supplement), as well as a small positive correlation in the luminal B subtype (eFigure 19 in the Supplement), which was recently reported.14 When TNBCs are examined separately, the positive correlations between immune infiltration and genomic heterogeneity and mutation load are no longer seen; in fact, the opposite is observed, which is consistent with an immune pruning effect in TNBCs. Two other reports8,15 also support our observations. An earlier report15 noted that TNBCs with low clonal heterogeneity but high clonal mutational burden (ie, mutation burden adjusted for tumor clonality) have higher neoantigens per neoplastic clone and higher immune gene expression that is associated with greater chemotherapy sensitivity.15 Davoli et al8 independently observed a negative correlation between tumor aneuploidy and immune gene expression in a pan-cancer study.
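The subtype confounding described above can be illustrated with a toy calculation. The numbers below are entirely synthetic and were chosen only to show the pattern; they are not study data, and the variable and function names are hypothetical. One subtype has both higher immune scores and higher mutation counts, so the pooled correlation is positive even though the correlation within the TNBC-like group is negative.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Synthetic illustration only: (immune score, mutation count) per sample.
luminal_immune, luminal_mutations = [1, 2, 3, 4], [20, 22, 21, 23]
tnbc_immune, tnbc_mutations = [6, 7, 8, 9], [70, 66, 64, 60]

# Pooling both subtypes mixes the between-subtype shift into the correlation.
pooled_r = pearson(luminal_immune + tnbc_immune,
                   luminal_mutations + tnbc_mutations)
# Restricting to one subtype removes that shift.
tnbc_r = pearson(tnbc_immune, tnbc_mutations)

print(f"pooled r = {pooled_r:.2f}, TNBC-only r = {tnbc_r:.2f}")
```

In this toy example the pooled coefficient is positive while the TNBC-only coefficient is negative, which is the reason for analyzing the TNBC subtype separately.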
A limitation of our study is our inability to determine a cause-and-effect relationship because our observations are correlative in nature. It is therefore possible that genomic alterations are also sculpting the immune system and that what we observed is the result of reciprocal interactions between the two. In addition, tumor purity may affect mutation calling and confound the analysis. Further discussion of both these issues can be found in the eMethods.
We demonstrate that high immune infiltration is mostly seen in primary TNBCs with low clonal heterogeneity, fewer SCNAs, and lower somatic mutation and neoantigen loads. We suggest that these findings may be a consequence of effective immune surveillance that continuously eliminates immunogenic clones, resulting in lower clonal heterogeneity. The better prognosis of these cancers is consistent with strong immune surveillance and precarious equilibrium between the cancer and the immune system. Surgical resection of the primary tumor and adjuvant chemotherapy may assist the immune system. These cancers may also represent the subset of TNBCs that could derive further benefit from immune checkpoint inhibitor therapy.
Corresponding Author: Thomas Karn, PhD, Department of Obstetrics and Gynecology, Goethe-University Frankfurt, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany (email@example.com).
Accepted for Publication: May 23, 2017.
Published Online: July 27, 2017. doi:10.1001/jamaoncol.2017.2140
Author Contributions: Dr Karn had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Karn, Hatzis, Rody, Holtrich, Bianchini, Pusztai.
Acquisition, analysis, or interpretation of data: Karn, Jiang, Hatzis, El-Balat, Sänger, Holtrich, Bianchini.
Drafting of the manuscript: Karn, Hatzis, Holtrich, Bianchini, Pusztai.
Critical revision of the manuscript for important intellectual content: Karn, Jiang, Hatzis, Sänger, El-Balat, Rody, Holtrich, Becker, Bianchini.
Statistical analysis: Karn, Jiang, Hatzis.
Obtained funding: Karn, Becker, Rody, Holtrich.
Administrative, technical, or material support: Karn, Sänger, El-Balat, Rody, Holtrich, Becker, Pusztai.
Study supervision: Karn, Becker, Holtrich.
Conflict of Interest Disclosures: None reported.
Funding/Support: This work was supported by grant M67 from the H. W. & J. Hector-Stiftung, Mannheim, Germany (Drs Holtrich, Rody, and Karn); the Breast Cancer Research Foundation (Drs Pusztai and Hatzis); the Susan Komen Foundation (Dr Pusztai); Yale Cancer Center Core Grant National Institutes of Health, National Cancer Institute P30CA16359 (Drs Pusztai and Hatzis); and grant MFGA 13428 from the Associazione Italiana per la Ricerca sul Cancro (Dr Bianchini).
Role of the Funder/Sponsor: The funding sources had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Meeting Presentation: A portion of this research was presented at the 2016 San Antonio Breast Cancer Symposium; December 7, 2016; San Antonio, Texas.
Additional Contributions: We are grateful to the TCGA Research Network (http://cancergenome.nih.gov/) for providing the data analyzed in this study.