Patients are stratified by immune-related gene pair index (IRGPI) (low vs high risk). A and C, Overall survival among patients with stages I and II nonsquamous NSCLC in meta-testing. B and D, Overall survival among all patients with stages I and II nonsquamous NSCLC in 3 independent validation data sets. Hazard ratios (HRs) and 95% CIs are for high vs low immune risk. P values comparing risk groups were calculated with the log-rank test.
Univariate Cox proportional hazards regression was applied to estimate HRs between IRGPI high- and low-risk groups in each data set within stage subgroups. Early represents patients with stage I and stage II nonsquamous non–small-cell lung cancer (NSCLC). The length of each horizontal line corresponds to the confidence interval, and the size of the HR data marker is inversely proportional to the width of the confidence interval. Vertical dotted line indicates HR of 1.0. DCC indicates Director’s Challenge Consortium; NA, not applicable; TCGA, The Cancer Genome Atlas.
ᵃ Hazard ratio is not reported because of unstable estimation of the Cox proportional hazards regression model.
Groups are stratified by low and high risk. Data from top and bottom sections of snap-frozen tissue samples are given. Error bars indicate estimated 95% CI.
ᵃ P < .001, compared with the low immune risk group, by Wilcoxon rank sum test.
ᵇ P < .005, compared with the low immune risk group, by Wilcoxon rank sum test.
The RMS curve for immune-related gene pair index (IRGPI) and immune-clinical prognostic index (ICPI) scores was plotted for (A) meta-testing, (B) Director’s Challenge Consortium (DCC), (C) Gene Expression Omnibus (GSE30219), and (D) The Cancer Genome Atlas (TCGA) data sets. Each point represents the RMS time of corresponding IRGPI and ICPI scores. The RMS curves showed a larger slope in all data sets for ICPI, indicating the superior estimation of survival with ICPI. Concordance index (C-index) for IRGPI and ICPI was also provided. P value represents the difference between IRGPI and ICPI in terms of C-index.
eMethods. Data Preprocessing, Assessment of Robustness, and Comparison With Commercialized Lung Biomarker
eResults. Analysis and Robustness of IRGPs
eTable 1. Details About the Data Sets Used in This Study
eTable 2. Clinical and Pathologic Features of Patients in Meta-training, Meta-testing, and Independent Validation Cohorts
eTable 3. Model Information About IRGPI
eTable 4. Univariate and Multivariate Analyses of Prognostic Factors in Meta-training, Meta-testing, and Independent Validation Data Sets
eTable 5. Biological Processes Overrepresented by Genes Consisting of IRGPI
eTable 6. RMS Time Ratio Between Low- and High-Risk Groups Based on IRGPI or ICPI in Different Data Sets
eFigure 1. Overview of the Construction and Validation of Immune and Composite Immune/Clinical Signatures
eFigure 2. Time-Dependent ROC Curve for IRGPI in the Meta-training Data Set at 5 Years
eFigure 3. Kaplan-Meier Curve of Overall Survival for Patients With Different IRGPI Risks
eFigure 4. Kaplan-Meier Curve of Overall Survival for Early- and Late-Stage Patients With Different IRGPI Risks
eFigure 5. Kaplan-Meier Curve of Overall Survival for Stage IA and IB Patients With Different IRGPI Risks
eFigure 6. C-index Comparison Between IRGPI and 2 Existing Biomarkers
eFigure 7. Time-Dependent ROC for ICPI and RMS Curve for ICPI and IRGPI in Meta-training Data Set
eFigure 8. Kaplan-Meier Curves for Overall Survival of All Patients Stratified by the IRGPI and the ICPI
eFigure 9. C-index Comparison Between ICPI and mPS Score in Validation Data Sets
Li B, Cui Y, Diehn M, Li R. Development and Validation of an Individualized Immune Prognostic Signature in Early-Stage Nonsquamous Non–Small Cell Lung Cancer. JAMA Oncol. 2017;3(11):1529–1537. doi:10.1001/jamaoncol.2017.1609
Can molecular profiling of immune-related genes be used to estimate prognosis in early-stage nonsquamous non–small cell lung cancer?
In this multiple-cohort study that analyzed frozen tumor tissue samples from 2414 patients, an immune signature of 25 gene pairs significantly stratified patients into low- vs high-risk groups for overall survival across and within subpopulations at various tumor stages (I, IA, IB, or II).
Dysregulated immune contexture may contribute to the survival differences among patients with nonsquamous non–small cell lung cancer.
The prevalence of early-stage non–small cell lung cancer (NSCLC) is expected to increase with recent implementation of annual screening programs. Reliable prognostic biomarkers are needed to identify patients at a high risk for recurrence to guide adjuvant therapy.
To develop a robust, individualized immune signature that can estimate prognosis in patients with early-stage nonsquamous NSCLC.
Design, Setting, and Participants
This retrospective study analyzed the gene expression profiles of frozen tumor tissue samples from 19 public NSCLC cohorts, including 18 microarray data sets and 1 RNA-Seq data set for The Cancer Genome Atlas (TCGA) lung adenocarcinoma cohort. Only patients with nonsquamous NSCLC with clinical annotation were included. Samples were from 2414 patients with nonsquamous NSCLC, divided into a meta-training cohort (729 patients), meta-testing cohort (716 patients), and 3 independent validation cohorts (439, 323, and 207 patients). All patients underwent surgery with a negative surgical margin, received no adjuvant or neoadjuvant therapy, and had publicly available gene expression data and survival information. Data were collected from July 22 through September 8, 2016.
Main Outcomes and Measures
Of 2414 patients (1205 men [50%], 1111 women [46%], and 98 of unknown sex [4%]; median age [range], 64 [15-90] years), a prognostic immune signature of 25 gene pairs consisting of 40 unique genes was constructed using the meta-training data set. In the meta-testing and validation cohorts, the immune signature significantly stratified patients into high- vs low-risk groups in terms of overall survival across and within subpopulations with stage I, IA, IB, or II disease and remained as an independent prognostic factor in multivariate analyses (hazard ratio range, 1.72 [95% CI, 1.26-2.33; P < .001] to 2.36 [95% CI, 1.47-3.79; P < .001]) after adjusting for clinical and pathologic factors. Several biological processes, including chemotaxis, were enriched among genes in the immune signature. The percentages of neutrophil infiltration (5.6% vs 1.8%) and necrosis (4.6% vs 1.5%) were significantly higher in the high-risk immune group than in the low-risk group in the TCGA data set (P < .003). The immune signature achieved a higher accuracy (mean concordance index [C-index], 0.64) than 2 commercialized multigene signatures (mean C-index, 0.53 and 0.61) for estimation of survival in comparable validation cohorts. When integrated with clinical characteristics such as age and stage, the composite clinical and immune signature showed improved prognostic accuracy in all validation data sets relative to molecular signatures alone (mean C-index, 0.70 vs 0.63) and another commercialized clinical-molecular signature (mean C-index, 0.68 vs 0.65).
Conclusions and Relevance
The proposed clinical-immune signature is a promising biomarker for estimating overall survival in nonsquamous NSCLC, including early-stage disease. Prospective studies are needed to test the clinical utility of the biomarker in individualized management of nonsquamous NSCLC.
Non–small cell lung cancer (NSCLC) accounts for approximately 85% of lung cancer, the leading cause of death from cancer in the United States and worldwide.1 Biomarkers that can reliably estimate disease prognosis and patient survival would have tremendous value in guiding the management of lung cancer.2 For example, most patients with stage I NSCLC currently do not receive adjuvant systemic treatment after local therapy because several large randomized studies3-5 have failed to show a survival benefit among unselected patients, for many of whom the toxic effects associated with chemotherapy may outweigh the potential benefit. Thus, identification of the subset of patients at highest risk for recurrence and death who have the greatest need for additional systemic therapy is needed.
A number of studies have proposed gene expression–based signatures for survival stratification in patients with NSCLC.6-10 Unfortunately, none has been incorporated into routine clinical practice owing to issues such as overfitting on small discovery data sets and lack of sufficient validation.11 The availability of public, large-scale gene expression data sets brings the opportunity to identify potentially more reliable lung cancer biomarkers.8,12,13 However, the diversity of these data makes using all of this information effectively a daunting challenge. Traditional approaches using gene expression levels require appropriate normalization, which is a difficult task given the potential biological heterogeneity among data sets and technical biases across measurement platforms.14 Instead, methods based on the relative ranking of gene expression levels eliminate the requirement for data preprocessing, such as scaling and normalization, and have been shown to produce robust results in various applications, including cancer classification.15-17
Various components of the immune system have been shown to be a determining factor during cancer initiation and progression.18,19 Evading immune destruction has been recognized as an emerging hallmark of cancer.20-23 Recent immunotherapies targeting specific immune checkpoints such as programmed death 1 or programmed death ligand 1 have demonstrated a remarkable, durable response in NSCLC.24,25 Certain histopathologic patterns, such as intratumoral infiltration by cytotoxic lymphocytes, have also been associated with better prognoses in several cancer types, including NSCLC.26-30 However, the molecular characteristics describing tumor-immune interaction remain to be comprehensively explored regarding their prognostic potential in NSCLC.31-34
In this study, we combined multiple gene expression data sets to develop and validate an individualized prognostic signature for nonsquamous NSCLC based on immune-related gene pairs (IRGPs). To leverage the complementary value of molecular and clinical characteristics, we integrated the immune signature with clinical factors to build a composite prognostic index, which allowed improved estimation of nonsquamous NSCLC prognosis.
We retrospectively analyzed the gene expression profiles of frozen tumor tissue samples from 19 public NSCLC cohorts, including 18 microarray data sets and 1 RNA-Seq data set for The Cancer Genome Atlas (TCGA) lung adenocarcinoma cohort. Only patients with nonsquamous NSCLC with clinical annotation were included.35,36 We excluded patients who had a positive surgical margin or had received neoadjuvant therapy, adjuvant chemotherapy, or other pharmaceutical therapy owing to immune-modulating effects of some therapeutics.37 Overlapped patients (n = 19) between the Director’s Challenge Consortium (DCC) and a data set in Gene Expression Omnibus (GSE14814) were removed from the DCC data set. Overall, we included 2414 patients in our study (eTable 1 in the Supplement). Details about sample preparation and RNA measurement can be found elsewhere.7,8,38-52 Preprocessing of gene expression profiles can be found in the eMethods in the Supplement. This study of deidentified data was approved by the institutional review board of Stanford University, Palo Alto, California.
Data were collected from July 22 to September 8, 2016. eFigure 1 in the Supplement shows the overall study design. We selected the 3 largest individual data sets for independent validation, namely, TCGA lung adenocarcinoma (TCGA), DCC, and GSE30219. The remaining 16 microarray data sets were merged into 1 meta–data set, which was randomly split approximately in half into meta-training and meta-testing data sets. Clinical and pathologic characteristics of patients in each data set are shown in eTable 2 in the Supplement.
We constructed a prognostic signature by focusing on immune-related genes (IRGs), which were downloaded from the ImmPort database (https://immport.niaid.nih.gov).53 A variety of IRGs were included, such as cytokines, cytokine receptors, and genes related to the T-cell receptor signaling pathway, B-cell antigen receptor signaling pathway, natural killer cell cytotoxicity, and antigen processing and presentation pathways. Immune-related genes measured by all platforms were selected. The gene expression levels in a specific sample or profile underwent pairwise comparison to generate a score for each IRGP. An IRGP score of 1 was assigned if the expression level of IRG 1 was less than that of IRG 2; otherwise the IRGP score was 0. This gene pair–based approach has an important advantage because the score is calculated based entirely on the gene expression profile of a tumor sample and can be used in an individualized manner without the need for normalization. Some IRGPs with constant values (0 or 1) in a particular platform or data set may be attributable to (1) platform-dependent preferential measurement, which can cause biases and may not be reproducible across platforms, and (2) biologically preferential transcription, which does not provide discriminative information about patient survival.54 Therefore, we removed IRGPs with constant values in any individual data set of the meta-data set.
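The pairwise scoring and the constant-pair filter described above can be illustrated with a minimal Python sketch. It is not the authors' implementation (the study used R); the function names and the gene labels GENE_A, GENE_B, and GENE_C are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def irgp_scores(expression):
    """Return {(gene_1, gene_2): 0 or 1} for every unordered gene pair.

    `expression` maps a gene symbol to its expression level within ONE sample.
    A pair scores 1 when the first gene's level is lower than the second's.
    """
    genes = sorted(expression)  # fixed order so pairs are reproducible
    return {
        (g1, g2): int(expression[g1] < expression[g2])
        for g1, g2 in combinations(genes, 2)
    }


def drop_constant_pairs(scores_by_dataset):
    """Keep only pairs that take both values (0 and 1) in every data set.

    `scores_by_dataset` maps a data set name to a list of per-sample
    score dictionaries as returned by `irgp_scores`.
    """
    pairs = set(next(iter(scores_by_dataset.values()))[0])
    for samples in scores_by_dataset.values():
        pairs = {p for p in pairs if len({s[p] for s in samples}) > 1}
    return sorted(pairs)


# Hypothetical expression values for one sample.
sample = {"GENE_A": 5.0, "GENE_B": 7.2, "GENE_C": 1.1}
scores = irgp_scores(sample)

# Any monotone rescaling of the sample leaves the scores unchanged,
# which is why no cross-platform normalization is required.
rescaled = {gene: 2.0 * level + 3.0 for gene, level in sample.items()}
same_after_rescaling = irgp_scores(rescaled) == scores
```

Because each score depends only on the ordering of two values within a single profile, it can be computed for one patient at a time.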
Prognostic IRGPs were selected using the log-rank test to assess the association between each IRGP and patients’ overall survival in the meta-training data set. Prognostic IRGPs with a familywise error rate less than 0.05 were candidates to build the IRGP index (IRGPI). To minimize the risk of overfitting, we applied a Cox proportional hazards regression model combined with the least absolute shrinkage and selection operator (glmnet, version 2.0-5).55 The penalty parameter was estimated by 10-fold cross-validation in the meta-training data set at 1 SE beyond the minimum partial likelihood deviance.55
To separate patients into low- or high-risk groups, the optimal IRGPI cutoff was determined by a time-dependent receiver operating characteristic (ROC) curve (survivalROC, version 1.0.3)56 at 5 years in the meta-training data set. We used the nearest neighbor estimation57 method to estimate the ROC curve. The IRGPI corresponding to the shortest distance between the ROC curve and point representing the 100% true-positive rate and 0% false-positive rate was used as the cutoff value.
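As an illustration of the cutoff rule only, the following Python sketch selects the threshold whose ROC point lies nearest to the ideal corner (false-positive rate 0, true-positive rate 1). The ROC points here are hypothetical values; in the study they would come from the time-dependent ROC estimate at 5 years, which this sketch does not reproduce.

```python
import math


def closest_to_corner(roc_points):
    """Return the threshold whose (FPR, TPR) point is nearest to (0, 1).

    `roc_points` is an iterable of (threshold, fpr, tpr) tuples.
    """
    best = min(roc_points, key=lambda p: math.hypot(p[1] - 0.0, p[2] - 1.0))
    return best[0]


# Hypothetical (threshold, FPR, TPR) triples, not values from the study.
points = [(0.5, 0.60, 0.90), (1.0, 0.30, 0.70), (1.5, 0.10, 0.40)]
cutoff = closest_to_corner(points)
```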
The prognostic value of the IRGPI was assessed in patients with all stages of disease and in stage-specific groups (early stages [I/II], I, IA, IB, or II) in the meta-training, meta-testing, and independent validation cohorts in univariate analyses. We then combined IRGPI with available clinical and pathologic variables in multivariate analyses. Age, grade, and stage were treated as continuous variables. Grade was coded as well differentiated (0), moderately differentiated (1), or poorly differentiated (2). Stage IA was coded as 1; the range between IA and IB, as 1.5; IB, as 2; IIA, as 3; the range between IIA and IIB, as 3.5; IIB, as 4; IIIA, as 5; the range between IIIA and IIIB, as 5.5; IIIB, as 6; and IV, as 7. The prognostic accuracy of the biomarkers in continuous form was evaluated using the concordance index (C-index), which ranges from 0 to 1.0, with 0.5 indicating random estimation. We compared the prognostic accuracy of the IRGPI with 2 existing multigene signatures (eMethods in the Supplement) in terms of the C-index.58
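For readers unfamiliar with the C-index, a minimal Python sketch of Harrell's pairwise definition follows. It is an illustration under simplifying assumptions: the survcomp package used in the study may handle ties and confidence intervals differently, and the function name and toy data are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell-style C-index for right-censored survival data.

    times  : follow-up time for each patient
    events : 1 if death was observed, 0 if censored
    risks  : risk score, where a higher score should mean shorter survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i has an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: risk scores perfectly ordered against survival time.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
perfect = concordance_index(toy_times, toy_events, [0.9, 0.7, 0.5, 0.1])
uninformative = concordance_index(toy_times, toy_events, [1, 1, 1, 1])
```

A score that carries no information about ordering yields 0.5, which matches the interpretation of 0.5 as random estimation.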
To gain biological understanding of the IRGPI, we conducted enrichment analysis of its component IRGs with DAVID (Database for Annotation, Visualization and Integrated Discovery) Bioinformatics Resources (version 6.8; https://david.ncifcrf.gov/).59 The background gene list consists of IRGs measured by all platforms. Biological processes of gene ontology with P < .10 were examined. In the TCGA project, the snap-frozen tissue block of each tumor was divided into top, middle, and bottom tissue sections. The middle section was used to derive genomic data, and the top and bottom sections underwent histopathologic examination.60 Thus, in addition to RNA-Seq data, information about immune infiltration by lymphocytes, monocytes, and neutrophils and necrosis percentage is also available for tumor samples in the TCGA data set. We compared those pathologic characteristics between patients in different immune risk groups according to the IRGPI by using the Wilcoxon rank sum test.
Based on the results of multivariate analyses, we integrated age, stage, and IRGPI risk score into a composite immune-clinical prognostic index (ICPI) by applying Cox proportional hazards regression in the meta-training data set. The prognostic performance of the continuous ICPI score was compared with that of the IRGPI in terms of C-index and illustrated with the restricted mean survival (RMS) curve.61 RMS represents the life expectancy at 10 years for patients with different risk scores. Similar to the aforementioned method for defining the cutoff of IRGPI, the cutoff value for ICPI was estimated by time-dependent ROC curve in the meta-training data set. The performance of binary IRGPI and ICPI was assessed in terms of the RMS time ratio between low- and high-risk groups.62 A higher RMS time ratio corresponds to a larger prognostic difference.
All statistical analyses were performed using R (version 3.3.1; https://www.r-project.org/). The univariate association of IRGPI and other clinical and pathologic factors with overall survival was evaluated using the log-rank test. For factors significantly associated with overall survival in univariate analyses, multivariate analysis was performed with the Cox proportional hazards regression model. The C-index was calculated with survcomp (version 1.22.0) and compared with compareC (version 1.3.1) packages.63 The RMS curve and RMS time ratio were estimated with survival (version 2.41-2) and survRM2 (version 1.0-2) packages.62 Statistical significance was defined as P < .05 unless specified otherwise.
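The RMS time can be read as the area under the Kaplan-Meier curve up to a fixed horizon. The Python sketch below illustrates that definition and the ratio between 2 groups; it is not the survRM2 implementation used in the study, and the function name and toy follow-up data are hypothetical.

```python
def restricted_mean_survival(times, events, tau):
    """Area under the Kaplan-Meier curve from time 0 to the horizon tau.

    times  : follow-up time for each patient
    events : 1 if death was observed, 0 if censored
    tau    : restriction horizon (for example, 10 years)
    """
    data = sorted(zip(times, events))
    n = len(data)
    surv, area, last_t = 1.0, 0.0, 0.0
    i = 0
    while i < n and data[i][0] <= tau:
        t = data[i][0]
        at_risk = n - i  # everyone with follow-up of at least t
        deaths = 0
        j = i
        while j < n and data[j][0] == t:
            deaths += data[j][1]
            j += 1
        area += surv * (t - last_t)  # survival is flat between event times
        surv *= 1.0 - deaths / at_risk
        last_t = t
        i = j
    area += surv * (tau - last_t)  # final segment up to the horizon
    return area


# Hypothetical follow-up data (years) for 2 risk groups.
low_risk = restricted_mean_survival([6, 9, 12, 12], [1, 0, 0, 0], tau=10)
high_risk = restricted_mean_survival([1, 2, 3, 4], [1, 1, 1, 1], tau=10)
rms_time_ratio = low_risk / high_risk
```

In this toy example the low-risk group has the larger RMS time, so the ratio exceeds 1.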
A total of 2414 patients with nonsquamous NSCLC (1205 men [50%], 1111 women [46%], and 98 of unknown sex [4%]; median age [range], 64 [15-90] years) were included in the analysis. Among 1443 IRGs from the ImmPort database, 524 IRGs were measured by all platforms and 137 026 IRGPs were constructed. We removed 126 615 IRGPs (92.4%) with constant ordering in any meta-data sets. The possible cause for constant ordering IRGPs was analyzed in the eResults in the Supplement. No statistically significant difference was observed between meta-training and meta-testing data sets in terms of clinical and pathologic factors (eTable 2 in the Supplement). The association of the 10 411 IRGPs with overall survival was assessed in the meta-training cohort, resulting in 281 prognostic IRGPs. We then constructed an IRGPI consisting of 25 IRGPs by using L1-penalized Cox proportional hazards regression on the meta-training data set. Robustness of the 25 IRGPs against 1000 randomizations of the meta cohort was assessed (eMethods in the Supplement). The 25 IRGPs of the IRGPI were selected at a significantly higher frequency (P < 1.7 × 10−14) than were those by different randomizations (eResults in the Supplement). The IRGPI consisted of 40 unique IRGs, of which 28 were cytokines or cytokine receptors (eTable 3 in the Supplement). On the basis of time-dependent ROC curve analysis, the optimal cutoff for the IRGPI to stratify patients into the high or the low immune risk group was determined to be 0.988 (eFigure 2 in the Supplement).
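The pair counts reported above follow from simple combinatorics, which can be checked with a short Python sketch. Only the 3 input figures come from the text; the variable names are illustrative.

```python
from math import comb

n_genes = 524        # IRGs measured by all platforms
removed = 126_615    # IRGPs with constant ordering in any data set

n_pairs = comb(n_genes, 2)                 # unordered pairs of distinct genes
remaining = n_pairs - removed              # pairs carried forward
removed_share = round(100 * removed / n_pairs, 1)
```

The results (137 026 pairs, 10 411 remaining, and 92.4% removed) agree with the figures in the text.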
The IRGPI significantly stratified patients into low- vs high-risk groups in terms of overall survival (eTable 4 and eFigure 3 in the Supplement). It remained as an independent prognostic factor in multivariate analyses, after adjusting for clinical and pathologic factors such as age, sex, smoking status, tumor stage, and tumor grade (hazard ratio [HR] range, 1.72 [95% CI, 1.26-2.33; P < .001] to 2.36 [95% CI, 1.47-3.79; P < .001]) (eTable 4 in the Supplement). Furthermore, the IRGPI stratified patients with early-stage (I and II) nonsquamous NSCLC into significantly different prognostic groups (eFigure 4A-E in the Supplement). When considering patients with stage I disease only, the IRGPI remained highly prognostic for the meta-testing data set (HR, 2.89; 95% CI, 1.95-4.29; P = 3.03 × 10−8) and independent validation cohorts (HR, 2.04; 95% CI, 1.53-2.73; P = 8.87 × 10−7) (Figure 1A and B). When further restricted to patients with stage IA or IB disease, the IRGPI could stratify patients into subgroups with significantly different prognoses in the meta-testing and/or independent validation cohorts (eFigure 5 in the Supplement). Similarly, a higher IRGPI was correlated with significantly worse prognosis in patients with stage II disease in meta-testing (HR, 1.73; 95% CI, 1.03-2.92; P = .037) and in independent validation cohorts (HR, 3.39; 95% CI, 1.80-6.37; P = 5.53 × 10−5) (Figure 1C and D) and in patients with more advanced stage III to IV disease (HR, 1.56; 95% CI, 1.14-2.15; P = .005) (eFigure 4F in the Supplement). Overall, the IRGPI appears to estimate overall survival independently of stage in nonsquamous NSCLC (Figure 2).
Enrichment analysis of the 40 unique IRGs identified 6 overrepresented biological processes in gene ontology (eTable 5 in the Supplement). Most of these biological processes were related to chemotaxis. Chemotaxis of various immune infiltrates, such as neutrophils, monocytes, and macrophages, was observed. In the TCGA data set, we found that the percentages of necrosis and neutrophil infiltration were significantly different between IRGPI risk groups (Figure 3). The mean level of necrosis in the IRGPI high-risk group was 3-fold that in the IRGPI low-risk group in the bottom (6.5% vs 2.1%; P = 2.39 × 10−6) and top (4.6% vs 1.5%; P = 6.29 × 10−13) sections. Patients with a higher IRGPI also had significantly higher neutrophil infiltration in their tumors in the bottom (4.8% vs 1.3%; P = .002) and top (4.3% vs 1.6%; P = .001) sections. However, no statistically significant difference in lymphocyte or monocyte infiltration was observed between the 2 IRGPI risk groups in the TCGA data set.
We compared the IRGPI with 2 clinically applicable and commercialized biomarkers, including a 14-gene biomarker for stages I to III nonsquamous NSCLC and a 31-cell cycle progression (CCP) gene biomarker for early-stage (I and II) NSCLC (eMethods in the Supplement).6,9,64 Only TCGA and GSE13213 included all 14 genes needed to construct the 14-gene biomarker. For both data sets, the IRGPI achieved a higher C-index compared with the 14-gene biomarker (eFigure 6A in the Supplement). Of note, for patients with stages I to III disease in the TCGA data set, the C-index of the IRGPI was significantly different from random estimation (C-index, 0.62; 95% CI, 0.55-0.69; P < .001), whereas the 14-gene biomarker was not (C-index, 0.50; 95% CI, 0.44-0.56; P = .93). Similarly, in 3 of 4 data sets, the IRGPI showed a higher C-index than the CCP biomarker (eFigure 6B in the Supplement). Overall, the mean C-index weighted by cohort size was 0.64 for IRGPI vs 0.53 for the 14-gene biomarker and 0.61 for the CCP biomarker in respective comparisons.
We used the same 1000 random splits of the meta-cohort to further assess the accuracy and variability of survival estimation for different biomarkers (eMethods in the Supplement). The IRGP-based models showed a higher median C-index of 0.65 and lower variation (SD, 0.015) in the meta-testing cohorts compared with CCP and the 14-gene biomarker, with median (SD) C-indexes of 0.57 (0.020) and 0.56 (0.025), respectively.
In multivariate analysis (eTable 4 in the Supplement), age, stage and IRGPI were independent prognostic factors in at least 3 data sets, suggesting their complementary value. To further improve accuracy, we combined age, stage, and IRGPI score to fit a Cox proportional hazards regression model using the meta-training data set and derived an ICPI as (0.0265 × age) + (0.267 × stage) + (1.917 × IRGPI score). An optimal cutoff of 0.116 for stratifying patients was determined based on time-dependent ROC curve analysis in the meta-training data set (eFigure 7A in the Supplement). Significantly improved estimation of survival was achieved by the continuous form of ICPI relative to IRGPI (mean C-index, 0.70 vs 0.63 in the validation data set; P < .005) (eFigure 7B in the Supplement and Figure 4). Similar results were observed in binary form of the ICPI relative to the IRGPI (eFigure 8 and eTable 6 in the Supplement).
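The ICPI formula can be written out as a short Python sketch that uses the coefficients above and the numeric stage coding given in the Methods. Several details are assumptions made only for illustration: the function and dictionary names are hypothetical, and the keys "I", "II", and "III" are assumed to stand for the intermediate codes (1.5, 3.5, and 5.5) described as the range between substages. The sketch computes the raw linear combination and does not apply the reported cutoff of 0.116, because the text does not state whether that cutoff refers to this raw scale or to a centered score.

```python
# Numeric stage coding from the Methods; the keys "I", "II", and "III"
# are an ASSUMED labeling for the in-between codes 1.5, 3.5, and 5.5.
STAGE_CODE = {
    "IA": 1.0, "I": 1.5, "IB": 2.0,
    "IIA": 3.0, "II": 3.5, "IIB": 4.0,
    "IIIA": 5.0, "III": 5.5, "IIIB": 6.0,
    "IV": 7.0,
}


def icpi(age_years, stage, irgpi_score):
    """Raw ICPI linear combination with the coefficients reported in the text."""
    return (
        0.0265 * age_years
        + 0.267 * STAGE_CODE[stage]
        + 1.917 * irgpi_score
    )


# Hypothetical patient: 64 years old, stage IB, IRGPI score of 1.0.
example_icpi = icpi(64, "IB", 1.0)
```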
We also compared our ICPI with a commercialized clinical-molecular composite biomarker mPS, which integrates stage with CCP score, in patients with early-stage disease (eMethods in the Supplement). Our ICPI achieved a higher accuracy of survival estimation in all validation data sets (mean C-index, 0.68 for ICPI vs 0.65 for mPS; eFigure 9 in the Supplement).
Patients with early-stage NSCLC are at substantial risk for recurrence and death, even after complete surgical resection. The use of adjuvant therapy in early-stage, particularly stage I, NSCLC remains controversial, because previous randomized trials have not demonstrated a consistent survival benefit. Reliable prognostic biomarkers are critically needed to select patients who are at highest risk for recurrence and who might benefit from additional systemic therapy. Significant research on gene expression–based prognostic signatures6-10 has led to recent commercialization of 2 biomarkers in lung adenocarcinoma,6,9 but their accuracy of survival estimation remains limited. In this study, we developed a prognostic signature based on 25 immune-related gene pairs for nonsquamous NSCLC and validated it in multiple independent data sets across different platforms. Our prognostic immune signature can further stratify clinically defined groups of patients (eg, early-stage [I/II] and stages I, IA, IB, and II nonsquamous NSCLC) into subgroups with different survival outcomes. In benchmark comparisons, our signature achieved higher accuracy than 2 commercialized molecular biomarkers. We further leveraged the complementary value of molecular and clinical characteristics and showed that combining both could provide a more accurate estimation of overall survival in nonsquamous NSCLC.
To identify reliable biomarkers of nonsquamous NSCLC prognoses, we combined gene expression profiles from multiple data sets and used methods that are specifically designed to perform robustly given technical biases inherent across different platforms with microarray or RNA-Seq technologies.65,66 Our prognostic signature is based on the relative ranking of gene expression values and only involves pairwise comparison within the gene expression profile of a sample, thus eliminating the need for data normalization. As such, our prognostic signature can serve as an individualized, single-sample estimate of survival of NSCLC and may be readily translated to clinical practice.
Prognostic or predictive biomarkers related to the tumor immune microenvironment may hold great promise for identifying novel molecular targets and improving patient management in the era of immunotherapy.2 Suzuki et al67 discovered that stromal rather than tumor nest FoxP3:CD3 ratio and tumor expression level of certain cytokines (interleukin 12 receptor β2 and interleukin 7 receptor) were correlated with recurrence in stage I lung adenocarcinoma. Because gene expression profiles used in the present study were derived from a core sample of tumor tissue, we did not observe significant differences in lymphocyte infiltration between low- and high-risk IRGPI groups, which is consistent with the previous findings. Similarly, most genes contained in our immune signature were also cytokines and cytokine receptors, which play key roles in chemotaxis, angiogenesis, and inflammatory processes.68 An increased inflammatory microenvironment has been shown to be a consistent component of the neoplastic process and tumor progression.68,69 Unlike apoptosis, which is immunologically and inflammatorily silent,70 necrosis can release proinflammatory intracellular content into the tumor microenvironment and induce an inflammatory response involving a diverse set of immune cells such as neutrophils and macrophages.71 In addition, tumor-associated neutrophils have been shown to be associated with poor prognoses in a variety of cancer types.19,72-75 We found consistently and significantly increased levels of necrosis and neutrophil infiltration in the high immune–risk group in the TCGA data set. On the basis of the aforementioned findings, dysregulated immune contexture might be the reason for the survival differences observed between patient groups as defined by our signature.
Limitations of our study include its retrospective nature, although we tried to include as many data sets as possible for more rigorous validation of our biomarker. Similar to other studies,76,77 gene expression signatures are subject to sampling bias caused by intratumor genetic heterogeneity. Although we excluded IRGPs with constant ordering to reduce certain cross-study batch effects, their complex nature implies that not all batch effects can be addressed, and some may remain.14 Future studies will integrate diverse biological processes, which could provide a more complete molecular picture of the tumor.
The proposed immune-related gene pair–based signature is a promising prognostic biomarker in nonsquamous NSCLC, including early-stage disease. Prospective studies are needed to further validate its analytical accuracy for estimating prognoses and to test its clinical utility in individualized management of nonsquamous NSCLC.
Corresponding Author: Ruijiang Li, PhD, Department of Radiation Oncology, Stanford University School of Medicine, 1070 Arastradero Rd, Palo Alto, CA 94304 (email@example.com).
Accepted for Publication: April 19, 2017.
Published Online: July 6, 2017. doi:10.1001/jamaoncol.2017.1609
Author Contributions: Dr R. Li had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: All authors.
Acquisition, analysis, or interpretation of data: All authors.
Drafting of the manuscript: B. Li, R. Li.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: B. Li, Cui, Diehn.
Obtained funding: R. Li.
Administrative, technical, or material support: B. Li, Cui, R. Li.
Study supervision: Diehn, R. Li.
Conflict of Interest Disclosures: None reported.
Funding/Support: This study was supported in part by grant R01CA193730 from the National Institutes of Health.
Role of the Funder/Sponsor: The sponsor had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.