eTable 1 in the Supplement lists the included studies. PD-L1 indicates programmed cell death ligand 1; PFS, progression-free survival; ORR, objective response rate; OS, overall survival.
A, Fifty-six analyses from the primary literature correlating different biomarker modalities with patient responses after anti–PD-1/PD-L1 therapy were analyzed. The sensitivity and 1−specificity of the assay for each individual publication are shown by a single dot (the number on the dot corresponds to the reference list in eTable 1 in the Supplement). The size of each dot is proportionate to the size of the studied cohort. Linear regression models weighted by the number of patients in each study (B) and unweighted (ie, each study treated equally) (C) were used to generate summary receiver operating characteristic (sROC) curves for each assay modality. Multiplex immunohistochemistry/immunofluorescence (mIHC/IF) has a significantly higher area under the curve (AUC) than PD-L1 (programmed cell death ligand 1) IHC, tumor mutational burden (TMB), and gene expression profiling (GEP) by the weighted approach and than PD-L1 IHC and TMB by the unweighted approach.
a Indicates statistical significance (P < .05), Hanley and McNeil method.
The Figure shows that multimodality biomarkers have an sROC curve comparable to that of multiplex immunohistochemistry/immunofluorescence. AUC indicates area under the curve; GEP, gene expression profiling; IHC, immunohistochemistry; mIHC/IF, multiplex immunohistochemistry/immunofluorescence; PD-L1, programmed cell death ligand 1; TMB, tumor mutational burden.
A, Dots in the upper left quadrant represent studies that reported high negative predictive values and were good at excluding patients from anti–PD-1/PD-L1 treatment. Dots in the upper right quadrant represent studies that reported both high negative and positive predictive values, meaning they are suitable for excluding patients who should not be treated and for selecting patients who will respond. Multiplex immunohistochemistry/immunofluorescence (mIHC/IF) can help both rule in and rule out response to anti–PD-1/PD-L1 therapy. The study number correlating to the individual dots is provided in eFigure 7 in the Supplement. B, Multiplex immunohistochemistry/IF has a better negative likelihood ratio (LR−) than other tested biomarker approaches, whereas both mIHC/IF and multimodality approaches have significantly higher positive likelihood ratios (LR+) (eTable 3 in the Supplement).
eTable 1. Included publications
eTable 2. Summary of extracted data from each analysis
eTable 3. Comparison of pooled positive and negative likelihood ratio for responders vs non-responders following anti-PD-(L)1 therapy between different assay modalities
eFigure 1. The clinical co-variates of median patient age, sex, tumor stage and tumor type did not vary between assay modalities
eFigure 2. Summary ROC (sROC) curve analysis by assay modality for responders vs non-responders excluding studies that did not report ORR
eFigure 3. Summary ROC (sROC) curve analysis for each modality with SD alternatively included with responders and non-responders
eFigure 4. Tumor-type specific sROC. (A) NSCLC, (B) Melanoma, and (C) Urothelial carcinoma
eFigure 5. Testing the impact of a uniform cutpoint (upper tertile vs lower two tertiles) on the sROC curve for TMB studies
eFigure 6. Reference points for notable other studies (MSI and TMB+GEP+machine learning) as they relate to the included mIHC/IF studies
eFigure 7. Negative and positive predictive values for each individual study by biomarker assay modality
Lu S, Stein JE, Rimm DL, et al. Comparison of Biomarker Modalities for Predicting Response to PD-1/PD-L1 Checkpoint Blockade: A Systematic Review and Meta-analysis. JAMA Oncol. 2019;5(8):1195–1204. doi:10.1001/jamaoncol.2019.1549
What is the relative diagnostic accuracy of different biomarker assay modalities in predicting clinical response to anti–PD-1/PD-L1 (programmed cell death 1/programmed cell death ligand 1) therapy?
In this systematic review and meta-analysis involving tumor specimens from 8135 patients, multiplex immunohistochemistry/immunofluorescence (mIHC/IF) had significantly higher diagnostic accuracy than PD-L1 IHC, tumor mutational burden, or gene expression profiling in predicting clinical response to anti–PD-1/PD-L1 therapy and was similar to multimodality cross-platform composite approaches, such as PD-L1 IHC + tumor mutational burden.
Multiplex immunohistochemistry/IF facilitates quantification of protein coexpression on immune cell subsets and assessment of their spatial arrangements; initial findings suggest that mIHC/IF has diagnostic accuracy comparable to multimodality cross-platform composite approaches in predicting response to anti–PD-1/PD-L1.
PD-L1 (programmed cell death ligand 1) immunohistochemistry (IHC), tumor mutational burden (TMB), gene expression profiling (GEP), and multiplex immunohistochemistry/immunofluorescence (mIHC/IF) assays have been used to assess pretreatment tumor tissue to predict response to anti–PD-1/PD-L1 therapies. However, the relative diagnostic performance of these modalities has yet to be established.
To compare studies that assessed the diagnostic accuracy of PD-L1 IHC, TMB, GEP, and mIHC/IF in predicting response to anti–PD-1/PD-L1 therapy.
A search of PubMed (from inception to June 2018) and 2013 to 2018 annual meeting abstracts from the American Association for Cancer Research, American Society of Clinical Oncology, European Society for Medical Oncology, and Society for Immunotherapy of Cancer was conducted to identify studies that examined the use of PD-L1 IHC, TMB, GEP, and mIHC/IF assays to determine objective response to anti–PD-1/PD-L1 therapy. For PD-L1 IHC, only clinical trials that resulted in US Food and Drug Administration approval of indications for anti–PD-1/PD-L1 were included. Studies combining more than 1 modality were also included. Preferred Reporting Items for Systematic Reviews and Meta-analysis guidelines were followed. Two reviewers independently extracted the clinical outcomes and test results for each individual study.
Main Outcomes and Measures
Summary receiver operating characteristic (sROC) curves; their associated area under the curve (AUC); and pooled sensitivity, specificity, positive and negative predictive values (PPV, NPV), and positive and negative likelihood ratios (LR+ and LR−) for each assay modality.
Tumor specimens representing over 10 different solid tumor types in 8135 patients were assayed, and the results were correlated with anti–PD-1/PD-L1 response. When each modality was evaluated with sROC curves, mIHC/IF had a significantly higher AUC (0.79) compared with PD-L1 IHC (AUC, 0.65, P < .001), GEP (AUC, 0.65, P = .003), and TMB (AUC, 0.69, P = .049). When multiple different modalities were combined such as PD-L1 IHC and/or GEP + TMB, the AUC drew nearer to that of mIHC/IF (0.74). All modalities demonstrated comparable NPV and LR−, whereas mIHC/IF demonstrated higher PPV (0.63) and LR+ (2.86) than the other approaches.
Conclusions and Relevance
In this meta-analysis, tumor mutational burden, PD-L1 IHC, and GEP demonstrated comparable AUCs in predicting response to anti–PD-1/PD-L1 treatment. Multiplex immunohistochemistry/IF and multimodality biomarker strategies appear to be associated with improved performance over PD-L1 IHC, TMB, or GEP alone. Further studies with mIHC/IF and composite approaches with a larger number of patients will be required to confirm these findings. Additional study is also required to determine the most predictive analyte combinations and to determine whether biomarker modality performance varies by tumor type.
Substantial effort is ongoing to identify predictors of response and resistance to anti–PD-1/PD-L1 (anti–programmed cell death 1/programmed cell death ligand 1) immunotherapy. Expression of PD-L1 protein was the first candidate biomarker associated with response to anti–PD-1 therapy.1,2 Multiple PD-L1 immunohistochemistry (IHC) assays have been approved by the US Food and Drug Administration (FDA) as a companion or complementary diagnostic for patients with non–small cell lung cancer (NSCLC), melanoma, bladder cancer, gastric cancer, and cervical carcinoma to help preselect patients for anti–PD-1/PD-L1 therapy.3,4 These IHC assays have been used to stratify patients in clinical trials. Positive PD-L1 IHC has also recently been shown to enrich for response to combination therapy with anti–PD-1/cytotoxic T lymphocyte antigen-4 in patients with NSCLC.5 However, although PD-L1 expression has been shown to correlate with response to therapy in certain tumor types, the association is not absolute. As a diagnostic assay, PD-L1 IHC has several limitations: multiple different assays are available, the predictive significance of tumor cell vs immune cell expression varies by tumor type,6 the scoring of immune cell PD-L1 expression by pathologists has poor interobserver reproducibility,7 and the PD-L1 expression is commonly reduced to a digital readout (+ vs −) without assessing its expression in the greater context of the tumor microenvironment (TME) (eg, association with immune cells suggests an adaptive pattern of expression).8,9
More recent biomarker approaches include assessing tumor mutational burden (TMB), gene expression profiling (GEP), and quantifying multiple proteins using multiplex IHC/immunofluorescence (mIHC/IF). The assessment of TMB as a biomarker for response sensitivity to immunotherapy is predicated on the concept that more mutations yield more T cell–recognized tumor neoantigens, potentially resulting in stronger antitumor immune responses when the PD-1 checkpoint is blocked.10 Tumor mutational burden was first shown to be associated with response to cytotoxic T lymphocyte antigen-4 blockade in patients with melanoma,11 followed by patients with NSCLC treated with anti–PD-1 therapy (pembrolizumab),12 and now extends to numerous solid tumor types.13
Gene expression profiling allows for the simultaneous assessment of a number of parameters. The mRNA transcript levels of inflammatory genes, immune checkpoint genes, and even oncogenes have been included in various gene panels. This approach has a continuous output and has been used to develop “response signatures” for several tumor types, most of which include an interferon (IFN) gamma gene signature as a major pillar of the assay.14 However, this approach lacks information on cellular coexpression and geography within the TME. In contrast, mIHC/IF allows for the simultaneous visualization of multiple IHC/IF protein markers in situ on the same tissue section. As such, it provides a spatial component that can be used to generate cell density metrics for a given tissue region or to assess the distance between 2 given cell types. Coexpression of multiple markers on a single cell can also be readily visualized. Similar to PD-L1 IHC, TMB, and GEP, mIHC/IF assays have now been associated with response to anti–PD-1/PD-L1 therapies in multiple different tumor types.9,15-17
The purpose of the current analysis was to compare existing data on the diagnostic accuracy of PD-L1 IHC, TMB, GEP, and mIHC/IF biomarker modalities in predicting response to anti–PD-1/PD-L1 therapy. We performed a meta-analysis using summary receiver operating characteristic (sROC) curves to determine the relative area under the curves (AUCs) as a global metric of each approach’s ability to discriminate between responders and nonresponders to therapy. We also determined the relative sensitivity, specificity, predictive values, and likelihood ratios of these emerging approaches and compared them with PD-L1 IHC and each other.
This meta-analysis was conducted in adherence to the Preferred Reporting Items for Systematic Reviews and Meta-analysis (PRISMA) statement.18 Two independent reviewers (S.L. and J.E.S.) performed the literature search, assessed eligibility criteria, and performed data extraction (S.L. and D.W.W.).
The DailyMed website19 was used to identify clinical trials that tested PD-L1 status for an association with response to therapy using IHC and were cited in association with FDA–approved indications for nivolumab, pembrolizumab, atezolizumab, durvalumab, and avelumab monotherapies. The national clinical trial number was gathered and PubMed was searched to identify the associated clinical trial report.
For all other assay modalities, predefined search criteria were used to conduct electronic searches of PubMed (from inception to June 2018). Searches were limited to human studies with English translation available. The search syntax included the following terms: (mutational burden OR mutational load OR mutational density OR mutational landscape OR genomic landscape OR whole exome sequencing OR gene expression profiling OR gene signature OR mRNA OR multiplex immunofluorescence OR multiplex immunohistochemistry OR spatial profiling) AND (anti-PD-1 OR anti-PD-L1 OR nivolumab OR pembrolizumab OR atezolizumab OR durvalumab OR avelumab OR BMS-936558 OR BMS-936559 OR MK-3475 OR MPDL3280A OR MEDI4736 OR MSB0010718C). In addition, the 2013 to 2018 annual meeting abstracts from the American Society of Clinical Oncology, the European Society for Medical Oncology, the Society for Immunotherapy of Cancer, and the American Association for Cancer Research were searched using the keywords “nivolumab,” “pembrolizumab,” “atezolizumab,” “durvalumab,” and “avelumab.” A second search was performed in November 2018 for the definitive manuscript related to abstracts identified during the June 2018 search, and the extracted data was updated to reflect the final publication. Several experts in the field were also surveyed to determine if there were any additional publications or conference abstracts that were not revealed by the electronic search.
Studies that correlated pretreatment tissue-based biomarkers of interest with objective response rate (ORR) (ie, complete response and partial response), progression free survival (PFS), or overall survival (OS) were included if they had at least 15 patients treated with anti-PD-1/PD-L1 monotherapy. Tumor mutational burden, GEP, and mIHC/IF studies that enrolled patients based on PD-L1 status were excluded from all of the single-modality assay approaches but were included in the multimodality group if they met all other criteria. Studies involving patients with hematologic cancers and flow cytometry studies on tumor lysates were also excluded. When studies were identified with overlapping participants, the study with the largest tested population was included.
The following data categories were extracted from the included studies: study name, national clinical trial number, therapy received, biomarkers tested, number of patients tested with each biomarker, and the year of publication or conference presentation. Biomarker results (+ vs −) as they related to objective response to therapy were extracted for each study and used to generate a 2 by 2 contingency table showing true-positive, false-positive, true-negative, and false-negative test results for each study. These values were used to calculate sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (LR+), and negative likelihood ratio (LR−).
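The calculation from a 2 by 2 table to the six accuracy measures listed above can be sketched as follows. This is a minimal illustration using the standard definitions, not the authors' code; the function name, the dataclass, and the example counts are hypothetical, and zero cells are not handled.

```python
from dataclasses import dataclass


@dataclass
class DiagnosticMetrics:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    lr_pos: float
    lr_neg: float


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> DiagnosticMetrics:
    """Compute accuracy measures from a 2 by 2 table (no zero-cell handling)."""
    sensitivity = tp / (tp + fn)  # fraction of responders with a positive test
    specificity = tn / (tn + fp)  # fraction of nonresponders with a negative test
    return DiagnosticMetrics(
        sensitivity=sensitivity,
        specificity=specificity,
        ppv=tp / (tp + fp),  # positive tests that were responders
        npv=tn / (tn + fn),  # negative tests that were nonresponders
        lr_pos=sensitivity / (1 - specificity),
        lr_neg=(1 - sensitivity) / specificity,
    )


# Hypothetical counts, not taken from any included study.
example = diagnostic_metrics(tp=30, fp=20, tn=40, fn=10)
print(example)
```

Here a positive test is taken to mean biomarker-positive and the reference standard is objective response, matching the description in the text.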
In 7 of 56 analyses, ORR was not provided, and OS was gathered.17,20-23 In 2 instances, OS was not available, and PFS was used.24,25 When using OS or PFS, sensitivity and specificity values were calculated at 6 months, 1 year, and 2 years. The Youden statistic (a measure of informedness) was calculated to determine the optimal time point for each individual study that maximized sensitivity and specificity. This point was then included in the meta-analysis to allow the best possible representation for each study. For PD-L1 IHC studies, if multiple thresholds of assay positivity were presented, the FDA-approved cutoff point for the associated companion/complementary diagnostic for that indication was used. For 1 study in which multiple cutoff points were presented, there was no accompanying PD-L1 IHC diagnostic for that indication. In that instance, the Youden statistic was also calculated, and the best performing threshold was chosen.
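The Youden-based selection of a time point (or threshold) can be illustrated with a short sketch. The Youden statistic is J = sensitivity + specificity − 1; the function names and the candidate values below are hypothetical and are not drawn from the included studies.

```python
def youden_j(sensitivity: float, specificity: float) -> float:
    """Youden statistic: J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def best_candidate(candidates: dict) -> str:
    """Return the label whose (sensitivity, specificity) pair maximizes J."""
    return max(candidates, key=lambda label: youden_j(*candidates[label]))


# Hypothetical (sensitivity, specificity) pairs at each survival time point.
candidates = {
    "6 months": (0.80, 0.40),
    "1 year": (0.70, 0.60),
    "2 years": (0.50, 0.70),
}
best = best_candidate(candidates)
print(best)
```

The same helper applies unchanged when the candidates are assay positivity thresholds rather than time points.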
The calculated sensitivity and specificity from each individual biomarker analysis were plotted, and a curve was fit to the points using both weighted and unweighted linear regression models. For the weighted approach, the hierarchical DerSimonian and Laird method was used; the Moses-Littenberg model was applied for the unweighted approach. The sROC curves and the resultant AUC were used to measure the association between the different assay modalities and ORR. The AUCs were compared for statistically significant differences using the Hanley and McNeil method.26 When TMB raw data were available, the upper tertile of TMB vs the lower 2 tertiles within a given study were evaluated against response.24,27
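For readers unfamiliar with the unweighted model, the Moses-Littenberg fit can be sketched as below. Each study is transformed to D = logit(TPR) − logit(FPR) and S = logit(TPR) + logit(FPR), a line D = a + bS is fitted by ordinary least squares, and the line is back-transformed to an sROC curve whose AUC is obtained numerically. This is an illustrative sketch under those textbook definitions, not the Meta-DiSc implementation; the function names, the integration grid, and the example data are assumptions.

```python
import numpy as np


def logit(p):
    return np.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + np.exp(-x))


def moses_littenberg_fit(sens, spec):
    """Unweighted least-squares fit of D = a + b*S across studies."""
    tpr = np.asarray(sens, dtype=float)
    fpr = 1.0 - np.asarray(spec, dtype=float)
    d = logit(tpr) - logit(fpr)  # log diagnostic odds ratio per study
    s = logit(tpr) + logit(fpr)  # proxy for the positivity threshold
    b, a = np.polyfit(s, d, 1)   # polyfit returns slope first, then intercept
    return float(a), float(b)


def sroc_auc(a, b, n_grid=10001):
    """Area under the back-transformed sROC curve (trapezoidal rule)."""
    fpr = np.linspace(1e-6, 1.0 - 1e-6, n_grid)
    # From D = a + b*S: logit(TPR) = a/(1-b) + ((1+b)/(1-b)) * logit(FPR)
    tpr = expit(a / (1.0 - b) + (1.0 + b) / (1.0 - b) * logit(fpr))
    return float(np.sum((tpr[1:] + tpr[:-1]) * np.diff(fpr)) / 2.0)


# Hypothetical per-study sensitivity and specificity values.
sens = [0.60, 0.70, 0.55, 0.80]
spec = [0.70, 0.60, 0.75, 0.65]
a, b = moses_littenberg_fit(sens, spec)
print(a, b, sroc_auc(a, b))
```

Studies with a sensitivity or specificity of exactly 0 or 1 would need a continuity correction before the logit transform, which this sketch omits.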
For each modality, the measures of diagnostic accuracy were pooled to generate an overall metric allowing for comparisons between each modality (eg, pooled sensitivity for PD-L1 IHC vs pooled sensitivity for GEP). Because methodological heterogeneity between included studies was anticipated, a random-effects model was used for pooling the analyzed parameters. Possible publication bias was assessed by examining a funnel plot of the effect size for each study against the reciprocal of its standard error (metafor package in R28).
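As an illustration of random-effects pooling, the sketch below uses the DerSimonian-Laird estimator of between-study variance. The text does not state which estimator was used for the pooled parameters, so this choice is an assumption, as are the function name and the example inputs (hypothetical logit-transformed sensitivities with their variances). At least 2 studies are required.

```python
import numpy as np


def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate, its standard error, and tau^2."""
    y = np.asarray(effects, dtype=float)
    v = np.asarray(variances, dtype=float)
    w = 1.0 / v                                  # fixed-effect weights
    fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - fixed) ** 2)             # Cochran Q statistic
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)      # between-study variance
    w_star = 1.0 / (v + tau2)                    # random-effects weights
    pooled = np.sum(w_star * y) / np.sum(w_star)
    se = np.sqrt(1.0 / np.sum(w_star))
    return float(pooled), float(se), float(tau2)


# Hypothetical logit-sensitivities and within-study variances.
pooled, se, tau2 = dersimonian_laird([0.2, 0.9, 0.5], [0.05, 0.10, 0.08])
print(pooled, se, tau2)
```

When the studies are homogeneous, tau^2 is truncated at 0 and the result reduces to the fixed-effect inverse-variance estimate.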
The dedicated meta-analysis software Meta-DiSc was used for evaluation of the various biomarker tests.29 All statistical tests were 2-sided and P < .05 was considered statistically significant. A Bonferroni correction was applied to account for multiple comparisons.
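A 2-sided comparison of 2 AUCs with a Bonferroni-adjusted significance threshold might look like the following sketch. It uses the Hanley and McNeil (1982) standard error formula and treats the 2 AUCs as independent, which is a simplifying assumption (some reports contributed to more than 1 modality). The function names and all counts are hypothetical, and the sketch is not intended to reproduce the reported P values.

```python
import math


def hanley_mcneil_se(auc: float, n_pos: int, n_neg: int) -> float:
    """Standard error of an AUC (Hanley and McNeil, 1982)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    var = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(var)


def compare_aucs(auc1: float, se1: float, auc2: float, se2: float):
    """z statistic and 2-sided P value, assuming independent AUCs."""
    z = (auc1 - auc2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p


def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold after Bonferroni correction."""
    return alpha / n_comparisons


# Hypothetical responder/nonresponder counts for 2 modalities.
se_a = hanley_mcneil_se(0.79, n_pos=60, n_neg=140)
se_b = hanley_mcneil_se(0.65, n_pos=600, n_neg=1400)
z, p = compare_aucs(0.79, se_a, 0.65, se_b)
print(z, p, p < bonferroni_alpha(0.05, 3))
```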
The search strategy identified 45 eligible reports that assayed either PD-L1 IHC (n = 24),20,30-51 TMB (n = 10),12,13,16,17,21,22,52-56 GEP (n = 9),13,21-23,40,54,57-59 mIHC/IF (n = 7),9,15-17,25,60,61 or multimodality (PD-L1 IHC + TMB and/or GEP, n = 6)13,24,52,59 and correlated the results with response to anti–PD-1/PD-L1 therapy (Figure 1 and eTable 1 in the Supplement). Eleven studies reported either multiple different clinical study cohorts or more than 1 of the individual biomarker approaches, resulting in a total of 56 individual analyses. eTable 2 in the Supplement provides a summary of study size, median patient age, sex, tumor stage, tumor type, treatment, ORR, and assay performance characteristics for each trial. Clinical covariates such as age, sex, tumor stage, and tumor type did not vary between assay modalities (eFigure 1 in the Supplement). In total, specimens from 8135 patients with more than 10 different solid tumor types are represented in the meta-analysis.
The derived sensitivity and 1−specificity values from each report were plotted (Figure 2A). The points were fitted with sROC curves using 2 different approaches: 1 that weighted each study by the number of patients enrolled, and 1 in which each study was given equal weight. By weighted sROC curves, mIHC/IF had a significantly higher AUC (0.79) than PD-L1 IHC (0.65, P < .001), GEP (0.65, P = .003), and TMB (0.69, P = .049) (Figure 2B). Neither TMB nor GEP had an AUC profile that significantly differed from that of PD-L1 IHC. Most of the GEP studies included an IFN gamma gene signature. However, the broad category of GEP as reported herein also includes reports of other gene signatures associated with therapeutic resistance (Nos. 35 and 42 in eTable 1 in the Supplement). When those reports were excluded, the sROC curve AUC for only IFN gamma–based GEP studies was 0.65. When each reported study was given equal weight, irrespective of the number of patients enrolled, the relative AUC results remained consistent (Figure 2C). Similarly, when the 9 (16%) of 56 analyses17,20-25 with ORR imputed from OS/PFS data were excluded from the analysis, the AUC results remained consistent (eFigure 2 in the Supplement).
Approaches that combined variables across multiple platforms were also explored. The sROC curves from the multimodality biomarkers approached the AUC of mIHC/IF (0.74 vs 0.79, P = .48) (Figure 3). This supports earlier findings showing that measures of an “inflamed” TME combined with TMB have additive prognostic or predictive value over either parameter alone.13,62
Of 47 studies that reported ORR, 45 (96%) included stable disease (SD), with progressive disease (PD) reported under the heading of “nonresponders.”9,12,13,15,30-61 For the 18 studies that reported SD as a separate category that could be reanalyzed,9,12,36,38-41,44,47,49,51,52,55,56,58,61 we compared the sROC curves for each modality when SD was included with responders rather than with PD, and the resulting AUCs did not noticeably differ (eFigure 3 in the Supplement). Subgroup analysis by tumor type was also performed (eFigure 4 in the Supplement), but a larger number of studies will be necessary to resolve any potential differences in assay modality performance by tumor type.
Subgroup analysis was also performed on the study cohorts that assessed for TMB to explore the influence of various approaches used for thresholding cases into high TMB vs low TMB categories. The original analysis was performed using the threshold of positivity set by each individual study. For the 6 of 10 studies that had raw data available, we used a uniform thresholding approach, whereby the upper tertile of TMB was considered a positive test and the remainder was considered a negative test.12,17,52,54-56 In this subset of cases, the profiles of the sROC curves for TMB were not noticeably affected by the use of a uniform approach, with AUCs of 0.71 in both analyses (eFigure 5 in the Supplement).
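The uniform thresholding step described above can be sketched as follows: within a study, patients whose TMB falls in the upper tertile are called test-positive, and the calls are cross-tabulated against response. The quantile interpolation, the strict inequality at the cutoff, the function names, and the example data are all assumptions for illustration.

```python
import numpy as np


def upper_tertile_positive(tmb_values):
    """Flag values above the within-study 2/3 quantile as test-positive."""
    tmb = np.asarray(tmb_values, dtype=float)
    cutoff = np.quantile(tmb, 2 / 3)
    return tmb > cutoff


def two_by_two(test_positive, responder):
    """Return (tp, fp, tn, fn) counts for test calls vs objective response."""
    test_positive = np.asarray(test_positive, dtype=bool)
    responder = np.asarray(responder, dtype=bool)
    tp = int(np.sum(test_positive & responder))
    fp = int(np.sum(test_positive & ~responder))
    tn = int(np.sum(~test_positive & ~responder))
    fn = int(np.sum(~test_positive & responder))
    return tp, fp, tn, fn


# Hypothetical per-patient TMB values and response labels (1 = responder).
tmb = [1, 2, 3, 4, 5, 6, 7, 8, 9]
response = [0, 0, 0, 0, 1, 0, 1, 1, 0]
calls = upper_tertile_positive(tmb)
print(two_by_two(calls, response))
```

The resulting counts feed directly into the sensitivity and specificity calculation used for each point on the sROC curve.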
Additionally, although studies focused on microsatellite instability (MSI) were not included in this meta-analysis, a study reporting MSI as a predictor of response to anti–PD-1/PD-L1 therapy63 is included alongside the mIHC/IF data points and sROC curves for comparison (eFigure 6 in the Supplement) and demonstrates that mIHC/IF assays may have sensitivity and specificity similar to that of MSI status when predicting response to anti–PD-1/PD-L1 therapy. Similarly, a study that included a machine learning component for biomarker discovery using TMB and GEP data64 is shown in eFigure 6 in the Supplement for interest. The high sensitivity and specificity achieved in that study suggest that machine learning algorithms may further improve diagnostic accuracy.
The PPV and NPV for each study are plotted in Figure 4A and eFigure 7 in the Supplement. Most modalities provide relatively high NPV. PD-L1 immunohistochemistry, GEP, and multimodality biomarkers have relatively low PPV. In contrast, mIHC/IF assays consistently demonstrate a high PPV (Table). The PPV and NPV plots for NSCLC, melanoma, and urothelial carcinoma are shown separately in eFigure 4 in the Supplement.
The pooled LRs for each modality are shown in Figure 4B and eTable 3 in the Supplement. Multiplex immunohistochemistry/IF and the multimodality biomarkers have LRs+ that are significantly higher than those of PD-L1 IHC (2.86 and 2.76, respectively, vs 1.51), indicating that they are less likely to produce a false-positive result. The pooled LRs− for GEP (0.65), TMB (0.62), PD-L1 IHC (0.69), and multimodality approaches (0.60) were essentially equivalent. The mIHC/IF studies demonstrated a trend toward an improved LR− compared with the other modalities, although this difference was not statistically significant.
The pooled sensitivity and specificity for each modality are summarized in the Table. Gene expression profiling and mIHC/IF have improved sensitivity compared with that of PD-L1 IHC, whereas TMB, mIHC/IF, and multimodality have improved specificity. Gene expression profiling actually has a lower specificity for response than PD-L1 IHC. It is not immediately clear why this is the case, especially because many of the GEP panels detect PD-L1 RNA. Possible explanations include a lack of correlation between RNA and protein levels for the markers of interest or the fact that RNA is less stable than protein and thus is harder to preserve and later detect as an analyte. In addition, protein-based measurements are commonly associated with a higher dynamic range than RNA analytes.65
Because there are relatively fewer publications with small cohort sizes for the newer modalities compared with those for PD-L1 IHC, a funnel plot was used to assess potential publication bias. Bias appeared to be trivial, with no effect on major findings (data available from the authors).66
Pretreatment predictive biomarkers for immuno-oncology are sought after to help match individual patients to the treatment regimen most likely to be of benefit. Predictive biomarkers may also accelerate clinical trials and FDA approvals, aid in cost containment, and help members of the medical community provide accurate patient guidance. Currently, a number of different biomarker modalities are being pursued. The most common tissue-based biomarker approaches for predicting response to anti–PD-1/PD-L1 therapy include PD-L1 IHC, TMB, GEP, and mIHC/IF. This meta-analysis provides an overview of the relative diagnostic accuracy of these approaches. We found that mIHC/IF has a significantly higher AUC than PD-L1 IHC, GEP, and TMB and has improved LR+ and LR− compared with the other biomarker approaches. PD-L1 IHC, the most well-established biomarker for response to anti–PD-1/PD-L1 therapy, has one of the lowest AUCs and has generally poor LRs.
The AUC serves as a global measure of how well an assay can distinguish between 2 groups, in this instance responders and nonresponders to anti–PD-1/PD-L1 therapy. Consensus guidelines regarding acceptable AUCs for companion and complementary diagnostic tests do not exist. However, some authors have suggested that diagnostic tests used for patient selection should have AUCs of 0.80 or higher.67-69 In this study, we demonstrate that mIHC/IF has an AUC in this range and that combining different biomarker modalities can also result in an AUC that approaches this target threshold.
It is possible that 2 biomarker approaches could demonstrate near equivalent AUCs, yet have differing sensitivities and specificities. Additional measures of diagnostic accuracy are useful for further assessing how modalities perform at ruling in vs ruling out a patient for a given therapy. Most approaches had a similar ability to identify patients who were not likely to respond to therapy, as indicated by comparable NPVs and LRs−. The notable difference was in identifying patients most likely to respond to therapy. In this regard, mIHC/IF had a significantly higher PPV and LR+. This means that there are fewer false-positive tests, that is, patients who would be treated but would not respond to therapy. This translates practically to being able to match a patient to a potentially more effective treatment sooner as well as not exposing a patient to potential immune-related adverse effects if they are less likely to respond. Furthermore, because an average treatment course typically costs more than $120 000, using biomarker strategies with improved diagnostic accuracy may help avoid considerable costs to the health care system for much less likely anticipated benefit.
The improved performance of mIHC/IF cannot simply be attributed to a larger number of analytes studied because the GEP methods assayed a median of 17 analytes vs 2 or 3 for mIHC/IF. Instead, these findings suggest improved diagnostic benefit when spatial relationships and protein coexpression on specific cellular subsets are assessed. The studies of mIHC/IF assays included herein evaluated parameters such as PD-1 to PD-L1 proximity (n = 2 studies15,16), CD8+ cell density within a specific TME compartment (eg, intratumoral/peritumoral defined by IHC/IF tumor marker, n = 3 studies9,60,61), or coexpressed markers indicating T-cell activation (n = 2 studies17,25). Spatial and coexpression assessments are not represented in GEP assays where specimens were homogenized before being assayed. There are emerging high multiplex protein detection approaches, including multispectral platforms,15,70,71 serial stain and strip,72,73 imaging mass cytometry,74,75 and multiplexed ion beam imaging.76,77 Although it is anticipated that the inclusion of additional markers will yield improved predictive value, large data sets for such comparisons are not yet available.
Many of the modalities test for parameters representing an “inflamed” TME. The aforementioned mIHC/IF assays characterize T-cell activation states, immune checkpoint expression, and/or the density of specific T-cell subsets. Many GEP studies contain an IFN gamma gene signature, and PD-L1 IHC expression may be driven by IFN gamma following T-cell infiltration.8,78 In contrast, TMB does not necessarily correlate with an “inflamed” TME62,79 but represents the possibility of there being immunogenic mutation-associated neoantigens present within the tumor. An analysis performed by Danilova et al62 on melanoma specimens from The Cancer Genome Atlas data set showed that TMB and an inflamed phenotype were distinct variables when predicting patient survival and that TMB specifically had added prognostic value when tumors were less inflamed. The concept that TMB and a marker of an inflamed TME have potential additive value is supported in the present study by the fact that combined biomarker approaches such as PD-L1 IHC + TMB demonstrated a higher AUC and LR+ compared with that of PD-L1 IHC, TMB, or GEP alone. Importantly, the combination of mIHC/IF + TMB has yet to be tested to determine whether the incorporation of TMB can further elevate the mIHC/IF AUC.
The limitations of this study include the fact that there were different assays represented within a single-assay type category. For example, the studies of PD-L1 IHC included herein used different companion diagnostic assays with different PD-L1 antibodies, thresholds of positivity, and scoring systems. Similarly, the mIHC/IF assays tested different protein targets using different platforms as described earlier, and the number of mRNA targets detected by the GEP assays ranged from 1 (IFN gamma) to 26 different gene sets representing about 690 individual genes. In contrast, TMB arguably represents a more uniform output. Furthermore, the patient outcome tested in this study was ORR. Although it is recognized that ORR correlates with survival,80 it would be of great interest to specifically test for an association between these assay modalities and PFS and/or OS; however, such long-term follow-up is not yet available for most studies. Finally, it is a significant limitation that mIHC/IF is the newest method of those tested. Even though the analysis was weighted by specimen numbers, the mIHC/IF test numbers used in the study were less than 10% of PD-L1 test numbers, thus warranting caution in the interpretation of this modality. However, given the early promise of mIHC/IF, we look forward to broader future testing.
The increasing number of published studies reporting on predictive biomarkers for response to anti–PD-1/PD-L1 therapy has provided an opportunity to assess the accuracy of individual assay modalities. This meta-analysis suggests that mIHC/IF merits further investigation. The relative success of mIHC/IF in predicting patient response also provides insight into the spatial importance of tumor-immune interactions and the contribution of protein marker coexpression. Importantly, this difference was achieved with a relatively “low plex,” that is, an average of 2 to 3 markers examined. Future improvements in diagnostic accuracy are likely to be made by increasing the number of markers detected in the mIHC/IF format and by developing multiplex, multimodal approaches that potentially combine GEP and/or TMB with mIHC/IF.
Accepted for Publication: March 15, 2019.
Corresponding Author: Janis M. Taube, MD, Division of Dermatopathology, Johns Hopkins University, 600 N Wolfe St, Blalock Building Room 907, Baltimore, MD 21287 (firstname.lastname@example.org).
Published Online: July 18, 2019. doi:10.1001/jamaoncol.2019.1549
Author Contributions: Dr Taube and Mr Lu had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Lu, Rimm, Hoyt, Pardoll, Taube.
Acquisition, analysis, or interpretation of data: Lu, Stein, D. Wang, Bell, Johnson, Sosman, Schalper, Anders, H. Wang, Hoyt, Danilova, Taube.
Drafting of the manuscript: Lu, Stein, D. Wang, Johnson, Sosman, Danilova, Taube.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Lu, Stein, D. Wang, H. Wang, Danilova.
Administrative, technical, or material support: Lu, Stein, D. Wang, Schalper, Anders, Taube.
Study supervision: Rimm, Hoyt, Pardoll, Danilova, Taube.
Conflict of Interest Disclosures: Dr Rimm reports personal fees from and serves on the advisory board of Amgen, personal fees from Bristol-Myers Squibb, Merck, GlaxoSmithKline, Daiichi Sankyo, Konica Minolta, personal fees from and serves on the advisory board of Cell Signaling Technology, grants and personal fees from Cepheid, AstraZeneca, NextCure, Ultivue, Ventana, Perkin Elmer, grants from Lilly, patents including AQUA software licensing and Navigate Biopharma (Yale owned patent). Dr Johnson serves on the advisory board of Array Biopharma, Bristol-Myers Squibb, Genoptix, Incyte, Merck, and Novartis; receives grant funding from Bristol-Myers Squibb and Incyte; patent pending for using MHC-II as a biomarker for immunotherapy responses. Dr Schalper reports grant funding from Navigate Biopharma, Vasculox, Tesaro, Takeda, Surface Oncology, and Bristol-Myers Squibb; receives grant funding and consulting fees from Celgene, Shattuck Labs, Pierre Fabre, Moderna Therapeutics, AstraZeneca, AbbVie, and Merck; and receives speaking fees from Merck and Fluidigm. Dr Anders receives grant funding from FLX Bio and Five Prime Therapeutics, and is a consultant for Bristol-Myers Squibb, Merck, and AstraZeneca. Mr Hoyt is employed by Akoya Biosciences and owns Akoya Biosciences stock and stock options. Dr Pardoll reported other support from Aduro Biotech, Amgen, Bayer, Camden Partners, DNAtrix, Dracen, Dynavax, Five Prime, FLX Bio, Immunomic, Janssen, Merck, Rock Springs Capital, Potenza, Tizona, Trieza, and WindMil during the conduct of the study; grants from AstraZeneca, Medimmune/Amplimmune, and Compugen; grants and other support from ERvaxx and Potenza. Dr Taube reports nonfinancial support from Akoya during the conduct of the study; grants and personal fees from Bristol-Myers Squibb, personal fees from Merck, AstraZeneca, and Amgen outside the submitted work; equipment and reagents from Akoya Biosciences, and a patent pending related to image processing of mIF/IHC images.
No other disclosures were reported.
Funding/Support: This work was supported by the Melanoma Research Alliance (Dr Taube); Harry J. Lloyd Trust (Dr Taube); the Emerson Collective (Dr Taube); Moving for Melanoma of Delaware (Dr Taube); Bristol-Myers Squibb (Drs Taube, Stein, Pardoll, and Ms Wang); Navigate BioPharma (Dr Rimm); Sidney Kimmel Cancer Center Core Grant P30 CA006973 (Drs Taube and Danilova); Yale Cancer Center P30 CA016359 (Dr Rimm); National Institutes of Health (NIH) Lung SPORE in Lung Cancer P50CA196530 (Drs Rimm and Schalper); Department of Defense Lung Cancer Research Program award W81XWH-16-1-0160 (Dr Schalper); Stand Up To Cancer/AACR SU2C-AACR-DT17-15 SU2C-AACR-DT22-17.ACS (Dr Schalper); Melanoma Professorship No. RP-14-246-06 (Dr Sosman); National Cancer Institute R01 CA142779 (Drs Taube and Pardoll); NIH T32 CA193145 (Dr Stein); P50 CA062924 (Dr Anders); K23 CA204726 (Dr Johnson); and The Bloomberg~Kimmel Institute for Cancer Immunotherapy.
Role of Funder/Sponsor: The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, and approval of the manuscript; or decision to submit the manuscript for publication.
Additional Contributions: The authors would like to acknowledge Matthew Hellmann, MD (Memorial Sloan Kettering Cancer Center), Evan Lipson, MD, and Suzanne L. Topalian, MD (both Johns Hopkins University), and Robin Edwards, MD (Bristol-Myers Squibb), for helpful discussions. These contributions were not compensated.