Gene expression heat map with each column representing the difference in expression between subpopulations enriched with leukemic stem cells (LSCs) or leukemia progenitor cells (LPCs) isolated from the same patient with acute myeloid leukemia.12,13 Hs indicates LSC/LPC profile purified from primary human patient specimen; Mm, corresponding samples from mouse xenografts. A total of 52 unique genes were identified as differentially expressed between LSC and LPC at a 10% false discovery rate (eTable 2), with red indicating higher expression in LSC.
Analysis of 17 119 genes (eTable 2 for gene set definitions). Vertical bars in each of the 6 rows represent genes from each of the indicated gene sets. All nominal P values were less than .001. AML indicates acute myeloid leukemia; FDR, false discovery rate; NES, normalized enrichment score.
All of the data are from Stanford cases.12 The boxes span the interquartile range with the median depicted by the thick horizontal bar. Each circle indicates 1 sample. Wilcoxon rank sum test P <.002 for LSC compared with LPC and leukemic blast cells (BLAST). Wilcoxon rank sum test P <.001 for HSC compared with multipotent progenitor (MPP), common myeloid progenitor (CMP), granulocyte-monocyte progenitor (GMP), and megakaryocyte-erythrocyte progenitor (MEP) cells. AML indicates acute myeloid leukemia; LPC, leukemic progenitor cell. Error bars indicate full range.
Stratification of outcomes using this approach is depicted for overall survival in the training set16 and in one of the validation sets and event-free survival in one of the validation sets.17 Vertical ticks on curves indicate censored events. Similar results were obtained in the other independent data sets (Table 2 and eFigure 4).
The LSC score was significantly associated with initial therapeutic response as determined by the ability to achieve clinical remission in 2 data sets for which this information was available. Y-axis shown in blue indicates range of LSC score from 10 to 20. Error bars indicate full range.
Gentles AJ, Plevritis SK, Majeti R, Alizadeh AA. Association of a Leukemic Stem Cell Gene Expression Signature With Clinical Outcomes in Acute Myeloid Leukemia. JAMA. 2010;304(24):2706–2715. doi:10.1001/jama.2010.1862
Author Affiliations: Department of Radiology, Lucas Center for MR Spectroscopy and Imaging (Drs Gentles and Plevritis), and Department of Internal Medicine, Divisions of Oncology (Dr Alizadeh) and Hematology (Drs Majeti and Alizadeh), Cancer Center and Institute for Stem Cell Biology and Regenerative Medicine (Drs Majeti and Alizadeh), School of Medicine, Stanford University, Palo Alto, California.
Context In many cancers, specific subpopulations of cells appear to be uniquely capable of initiating and maintaining tumors. The strongest support for this cancer stem cell model comes from transplantation assays in immunodeficient mice, which indicate that human acute myeloid leukemia (AML) is driven by self-renewing leukemic stem cells (LSCs). This model has significant implications for the development of novel therapies, but its clinical relevance has yet to be determined.
Objective To identify an LSC gene expression signature and test its association with clinical outcomes in AML.
Design, Setting, and Patients Retrospective study of global gene expression (microarray) profiles of LSC-enriched subpopulations from primary AML and normal patient samples, which were obtained at a US medical center between April 2005 and July 2007, and validation data sets of global transcriptional profiles of AML tumors from 4 independent cohorts (n = 1047).
Main Outcome Measures Identification of genes discriminating LSC-enriched populations from other subpopulations in AML tumors; and association of LSC-specific genes with overall, event-free, and relapse-free survival and with therapeutic response.
Results Expression levels of 52 genes distinguished LSC-enriched populations from other subpopulations in cell-sorted AML samples. An LSC score summarizing expression of these genes in bulk primary AML tumor samples was associated with clinical outcomes in the 4 independent patient cohorts. High LSC scores were associated with worse overall, event-free, and relapse-free survival among patients with either normal karyotypes or chromosomal abnormalities. For the largest cohort of patients with normal karyotypes (n = 163), the LSC score was significantly associated with overall survival as a continuous variable (hazard ratio [HR], 1.15; 95% confidence interval [CI], 1.08-1.22; log-likelihood P <.001). The absolute risk of death by 3 years was 57% (95% CI, 43%-67%) for the low LSC score group compared with 78% (95% CI, 66%-86%) for the high LSC score group (HR, 1.9 [95% CI, 1.3-2.7]; log-rank P = .002). In another cohort with available data on event-free survival for 70 patients with normal karyotypes, the risk of an event by 3 years was 48% (95% CI, 27%-63%) in the low LSC score group vs 81% (95% CI, 60%-91%) in the high LSC score group (HR, 2.4 [95% CI, 1.3-4.5]; log-rank P = .006). In multivariate Cox regression including age, mutations in FLT3 and NPM1, and cytogenetic abnormalities, the HRs for LSC score in the 3 cohorts with data on all variables were 1.07 (95% CI, 1.01-1.13; P = .02), 1.10 (95% CI, 1.03-1.17; P = .005), and 1.17 (95% CI, 1.05-1.30; P = .005).
Conclusion High expression of an LSC gene signature is independently associated with adverse outcomes in patients with AML.
Acute myeloid leukemia (AML) is an aggressive malignancy of the bone marrow characterized by accumulation of early myeloid blood cells that fail to mature and differentiate. The course of the disease is marked by poor prognosis, frequent relapse, and high disease–related mortality.1,2 Recent clinical investigation has focused on the identification of prognostic subgroups in adult AML with the goal of guiding patients into risk-adapted therapies. Such investigation determined that cytogenetic abnormalities are prognostic, some favorable and others unfavorable,3,4 yet up to 50% of patients have normal karyotype AML with a wide range of clinical outcomes. In these patients, the presence of specific molecular mutations can provide prognostic information, including internal tandem duplications within the FLT3 gene, partial tandem duplication of the MLL gene, mislocalizing mutations of the NPM1 gene, mutations in the CEBPA and RAS genes, and increased expression of the BAALC and ERG genes.5,6 However, these parameters and others such as patient age are only partially successful at capturing risk of relapse and patient outcomes following treatment.
A growing body of evidence suggests that specific cancer cell subpopulations possess the ability to initiate and maintain tumors.7,8 Acute myeloid leukemia is the paradigm for which this cancer stem cell hypothesis has been advanced, and this model has major implications for the development of novel therapeutic agents.9 There is significant experimental evidence indicating that AML is organized as a hierarchy of malignant cells initiated and maintained by self-renewing leukemic stem cells (LSCs) that comprise a subset of the total leukemic burden (eFigure 1).7,10 These LSCs are enriched in the fraction of cells with positive CD34 and negative CD38 expression (herein referred to as the LSC-enriched subpopulation), and in turn give rise to leukemic progenitor cells (LPCs) positive for CD34 and CD38, which further differentiate into the negative CD34 leukemic blast population.10,11 A major implication of this cancer stem cell model is that the LSCs must be eliminated to eradicate the cancer and cure the patient.7,8 While AML was the first human malignancy for which this model gained experimental support, its clinical significance has yet to be fully established.
We hypothesized that if the cancer stem cell model accurately reflects the biology of human AML, then patients with LSC enrichment should have worse clinical outcomes, even when accounting for known prognostic parameters, and that this association could be quantified by global gene expression profiling of bulk AML samples.
Seven human AML tumor samples were obtained at the Stanford University Medical Center (Palo Alto, California) between April 2005 and July 2007, according to an approved protocol of the institutional review board after informed consent. Normal human bone marrow mononuclear cells were purchased from AllCells Inc (Emeryville, California), and human cord blood was obtained from Stanford University Medical Center. Normal and leukemic subpopulations were purified from peripheral blood and/or bone marrow by fluorescence-activated cell sorting using the antibodies shown in eFigure 1 as follows: AML LSCs (n = 7), AML LPCs (n = 7), AML blasts (n = 7), normal hematopoietic stem cells (HSCs) (bone marrow and cord blood, n = 7), normal multipotent progenitors (bone marrow and cord blood, n = 8), normal common myeloid progenitors (bone marrow, n = 4), normal granulocyte-monocyte progenitors (bone marrow, n = 4), and megakaryocyte-erythrocyte progenitors (bone marrow, n = 4). Global transcriptional profiles were generated for each sample using Affymetrix U133 Plus 2.0 gene expression microarrays (Affymetrix, Santa Clara, California). Raw data were deposited at the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus GSE24006. The detailed experimental procedures for the purification of cell subpopulations have been reported previously.12
Global gene expression profiles of 14 paired LSC-enriched and LPC-enriched subpopulations purified from the 7 AML patient samples from Stanford University Medical Center were combined with 16 paired profiles (8 LSC-enriched and 8 LPC-enriched subpopulations) from an independent study13 to produce 1 data set of 30 samples for these analyses. All genes profiled on microarrays were ranked by the mean ratio of their expression between paired LSC-enriched and LPC-enriched subpopulations, and evaluated using Gene Set Enrichment Analysis.14 This approach assessed whether any predefined groups of biologically related genes were concordantly more highly expressed in LSC-enriched or LPC-enriched subpopulations. Individual genes expressed more highly in LSC-enriched subpopulations compared with LPC-enriched subpopulations (or vice versa) were identified using Significance Analysis of Microarrays15 (false discovery rate <10%). Ingenuity Pathways Analysis was used to identify interaction networks involving these genes.
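The paired ranking step described above can be illustrated with a minimal sketch. It assumes that expression values are on a log2 scale, so that a mean ratio becomes a mean paired difference; the gene names and values are invented for illustration and are not data from the study. The function name is hypothetical, and the sketch does not reproduce Gene Set Enrichment Analysis or Significance Analysis of Microarrays themselves.

```python
from statistics import mean

def rank_genes_by_paired_difference(lsc, lpc):
    """Rank genes by mean paired log-ratio (LSC minus LPC), highest first.

    lsc, lpc: dict mapping gene -> list of log2 expression values, where
    position k in both lists comes from the same patient (paired samples).
    """
    scores = {}
    for gene in lsc:
        # Paired difference per patient, then averaged across patients.
        diffs = [a - b for a, b in zip(lsc[gene], lpc[gene])]
        scores[gene] = mean(diffs)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy values, invented purely for illustration (not data from the study).
lsc = {"GENE_A": [9.0, 8.5, 9.2], "GENE_B": [5.0, 5.1, 4.9], "GENE_C": [3.0, 3.5, 2.5]}
lpc = {"GENE_A": [7.0, 7.5, 7.2], "GENE_B": [5.0, 5.0, 5.0], "GENE_C": [6.0, 5.5, 6.5]}
ranking = rank_genes_by_paired_difference(lsc, lpc)
```

Genes at the top of such a ranking are more highly expressed in the LSC-enriched fraction; genes at the bottom are more highly expressed in the LPC-enriched fraction.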
The genes that were more highly expressed in the LSC-enriched subpopulations relative to the LPC-enriched subpopulations were summarized by an LSC signature. The LSC signature was defined as a single number representing the relative expression of the LSC-enriched genes in a given sample compared with the other samples in the same data set. The signature was computed as the first principal component of the gene expression data matrix whose rows were the LSC-enriched genes. Each column of the matrix represented 1 sample, and matrix entries were the gene expression values corresponding to each gene in each sample. By definition, the first principal component of such a data matrix is the weighted sum of the genes' expression levels that explains the maximum possible amount of their total variation across all samples. Thus, for a group of n genes, each sample S is associated with 1 number: $\text{LSC signature}_S = a_1 g_{1S} + a_2 g_{2S} + \dots + a_n g_{nS}$, where $g_{iS}$ is the expression level of gene i in sample S, and is weighted by $a_i$ in the summation. The LSC signature was evaluated in the purified normal and leukemic subpopulations to investigate the expression of the LSC-enriched genes beyond the LSC-enriched and LPC-enriched subpopulations used to identify them.
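As a minimal sketch of how a first principal component yields one weight per gene and one number per sample, the following uses power iteration on the gene-by-gene covariance matrix. Several choices here are assumptions made for illustration and are not described in the article: the use of power iteration, centering each gene before estimating weights, applying the weights to uncentered expression values, and the sign convention. The expression values are invented.

```python
import math

def first_principal_component(matrix, iterations=1000):
    """Return (weights, scores) for the first principal component.

    matrix: list of rows, one row per gene, one column per sample.
    weights: one unit-length coefficient per gene (the a_i in the text).
    scores: one number per sample, the weighted sum of expression values.
    """
    n_genes = len(matrix)
    n_samples = len(matrix[0])
    # Center each gene across samples before estimating covariance.
    means = [sum(row) / n_samples for row in matrix]
    centered = [[v - m for v in row] for row, m in zip(matrix, means)]
    cov = [[sum(a * b for a, b in zip(centered[i], centered[j])) / (n_samples - 1)
            for j in range(n_genes)] for i in range(n_genes)]
    # Power iteration converges to the leading eigenvector of cov
    # (assumes the start vector is not orthogonal to it).
    w = [1.0 / math.sqrt(n_genes)] * n_genes
    for _ in range(iterations):
        v = [sum(cov[i][j] * w[j] for j in range(n_genes)) for i in range(n_genes)]
        norm = math.sqrt(sum(x * x for x in v))
        w = [x / norm for x in v]
    # Sign is arbitrary in PCA; assumed convention: weights sum to a positive value.
    if sum(w) < 0:
        w = [-x for x in w]
    scores = [sum(w[i] * matrix[i][s] for i in range(n_genes)) for s in range(n_samples)]
    return w, scores

def variance(values):
    """Sample variance (denominator n - 1)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

# Toy matrix: 3 hypothetical genes (rows) by 4 samples (columns).
expression = [
    [8.0, 9.0, 10.0, 11.0],
    [5.0, 5.5, 6.5, 7.0],
    [7.0, 6.8, 7.1, 7.0],
]
weights, signature = first_principal_component(expression)
```

Because the weights have unit length, the variance of the resulting signature across samples is at least as large as the variance of any single gene, which is the sense in which the first principal component explains the maximum amount of variation.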
To test associations between the LSC-enriched genes and clinical outcomes, a retrospective training–validation scheme was adopted. Raw microarray data were obtained for 4 publicly available bulk AML gene expression studies with available clinical annotations16-20 from the NCBI Gene Expression Omnibus GSE12417 (n = 163 normal karyotype AML only with overall survival outcomes); GSE10358 (n = 184 with overall survival and event-free survival known for 178 cases); GSE14468 (n = 526 with overall survival and event-free survival known for 262 cases and relapse-free survival known for 213 cases) and from the National Cancer Institute caArray database (willm-00119 [https://array.nci.nih.gov/caarray/project/willm-00119], n = 170 with overall survival only). The largest normal karyotype AML data set16 (n = 163) was used as the training set and the other 3 data sets were used for validation. The LSC signature was calculated in the training cohort and the same weights were then applied to the test cohorts. Because the gene weights were not recomputed in each test cohort to ensure unbiased validation, the resulting number for each sample was referred to as the LSC score. In the training set, the LSC signature and LSC score were identical by definition. The median LSC score in the training set was used to partition patients in all cohorts into high vs low LSC score groups.
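The training and validation scheme can be sketched as follows: gene weights and the median threshold are fixed in the training cohort and then applied unchanged to a validation cohort. The gene names, weights, and expression values are hypothetical, and the handling of scores exactly equal to the threshold (assigned here to the low group) is an assumption, since the text does not state it.

```python
from statistics import median

def lsc_score(weights, sample_expression):
    """Weighted sum of expression values using fixed, pre-computed gene weights."""
    return sum(weights[gene] * sample_expression[gene] for gene in weights)

def assign_groups(scores, threshold):
    """Label each score 'high' if strictly above the threshold, else 'low' (assumed tie rule)."""
    return ["high" if s > threshold else "low" for s in scores]

# Hypothetical weights, standing in for those derived in the training cohort.
weights = {"GENE_A": 0.5, "GENE_B": 0.25}

training = [
    {"GENE_A": 8.0, "GENE_B": 4.0},
    {"GENE_A": 10.0, "GENE_B": 8.0},
    {"GENE_A": 12.0, "GENE_B": 4.0},
    {"GENE_A": 6.0, "GENE_B": 4.0},
]
validation = [
    {"GENE_A": 14.0, "GENE_B": 4.0},
    {"GENE_A": 4.0, "GENE_B": 8.0},
    {"GENE_A": 10.0, "GENE_B": 4.0},
]

training_scores = [lsc_score(weights, s) for s in training]
threshold = median(training_scores)  # fixed once, in the training set only

# The same weights and the same threshold are reused; nothing is refit.
validation_scores = [lsc_score(weights, s) for s in validation]
validation_groups = assign_groups(validation_scores, threshold)
```

Keeping both the weights and the cut point frozen is what prevents information from the validation cohorts from leaking into the model.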
The LSC score was tested for associations with survival outcomes as a continuous variable using Cox proportional hazards regression (log-likelihood test) and as a dichotomous stratification (high vs low LSC score) by Kaplan-Meier analysis (log-rank test) using R version 2.11 with survival package 2.35 (R Project for Statistical Computing [http://www.R-project.org]). Patients with missing data were excluded from the analyses. Absolute risk of events occurring by 3 years was determined from Kaplan-Meier analysis. Given that LSCs have been experimentally demonstrated to be resistant to chemotherapy,13,21 the LSC score was tested for associations with primary refractoriness to therapy and disease relapse (event-free and relapse-free survival). For relapse-free survival, only patients who had first achieved clinical remission from disease were included.22
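To make the absolute risk calculation concrete, the following is a minimal sketch of a Kaplan-Meier estimator and of reading off the risk of an event by a fixed horizon such as 36 months. It is not the R survival package used in the study; the function names are hypothetical and the follow-up data are invented.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one event.

    times: follow-up time per patient; events: 1 = event observed, 0 = censored.
    Patients censored at a given time are counted as at risk at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_removed
    return curve

def absolute_risk_by(curve, horizon):
    """Absolute risk of an event by the horizon: 1 minus estimated survival."""
    survival = 1.0
    for t, s in curve:
        if t <= horizon:
            survival = s
    return 1.0 - survival

# Toy follow-up in months, invented for illustration.
times = [6, 12, 12, 20, 30, 40, 48, 60]
events = [1, 1, 0, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
risk_3y = absolute_risk_by(curve, 36)
```

In this toy example the estimated survival at 36 months is 0.6, so the absolute risk of an event by 3 years is 0.4.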
The robustness of the association between the LSC score and outcomes was evaluated as follows. The training data set was split in half. Gene weightings defining the LSC score were derived in half of the data and then applied to the other half to test associations with survival. Results from 1000 random splits of the training set were compared. Furthermore, the uniqueness of the prognostic value of the LSC-enriched genes was tested by comparing it with the results obtained by repeating the analysis on 10 000 sets of the same number of randomly selected genes.
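The comparison against randomly selected gene sets can be sketched as an empirical P value: the fraction of random gene sets of the same size whose test statistic is at least as large as the observed one. How the study summarized this comparison is not specified in the text, so the function below, its add-one correction, and the placeholder `statistic` callable (which in practice would refit the score and the survival model for each gene set) are all assumptions.

```python
import random

def empirical_p_value(observed_stat, all_genes, set_size, statistic,
                      n_sets=10000, seed=0):
    """Fraction of random gene sets with a statistic at least as large as observed.

    statistic: callable mapping a list of genes to a number (placeholder for
    whatever association measure is used). The add-one correction keeps the
    estimate above zero; it is a common convention, assumed here.
    """
    rng = random.Random(seed)
    as_extreme = 0
    for _ in range(n_sets):
        gene_set = rng.sample(all_genes, set_size)  # same size as the real set
        if statistic(gene_set) >= observed_stat:
            as_extreme += 1
    return (as_extreme + 1) / (n_sets + 1)

# Toy setup: genes are the integers 0..99 and the statistic is their sum.
genes = list(range(100))
toy_statistic = sum

# No set of 3 distinct genes can reach 300 (the maximum is 99 + 98 + 97 = 294).
p_extreme = empirical_p_value(300, genes, 3, toy_statistic, n_sets=1000)
# Every set reaches 0, so the observed value is unremarkable.
p_null = empirical_p_value(0, genes, 3, toy_statistic, n_sets=1000)
```

A small empirical P value indicates that few random gene sets of the same size perform as well as the gene set of interest.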
Prior clinical investigation in adult AML has defined several important prognostic factors including age, karyotype (chromosomal rearrangements), and molecular mutations, particularly internal tandem duplications in the FLT3 gene and mislocalizing mutations in the NPM1 gene.5,23 In the current analysis, multivariate Cox regression was used to test whether the LSC score conferred prognostic value independent of these established clinical predictors. Areas under the receiver operating characteristic curve were computed using the SurvivalROC package version 1.0 (R Project for Statistical Computing) to further investigate how the LSC score added to known prognostic factors.24 Because assignments to cytogenetic risk groups were inconsistent between different clinical groups, risk was compared in a uniform fashion across data sets by applying the refined Medical Research Council risk scheme23 (favorable, intermediate, or adverse) based on metaphase karyotypes.
Because acute promyelocytic leukemia is a distinct disease entity, it was excluded from all survival analyses. Association of the LSC score with AML clinical subtypes also was assessed using analysis of variance followed by Games-Howell post hoc testing using SPSS software version 12 (SPSS Inc, Chicago, Illinois) for groups with unequal sizes and variances. All statistical tests were 2-sided. P values of less than .05 were considered significant.
Global gene expression profiles of 15 LSC-enriched subpopulations were compared with 15 paired LPC-enriched subpopulations collected from the same 15 AML samples. The samples were derived from patients representing a diversity of AML subtypes and clinical outcomes (eTable 1). Significance analysis of microarrays15 identified 31 genes as more highly expressed in LSC-enriched subpopulations than in LPC-enriched subpopulations. Twenty-one genes were identified as more highly expressed in LPC-enriched subpopulations than in LSC-enriched subpopulations (false discovery rate <10%; Figure 1 and eTable 2). Many of these genes were significantly associated with each other in a network-based analysis (eFigure 2). In addition to the CD34 and CD38 cell surface markers used to purify the samples, the group of genes included factors known to be differentially expressed in early hematopoiesis such as VNN1, RBPMS, SETBP1, GUCY1A3, and MEF2C. All gene symbols reported herein were assigned by the Human Genome Organization Gene Nomenclature Committee. The HOPX homeobox gene was found to be more highly expressed in LSC-enriched subpopulations and has known interactions with the induced pluripotency factors SOX2, POU5F1, NANOG, as well as HDAC2 (eFigure 3).
Gene set enrichment analysis14 showed that genes more highly expressed in LSCs were enriched for those expressed in normal cells positive for CD34 and negative for CD38 (which include normal HSCs) compared with normal progenitor cells positive for both CD34 and CD38 (Figure 2 and eTable 3). Notably, genes more highly expressed in LSCs also were enriched for genes whose expression in AML has been correlated with high expression of the BAALC gene, which is an adverse prognostic factor.25 Conversely, proliferation, cell cycle, and differentiation-related genes were systematically repressed in the LSC-enriched subpopulations compared with more differentiated LPC-enriched subpopulations, which is consistent with a tendency for replicative quiescence.10
The 31 genes more highly expressed in LSCs were combined to generate an LSC signature (see “Methods” section). The LSC signature was computed in the purified subpopulations from primary AML patient samples and in subpopulations from the normal hierarchy of differentiating myeloid blood cells. By definition, the LSC signature was high in LSCs compared with LPCs, but also relative to downstream more differentiated AML blast cells negative for CD34 (Figure 3). Among normal samples from healthy individuals, the LSC signature was high in HSCs and in multipotent progenitors compared with more mature myeloid cell populations. These observations suggest that the LSC signature is shared with normal HSCs, implying that it may reflect self-renewal ability and relative proliferative quiescence.
We next evaluated whether expression of LSC-enriched genes was associated with clinical outcomes using 4 public data sets of bulk AML expression profiles with available clinical annotations. Details of patient characteristics, primary therapies, clinical responses, remission rates, and outcomes have been reported previously16-19,26 and are summarized in Table 1.
The LSC signature was calculated for a training set of 163 patients with AML lacking chromosomal abnormalities (normal karyotype AML).16 The weights were defined for combining expression levels of the 31 LSC genes into a single measure for each sample (eTable 4). Because the same gene weights were applied to the independent test cohorts, we refer to this as the LSC score. In the training set, the median LSC score was 24.9 (range, 17.4-33.1) and was associated with overall survival as a continuous variable (hazard ratio [HR], 1.15; 95% confidence interval [CI], 1.08-1.22; log-likelihood P <.001). Higher LSC score was associated with inferior outcome (Table 2). Stratification of patients into high vs low LSC score groups robustly separated survival curves (HR, 1.85 [95% CI, 1.25-2.74]; log-rank P = .002; Figure 4). The absolute risk of death by 3 years was 57% (95% CI, 43%-67%) in the low LSC score group compared with 78% (95% CI, 66%-86%) in the high LSC score group (HR, 1.9 [95% CI, 1.3-2.7]; log-rank P = .002). Association of the LSC-enriched genes with overall survival was supported by internal cross-validation in the training cohort (eFigure 4).
The LSC score was calculated for each sample in the 3 independent test cohorts. For the cases with normal karyotypes in these cohorts, high LSC score was associated with inferior overall survival as a continuous variable (Table 2). Using the median LSC score from the training set as a prespecified threshold, stratification of patients into high vs low LSC score groups significantly separated survival curves in each data set (Figure 4 and eFigure 5). For example, in patients with normal karyotypes from 1 well-characterized cohort of adult patients with diverse karyotype AML (primarily treated with induction regimens including cytarabine and anthracycline in Tomasson et al17), the LSC score ranged from 16.6 to 31.0 and was significantly associated with overall survival (Table 2, Figure 4, and Table 3).20 This association was significant whether the LSC score was evaluated as a continuous predictor (HR, 1.13 [95% CI, 1.04-1.22]; P = .003) or as a high vs low LSC score split (HR, 2.7 [95% CI, 1.4-5.1]; P = .002). Patients in the low LSC score group had a median overall survival of 56.3 months (absolute risk, 39%; 95% CI, 20%-54%) compared with 16.3 months (absolute risk, 81%; 95% CI, 61%-90%) for patients in the high LSC score group (Table 3). The set of genes comprising the LSC score was significant in its prognostic utility when compared with 10 000 randomly selected gene sets of the same size (eFigure 6), supporting the conclusion that the association with clinical outcomes was not a false-positive result.
The LSC scores ranged from 16.6 to 35.5 among all patients with non–acute promyelocytic leukemia, including those with cytogenetic abnormalities in the Tomasson et al17 cohort (Table 3), and were associated with overall survival as a continuous variable (HR, 1.10 [95% CI, 1.04-1.17]; P = .001). Patients in the low LSC score group had a median overall survival of 56.3 months (absolute risk, 45%; 95% CI, 30%-57%) compared with 16.5 months (absolute risk, 75%; 95% CI, 64%-83%) in the high LSC score group (HR, 2.0 [95% CI, 1.3-3.2]; P = .003). Investigation of the LSC score in patients from 2 additional cohorts that included patients with chromosomal abnormalities confirmed its association with adverse overall survival (eFigure 5 and Table 2).
Higher LSC scores were consistently associated with inferior event-free survival in patients with normal karyotype AML from 2 cohorts with available data (continuous variable HR, 1.15 [95% CI, 1.06-1.26] P = .001; and continuous variable HR, 1.11 [95% CI, 1.03-1.21] P = .007). The group with the highest LSC scores had a median event-free survival of 10 months (absolute risk, 81%; 95% CI, 60%-91%) compared with 48 months (absolute risk, 48%; 95% CI, 27%-63%) for the group with low LSC scores in the Tomasson et al17 cohort (HR, 2.4 [95% CI, 1.3-4.5]; P = .006) (Figure 4, Table 3, and Table 4). In the second cohort with available event-free survival data,19 the absolute risk for the low LSC score group was 61% (95% CI, 46%-72%) compared with 80% (95% CI, 64%-89%) for the high LSC score group (HR, 1.7 [95% CI, 1.1-2.7] P = .02; eFigure 5).
For the latter data set,19 LSC score also was associated with relapse-free survival in patients with normal karyotype AML who had achieved an initial clinical remission (continuous variable HR, 1.13 [95% CI, 1.01-1.27]; P = .03). The median relapse-free survival was 66 months (absolute risk, 43%; 95% CI, 26%-56%) in the cases with low LSC scores compared with only 10 months (absolute risk, 68%; 95% CI, 46%-81%) in the cases with high LSC scores (HR, 1.8 [95% CI, 1.0-3.3]; P < .06; eFigure 5). This finding is consistent with the demonstrated chemoresistance of LSCs.13
Concordantly, the rate of clinical remission was superior among patients with low LSC scores, both in an older cohort (median age of 65 years) with 56% clinical remission for the low LSC score group vs 29% for the high LSC score group (Fisher exact test P < .001)18 and in a younger cohort (median age of 43 years) with 88% clinical remission for the low LSC score group vs 76% for the high LSC score group (Fisher exact test P = .02).19,26 Furthermore, LSC scores were significantly higher in patients unable to achieve clinical remission compared with those who reached clinical remission (P < .001; Figure 5), a distinction most evident for those patients in whom such remissions were durable.
The LSC score carried prognostic value independent of other known clinical factors. In multivariate Cox regression including age, mutations in FLT3 and NPM1, and cytogenetic abnormalities, the HRs for LSC score in the 3 cohorts were 1.07 (95% CI, 1.01-1.13; P = .02), 1.10 (95% CI, 1.03-1.17; P = .005), and 1.17 (95% CI, 1.05-1.30; P = .005) (eTable 5). The LSC score was significant independent of these factors in multivariate Cox regression, with high LSC score again being associated with adverse overall survival, event-free survival, and relapse-free survival in all but 1 instance (eTable 5). Comparisons of area under the receiver operating characteristic curves (a measure of the accuracy of a prognostic model) showed that the LSC score added to the prognostic value of age, internal tandem duplications within the FLT3 gene, mislocalizing mutations of the NPM1 gene, and cytogenetic risk in predicting overall survival at 2 years in all cohorts for both normal karyotype AML and AML with chromosomal abnormalities (eTable 6). Models that incorporated the LSC score in addition to other known prognostic factors had consistently higher area under the receiver operating characteristic curves compared with models that did not include the LSC score (eSupplement).
In normal karyotype AML, higher LSC score was associated with inferior overall survival in cases with wild-type NPM1 and mutant NPM1 (despite the fact that the latter are frequently negative for CD34) and in patients with both wild-type FLT3 and wild-type NPM1 (eTable 7). Furthermore, similar results were obtained when all analyses (including derivation of LSC score gene weightings) were performed after excluding cases with cells negative for CD34 (defined either as having mutant NPM1 or as the 40% of samples with the lowest CD34 expression). Exclusion of cases with cells negative for CD34 from the model and validation resulted in an LSC score with similar gene weightings and prognostic value (eTable 4). Therefore, the LSC score was not simply a proxy for CD34 status. Taken together, these data indicate that higher LSC score is associated with inferior survival outcomes independent of age, internal tandem duplications within FLT3 or NPM1 mutations, CD34 expression, and cytogenetic risk group and adds to their prognostic utility.
Although similar across most age groups and morphological subtypes, LSC scores were higher in cases with minimally differentiated myeloblasts (French-American-British M0), which typically have poor prognosis,28 consistent with previous reports of high LSC prevalence in this subtype (eFigure 7).29 In general, the LSC score was similar in favorable, intermediate, and adverse cytogenetic risk groups, and was not a direct proxy for this factor (eFigure 7C). This is consistent with our findings from survival analyses that showed that the LSC score confers an independent prognostic value. When considering specific cytogenetic subgroups, the LSC score had higher than average values in patients with unfavorable −5 or 7(q) abnormalities, and lower than average values among AML harboring anomalies involving 11q23/MLL. Recent studies have reported that self-renewing cells from AML mouse models carrying MLL anomalies reside in more mature cells.30
We also investigated the relationship of the LSC score with molecular mutations in normal karyotype AML, which is the largest single cytogenetic subgroup of AML. The LSC scores were significantly lower in those harboring mislocalizing mutations of the NPM1 gene (eFigure 7D and eFigure 8), in agreement with recent observations that leukemia-initiating cells in NPM1-mutant AML are frequently negative for CD34.31 Furthermore, LSC scores were significantly lower within the subgroup of patients with wild-type FLT3 and mislocalizing mutations of the NPM1 gene, which is a combination conferring a distinctly favorable prognosis in patients with normal karyotype AML (eFigure 7D).6 The LSC scores also were lower in patients with normal karyotype AML and double CEBPA mutations, which is again associated with favorable outcomes19 relative to cases with single mutations but not relative to wild-type CEBPA. Similar findings were observed in all 4 independent data sets totaling 1047 patients (eFigure 8 and eFigure 9). Of note, no significant differences in LSC scores were observed when patients with AML were stratified according to less common recurrent somatic mutations, including those in the tyrosine kinase domain of FLT3 or activating mutations in NRAS, KRAS, or IDH1.
Clinical evidence supporting the significance of the cancer stem cell model for human AML has been lacking despite ample experimental evidence from transplantation assays in immunocompromised mice. In this study, we show that a gene expression score associated with the LSC-enriched subpopulation is an independent prognostic factor in AML, with high LSC score associated with adverse outcomes in multiple independent cohorts. Specifically, high LSC score is associated with poor overall survival, event-free survival, and relapse-free survival in patients with normal karyotype AML and inferior overall survival in patients with chromosomal abnormalities. Additionally, the LSC score was associated with primary response to induction chemotherapy because high LSC scores strongly correlated with lower remission rates. Multivariate analysis demonstrated that high LSC score was associated with poor outcomes independently of age, presence of FLT3 or NPM1 mutations, and cytogenetic risk group. These findings support the clinical relevance of the cancer stem cell model for AML.
Stem cells of AML were originally identified by prospectively separating primary leukemic specimens into subpopulations based on expression of CD34 and CD38, which are surface markers that are differentially expressed in normal hematopoiesis (eFigure 1).10 When the function of these tumor subpopulations was assessed by transplantation into immunodeficient mice, leukemia-initiating activity was demonstrated exclusively in the fraction of cells positive for CD34 and negative for CD38.11 The majority of recent studies indicate that LSC activity for AML is enriched in the subpopulation of cells positive for CD34 and negative for CD38, although recent reports have challenged whether this is exclusive.31,32 The clinical significance of the LSC model is suggested by 2 prior studies, the first of which identified an inverse correlation between the frequency of cells positive for CD34 and negative for CD38 at diagnosis and the duration of relapse-free survival.33 The second study reported that the relative ability of AML cells to successfully engraft in immunodeficient mice (a property associated with LSC) correlated with adverse clinical features.34 While suggestive, neither of these studies investigated large cohorts of patients with long-term follow-up and diverse clinical features.
Notably, the LSC signature was highly expressed in purified HSC and was much less expressed in more differentiated myeloid progenitor cells, suggesting that it may be reflective of self-renewal ability. Despite the observed similarities between the LSC signature and HSC gene expression programs, therapeutic targeting of LSCs is still possible without toxicity toward normal HSC. Indeed, markers distinguishing LSC from HSC exist and are amenable to targeted therapies, including antibodies to CD47, CLL-1, and CD123.35-37 Future work is needed to prospectively validate the prognostic ability of the LSC score by evaluating its component genes using reverse transcription–polymerase chain reaction in an independent patient cohort. It will also be pertinent to examine the relationship of the LSC score to other gene expression signatures that have been proposed for predicting survival in patients with AML.16,26,38
In addition to the cell markers of CD34 and CD38 that were used for their purification, LSCs were distinguished from LPCs by the expression of several other genes known to be differentially expressed during early myelopoiesis. These included 3 members (GIMAP2, GIMAP6, and GIMAP7) of a small family of immunoassociated nucleotide–binding proteins implicated in survival of HSCs and leukemia.39 However, no prior associations with AML have been described. Two genes (HOPX and GUCY1A3) in this signature, which have previously been incorporated into AML prognostic models,16,38 are notable for their distinctive pattern of expression and histone modification in self-renewing cells.40 HOPX is an unusual homeodomain protein known to directly recruit histone deacetylase activity without directly binding DNA.41 This gene also is directly repressed in vivo in malignant cells in response to administration of the histone deacetylase inhibitor panobinostat.42 The latter is currently being studied in clinical trials for patients with AML. GUCY1A3, which encodes a component of the soluble guanylate cyclase enzyme catalyzing the conversion of guanosine triphosphate to cyclic guanosine monophosphate, is repressed during replicative senescence.43 Cyclic guanosine monophosphate has been reported to stimulate HSC proliferation.44
The cancer stem cell model has been studied in solid tumors in addition to leukemia.7 Investigation of gene expression in human breast cancer stem cells identified a signature prognostic of metastasis-free survival and overall survival in multiple carcinomas, suggesting the clinical significance of the cancer stem cell model in these solid tumors.45 Among other human malignancies, we and others have described the prognostic significance of distinctive signatures of self-renewing populations, including embryonic stem cells,46,47 HSCs, and progenitor cells.48,49
Our study is the first to directly define a signature of enriched AML-initiating cells and to relate this signature to expression profiles of diagnostic specimens, allowing a link to corresponding clinical and pathological features of patients. Ultimately, this model has major implications for cancer therapy, most notably that to achieve cure, the cancer stem cells must be eliminated.7 To accomplish this in AML, novel therapies targeting LSC must be developed. Several such therapies are being investigated, including small molecules21,50-52 and monoclonal antibodies,35,36,53 which hold promise for improving therapeutic efficacy beyond current conventional treatments.
Our LSC prognostic model requires validation in a prospective study of patients with AML treated with a standardized protocol incorporating strict eligibility criteria, a uniform treatment plan, uniform sample collection and handling, and well-defined primary end points. Moreover, microarrays are not broadly used in clinical decision making.54 While surrogate methods such as real-time polymerase chain reaction have demonstrated clinical utility,55 their application requires performance assessment in independent laboratories. Flow cytometric analysis of the predictive power of the proteins comprising the LSC score and comparison with RNA-based models may help to determine the best platform for future clinical application. However, the monoclonal antibodies useful for flow cytometry are not presently available for the full set of encoded proteins.
High expression of an LSC gene signature is independently associated with adverse outcomes in patients with AML. If prospectively validated, the described LSC score may be incorporated into routine clinical practice for predicting prognosis in patients with AML and used in clinical trials incorporating risk-based stratification or randomization strategies.
Corresponding Authors: Ash A. Alizadeh, MD, PhD, and Ravindra Majeti, MD, PhD, School of Medicine, Stanford University, Palo Alto, CA 94305 (firstname.lastname@example.org and email@example.com).
Author Contributions: Drs Gentles and Alizadeh had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Gentles, Plevritis, Majeti, Alizadeh.
Acquisition of data: Gentles, Majeti, Alizadeh.
Analysis and interpretation of data: Gentles, Plevritis, Majeti, Alizadeh.
Drafting of the manuscript: Gentles, Plevritis, Majeti, Alizadeh.
Critical revision of the manuscript for important intellectual content: Gentles, Plevritis, Majeti, Alizadeh.
Statistical analysis: Gentles, Plevritis, Alizadeh.
Obtained funding: Plevritis, Majeti.
Administrative, technical, or material support: Plevritis, Majeti, Alizadeh.
Study supervision: Majeti, Alizadeh.
Financial Disclosures: A patent application for the development of the leukemic stem cell score as a diagnostic assay has been submitted by Drs Gentles, Majeti, and Alizadeh. Dr Plevritis did not report any financial disclosures.
Funding/Support: This research was supported by the Integrative Cancer Biology Program through National Institutes of Health awards 1U54CA149145 and U56-CA112973 (Dr Plevritis). Dr Majeti holds a Career Award for Medical Scientists from the Burroughs Wellcome Fund. Dr Alizadeh holds a Career Development Award from the Leukemia and Lymphoma Society.
Role of the Sponsors: The funding organizations had no role in the design and conduct of the study; in the collection, analysis, and interpretation of the data; or in the preparation, review, or approval of the manuscript.
Additional Contributions: We thank Irving Weissman, MD, and Ronald Levy, MD, for support and discussion and Rob Tibshirani, PhD, for advice on statistical analyses (all 3 from Stanford University, Palo Alto, California). We also thank the following Stanford University clinicians for critically reading the manuscript: Bruno Medeiros, MD, Jason Gotlib, MD, Linda Boxer, MD, PhD, and Beverly Mitchell, MD. We are indebted to the patients and their physicians, including the Stanford Hematology Tissue Bank, the German AML Cooperative Group, the Haemato-Oncology Foundation for Adults in the Netherlands, the Southwest Oncology Group, the German AML Study Group/Ulm, Washington University School of Medicine, and the Cancer and Leukemia Group B, for sharing raw data and clinical information from their studies. We also thank Fumihiko Ishikawa, MD, PhD, and Atsushi Hijikata, MSc, for providing access to their microarray data from purified AML subsets (RIKEN). No persons or groups acknowledged in this section received compensation.