Cytomorphology of initial bone marrow biopsy. A, Hematoxylin and eosin stain; original magnification ×100. B, Hematoxylin and eosin stain; original magnification ×600.
A, Dominant clone (46, XX, del(9)(q12q32), del(12)(q12q21), −6, −16, add(16)(p13.2), +2 mar [13 of 20 cells]). B, Minor clone (46, XX, del(9)(q12q32), del(12)(q12q21)[6 of 20 cells]). Pink boxes indicate chromosomal abnormalities.
Fluorescence in situ hybridization (FISH) performed using dual fusion, dual probes (Abbott/Vysis). A, One fusion signal, 2 red signals (chromosomes 15), and 1 green signal (chromosome 17). B, Fusion on der(17). Probes labeled with Spectrum Orange (PML) and Spectrum Green (RARA).
A and B, Schematic representation of ins(17;15) identified by whole-genome sequencing and resulting in PML-RARA fusion. Breakpoints (blue) are 72027045 and 72104113 base pairs (bp) (chromosome 15) and 35742679 and 35742683 bp (chromosome 17), using NCBI36/hg18 build. Consistent with other translocations associated with acute promyelocytic leukemia, the RARA breakpoint occurs before the DBD (DNA-binding domain) and preserves this entire domain. The fusion PML-RARA thus retains all the major protein domains of both proteins. ATG indicates coding sequence start; CC, coiled-coil domain; Kb, kilobase; ZF, zinc finger domain. Arrows indicate binding sites of primers used in Figure 5A.
A, Polymerase chain reaction (PCR) of genomic DNA from the case patient's skin (normal) and leukemia (tumor) using primers that span the junction of RARA-LOXL1 (P1/P2), PML-RARA (P3/P4), del(12), del(14), LOXL1-PML (P5/P6), and del(19). Note amplification across fusion breakpoints in the leukemia sample but not in the skin sample for all but del(19). DNA ladder: 2176, 1766, 1230, 1033, 653, 517, 453, 394, 298, 234, and 154 base-pairs (bp). B, RNA prepared from cryopreserved leukemia cells from case patient was amplified with a forward primer in PML exon 3 and a reverse primer in RARA exon 3. DNA ladder: 1500, 800, 500, 300, 200, 150, 100, and 50 bp.
Single-nucleotide variants (SNVs) that occur in protein coding sequences of the case patient in total bone marrow cells. Two clones are identified based on the 2 clusters of SNV frequency (clone 1: variant allele frequency, 35% to 51%; clone 2: variant allele frequency, 13% to 21% [see text]). This pattern is consistent with the cytogenetic findings of 2 genetically distinct clones (Figure 2).
Schematic representation of fosmid homology to the STROML1/PML locus and to the CDC6/RARA locus. ATG indicates coding sequence start; CC, coiled-coil domain; Kb, kilobase; ZF, zinc finger domain. bcr1, bcr2, and bcr3 indicate breakpoints in PML associated with different-sized PML-RARA fusion transcripts.
Fusion events in additional patients 1 (A and B) and 2 (C and D). A, One fusion signal, 2 red signals (chromosomes 15), and 1 green signal (chromosome 17). B, Fusion event on der(17), consistent with ins(17;15). C, One fusion signal, 2 green signals (chromosomes 17), and 1 red signal (chromosome 15). D, Fusion event on der(15). Probes labeled with Spectrum Orange (STROML1/PML) and Spectrum Green (CDC6/RARA).
Welch JS, Westervelt P, Ding L, Larson DE, Klco JM, Kulkarni S, Wallis J, Chen K, Payton JE, Fulton RS, Veizer J, Schmidt H, Vickery TL, Heath S, Watson MA, Tomasson MH, Link DC, Graubert TA, DiPersio JF, Mardis ER, Ley TJ, Wilson RK. Use of Whole-Genome Sequencing to Diagnose a Cryptic Fusion Oncogene. JAMA. 2011;305(15):1577-1584. doi:10.1001/jama.2011.497
Author Affiliations: Departments of Medicine (Drs Welch, Westervelt, Tomasson, Link, Graubert, DiPersio, and Ley and Ms Heath), Pathology and Immunology (Drs Klco, Kulkarni, Payton, and Watson), Genetics (Drs Kulkarni, Mardis, Ley, and Wilson), and Pediatrics (Dr Kulkarni), and Genome Institute (Drs Ding, Larson, Wallis, Chen, Watson, Mardis, Ley, and Wilson and Mr Fulton and Mss Veizer, Schmidt, and Vickery), Washington University, St Louis, Missouri.
Context Whole-genome sequencing is becoming increasingly available for research purposes, but it has not yet been routinely used for clinical diagnosis.
Objective To determine whether whole-genome sequencing can identify cryptic, actionable mutations in a clinically relevant time frame.
Design, Setting, and Patient We were referred a difficult diagnostic case of acute promyelocytic leukemia with no pathogenic X-RARA fusion identified by routine metaphase cytogenetics or interphase fluorescence in situ hybridization (FISH). The case patient was enrolled in an institutional review board–approved protocol, with consent specifically tailored to the implications of whole-genome sequencing. The protocol uses a “movable firewall” that maintains patient anonymity within the entire research team but allows the research team to communicate medically relevant information to the treating physician.
Main Outcome Measures Clinical relevance of whole-genome sequencing and time to communicate validated results to the treating physician.
Results Massively parallel paired-end sequencing allowed identification of a cytogenetically cryptic event: a 77-kilobase segment from chromosome 15 was inserted en bloc into the second intron of the RARA gene on chromosome 17, resulting in a classic bcr3 PML-RARA fusion gene. Reverse transcription polymerase chain reaction sequencing subsequently validated the expression of the fusion transcript. Novel FISH probes identified 2 additional cases of t(15;17)–negative acute promyelocytic leukemia that had cytogenetically invisible insertions. Whole-genome sequencing and validation were completed in 7 weeks and changed the treatment plan for the patient.
Conclusion Whole-genome sequencing can identify cytogenetically invisible oncogenes in a clinically relevant time frame.
Acute promyelocytic leukemia (APL) is commonly (>90%) associated with PML-RARA (NCBI Entrez Gene 5371 and 5914) fusion transcripts resulting from pathogenic t(15;17) translocations.1,2 Unusual cytogenetic rearrangements (eg, insertions and 3-, 4-, or even 8-way translocations)2-4 can also lead to PML-RARA formation. Alternative PML-RARA fusions and splice variants exist but are not detected by standard reverse transcription polymerase chain reaction (RT-PCR)5-7; alternative X-RARA fusions also may exist and may be responsive to all-trans retinoic acid (ATRA) (eg, NuMA1-RARA, NPM1-RARA, STAT5B-RARA, PRKAR1A-RARA, FIP1L1-RARA, BCOR-RARA, and the non-RARA translocation NUP98-RARG)1,8-12 or ATRA resistant (PLZF-RARA).1 Timely and accurate diagnosis of APL is essential, because the addition of ATRA to chemotherapy leads to substantially improved outcomes (5-year event-free survival of 69%, compared with 29% in patients receiving chemotherapy alone).13
A 39-year-old woman with acute myeloid leukemia (AML) in first remission was referred to our institution for consideration of allogeneic stem cell transplantation. She had initially presented with hypofibrinogenemia, disseminated intravascular coagulopathy, and pancytopenia (white blood cell count, 1300/μL; hemoglobin, 11.6 g/dL; platelets, 72 ×10³/μL). Her bone marrow contained 61% atypical promyelocytes with invaginated nuclei (including bilobed forms) and dense primary granules (Figure 1). She started induction chemotherapy with ATRA, cytarabine, and idarubicin. However, her metaphase cytogenetics (46, XX, del(9)(q12q32), del(12)(q12q21), −6, −16, add(16)(p13.2), +2 mar[13/20 cells]) (Figure 2) revealed a complex pattern, which is associated with less than 15% long-term survival and is treated with allogeneic transplantation during first remission whenever possible.14,15
Interphase fluorescence in situ hybridization (FISH) suggested a possible fusion between chromosomes 15 and 17 on der(17) but was most consistent with a RARA-PML fusion, not the pathogenic PML-RARA fusion characteristic of M3 AML (Figure 3). RT-PCR to detect a PML-RARA fusion transcript was not performed at the referring institution.
These findings led to a diagnostic conundrum, and ATRA was discontinued. Persistent AML was observed on day 14. The patient entered a complete remission following reinduction with cytarabine, idarubicin, and etoposide. She was then referred to our institution for consideration of allogeneic stem cell transplantation. At that time, her bone marrow biopsy revealed no morphologic evidence of AML and had normal metaphase cytogenetics, normal interphase FISH results, and no evidence of PML-RARA by RT-PCR. HLA typing identified 1 matched sibling. This case posed a diagnostic dilemma with prognostic and therapeutic consequences: does the patient have APL, or does she have AML with unfavorable-risk cytogenetics?
Because her leukemic cytomorphology was consistent with APL, we empirically recommended 2 cycles of arsenic trioxide consolidation, which she received.16
Little material from her original leukemia remained for subsequent evaluation, and no clinical samples were available for FISH or RT-PCR. However, 2 vials of bone marrow cells had been cryopreserved under a research protocol at her referring institution. DNA and RNA were generated from these respective samples (the RNA sample was severely degraded). We obtained appropriate consent for whole-genome sequencing and completed this analysis using paired-end reads. Our primary goal was to determine if whole-genome sequencing could identify an actionable mutation (eg, a cryptic X-RARA rearrangement) in a clinically relevant time frame (6-8 weeks).
A “movable firewall” within our research protocol allows for the communication of clinically relevant findings to the patient's physician and to the patient while strictly maintaining patient anonymity among all research personnel. Deidentified samples are entered into a tissue bank. Clinical information (eg, age, sex, disease, treatment, outcome) is maintained in association with deidentified codes only. A list associating deidentified codes with personal patient information (name, date of birth, treating physician) is maintained in a locked safe; a single protocol administrator has access to this list, and the research team can communicate medically relevant information to the administrator. The administrator communicates this information to the treating physician, who is responsible for informing the patient of the whole-genome sequencing results and their clinical implications.
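The information flow described above can be pictured with a minimal sketch. It is only an illustration of the idea, not the system used in the protocol: the class names, the sample code "AML-0001", and "Dr. Example" are hypothetical, and a private dictionary stands in for the list kept in the locked safe.

```python
from dataclasses import dataclass, field


@dataclass
class ResearchRecord:
    """What the research team sees: clinical data keyed by a deidentified code only."""
    code: str
    age: int
    sex: str
    disease: str


@dataclass
class ProtocolAdministrator:
    """Sole holder of the list linking deidentified codes to treating physicians."""
    # Hypothetical stand-in for the locked list; excluded from repr so it is not echoed.
    _key_list: dict = field(default_factory=dict, repr=False)

    def enroll(self, code: str, physician: str) -> None:
        self._key_list[code] = physician

    def relay(self, code: str, finding: str) -> str:
        """Forward a finding from the research team to the treating physician."""
        physician = self._key_list[code]
        return f"To {physician}: {finding}"


admin = ProtocolAdministrator()
admin.enroll("AML-0001", "Dr. Example")

# The research team works only with the deidentified record and code.
record = ResearchRecord(code="AML-0001", age=39, sex="F", disease="AML")
message = admin.relay(record.code, "PML-RARA fusion validated by PCR")
```

In this sketch the research team never handles the key list; it passes a code and a finding to the administrator, who alone resolves the code to a physician.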
After we obtained explicit consent for whole-genome sequencing under an institutional review board–approved protocol, DNA libraries were generated from 1 cryovial of the original bone marrow aspirate and from a skin punch biopsy obtained during remission (matched normal cells). We generated 187.1 and 200.1 billion base pairs of DNA sequence from each of the respective samples, with an average read depth of 43.7× and 46.8×, respectively. Library generation, sequence production, and data analysis were performed as previously described.17-20 Adequate genome-wide coverage (>99.5% diploid coverage) was ensured by assessing the coverage of known single-nucleotide polymorphisms in the patient's genome, as defined with data collected from the Affymetrix Genome-Wide Human SNP Array 6.0 (Affymetrix, Santa Clara, California).
All of the high-quality single-nucleotide variants (SNVs) found in tumor and skin samples from this patient are available in the database of genotypes and phenotypes (dbGaP) of the National Center for Biotechnology Information (phs000159.v3.p2).
Reagents and methods for PCR validation, RT-PCR, and FISH analyses, and the National Center for Biotechnology Information Entrez Gene identification numbers of all genes relevant to this article, are described in the eMethods.
Validated whole-genome sequencing results were completed and reported to the patient's physician 7 weeks after obtaining the DNA samples.
The timeline of data production, analysis, and validation was as follows: day 1, DNA samples logged in at Washington University Genome Institute; day 5, libraries completed and sequencing begun; day 18, sequence completed; day 22, alignment to reference sequence completed; day 24, prediction of SNVs completed; day 25, structural variants predicted by BreakDancer20; day 52, insertional fusion completely validated by PCR and results transmitted to the treating physician.
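The arithmetic behind this timeline can be checked with a short sketch. The milestone days are taken from the text above; the function names are ours and purely illustrative.

```python
# Milestone days as reported in the timeline (day 1 = samples logged in).
MILESTONES = [
    (1, "DNA samples logged in"),
    (5, "libraries completed, sequencing begun"),
    (18, "sequencing completed"),
    (22, "alignment completed"),
    (24, "SNV prediction completed"),
    (25, "structural variants predicted"),
    (52, "fusion validated by PCR and reported"),
]


def step_durations(milestones):
    """Return (label, days elapsed since the previous milestone) pairs."""
    return [
        (label, day - prev_day)
        for (prev_day, _), (day, label) in zip(milestones, milestones[1:])
    ]


def total_weeks(milestones):
    """Weeks corresponding to the day number of the final milestone."""
    return milestones[-1][0] / 7


durations = step_durations(MILESTONES)
validation_days = durations[-1][1]  # days between SV prediction and validated report

for label, days in durations:
    print(f"{days:>2} d  {label}")
print(f"total: {total_weeks(MILESTONES):.1f} weeks")
```

Laid out this way, the longest single step is the interval between structural variant prediction and the validated report, followed by sequence production itself.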
Using massively parallel DNA sequencing with paired-end reads, we identified 2 sets of breakpoints between chromosomes 15 and 17, which occur in the LOXL1/PML locus and RARA locus, respectively (schematically described in Figure 4). PCR amplification across each predicted breakpoint validated the en bloc insertion of a 77-kilobase (kb) segment of chromosome 15 (containing parts of the LOXL1 and PML genes) into intron 2 of RARA, the invariant site of RARA-associated translocations (Figure 5A).
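The size of the inserted segment follows directly from the breakpoint coordinates given in the Figure 4 legend (NCBI36/hg18). The sketch below is a simple subtraction; whether 1 bp should be added depends on the coordinate convention, which is not stated here, so the helper reports the plain difference.

```python
# Breakpoint coordinates (bp, NCBI36/hg18) from the Figure 4 legend.
CHR15_BREAKPOINTS = (72_027_045, 72_104_113)
CHR17_BREAKPOINTS = (35_742_679, 35_742_683)


def span_bp(breakpoints):
    """Distance in base pairs between two breakpoints on the same chromosome."""
    start, end = sorted(breakpoints)
    return end - start


inserted_bp = span_bp(CHR15_BREAKPOINTS)      # chromosome 15 segment moved en bloc
insertion_site_bp = span_bp(CHR17_BREAKPOINTS)  # gap between the 2 chromosome 17 coordinates

print(f"chromosome 15 segment: {inserted_bp} bp (~{inserted_bp / 1000:.0f} kb)")
print(f"chromosome 17 breakpoints are {insertion_site_bp} bp apart")
```

The chromosome 15 span rounds to the 77 kb reported in the text, and the 2 chromosome 17 coordinates lie only a few base pairs apart within RARA intron 2.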
This insertion generates 3 novel fusion transcripts: bcr3 PML-RARA, LOXL1-PML, and RARA-LOXL1. Expression of bcr3 PML-RARA was validated by RT-PCR using 3 different primer pairs (using the degraded RNA from the original banked AML sample), including the Clinical Laboratory Improvement Amendments/College of American Pathologists (CLIA-CAP)–certified InVivoscribe PML/RARa Mix2b kit (Invivoscribe Technologies, San Diego, California) (Figure 5B, eFigure 1A, and data not shown). The RARA-LOXL1 fusion was out of frame and is predicted to encode a 67–amino acid protein (eFigure 1B and data not shown). The LOXL1-PML fusion leads to altered splicing and a premature stop codon prior to the PML junction and is predicted to encode a 573–amino acid protein (eFigure 1C).
In addition, we identified and resolved the breakpoints associated with all abnormalities observed with metaphase cytogenetics, including del(9), del(12), and add(16)(p13.2); the latter was in fact a translocation t(16;22)(p13.3;q13.31) (Figure 2, eFigure 2, and data not shown). Two other large deletions not found by conventional cytogenetics were detected by whole-genome sequencing: del(14) and del(19). The latter was also identified in the skin sample, indicating that it is an inherited copy number variant (Figure 5A). The predicted deletions of chromosomes 6 and 16 were not detected by whole-genome sequencing. Instead, we identified a 61-megabase inv(6)(p22.3;q14.1) and a translocation t(6;16)(q22.31;p13.3) (Figure 2). We further identified and validated 12 SNVs within protein-coding sequences (Figure 6). SNV allele frequency was consistent with the presence of 2 distinct leukemic clones in the bone marrow, recapitulating the metaphase cytogenetics: 8 of 12 SNVs had a variant allele frequency of 35% to 51%, and 4 of 12 SNVs had a variant allele frequency of 13% to 21% (Figures 2 and 6). The significance of these somatic mutations for disease pathogenesis is currently unknown.
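The grouping of SNVs into 2 clusters by variant allele frequency (VAF) can be illustrated with a brief sketch. The 2 VAF ranges are those reported above; the 12 individual VAF values are hypothetical placeholders chosen to span those ranges (the per-variant values are not listed in the text), and the conversion from VAF to cell fraction assumes a heterozygous variant in a diploid, copy-neutral region, which may not hold for every variant in this genome.

```python
from collections import Counter

# VAF ranges (percent) reported for the 2 SNV clusters.
CLONE_VAF_RANGES = {"clone 1": (35.0, 51.0), "clone 2": (13.0, 21.0)}


def assign_clone(vaf_percent):
    """Return the cluster whose reported VAF range contains this value, else None."""
    for clone, (low, high) in CLONE_VAF_RANGES.items():
        if low <= vaf_percent <= high:
            return clone
    return None


def approx_cell_fraction(vaf_percent):
    """Rough percent of sequenced cells carrying the variant.

    Assumption: heterozygous SNV in a diploid region, so the fraction of
    cells is about twice the VAF (capped at 100%).
    """
    return min(2.0 * vaf_percent, 100.0)


# Hypothetical VAFs (percent), for illustration only; not the patient's data.
example_vafs = [51, 47, 44, 42, 40, 38, 36, 35, 21, 18, 15, 13]
clone_counts = Counter(assign_clone(v) for v in example_vafs)

print(clone_counts)
print(approx_cell_fraction(13), approx_cell_fraction(21))
```

A value falling between the 2 ranges (for example, 28%) is left unassigned by this sketch rather than forced into either cluster.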
We designed a new set of fosmid-based FISH probes (each 30-40 kilobases in size) for the detection of insertional fusions that target the minimal PML translocation region (the promoter/enhancer and exons 1-3, which is roughly 30 kilobases) (Figure 7). We searched the Washington University Department of Pathology database for AML cases diagnosed during the last 5 years. We identified 11 cases with features suggestive of APL (including any promyelocytic morphology, characteristic CD33+CD34−HLA-DR− immunophenotype, and variable-to-strong myeloperoxidase staining by enzyme cytochemistry) but that lacked normal dual-fusion patterns by FISH. We found that 2 of these specimens contained PML-RARA fusions resulting from cryptic insertions: one was associated with an insertion of PML into the RARA locus (ins[17;15]) and the other with insertion of RARA into the PML locus (ins[15;17]) (Figure 8). Both cases (as well as the proband) had RT-PCR confirmation of a PML-RARA fusion (bcr1 and bcr3 isoforms), and all had features typical of APL (Table).
The usefulness of massively parallel DNA sequencing has improved considerably with the introduction of paired-end reads, which allow for better mapping efficiency and more accurate identification of junctional breakpoints associated with structural variants (translocations, insertions, and deletions).
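The principle can be illustrated with a deliberately simplified sketch: read pairs whose 2 ends map to different chromosomes are grouped by position, and groups supported by several pairs point to a candidate junction. This is not the BreakDancer algorithm used in this study; the `ReadPair` type, the bin size, the support threshold, and the toy coordinates are all assumptions made for illustration.

```python
from collections import Counter
from typing import NamedTuple


class ReadPair(NamedTuple):
    """Mapped positions of the 2 ends of one paired-end read (hypothetical type)."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int


def interchromosomal_clusters(pairs, bin_size=1000, min_support=2):
    """Count pairs whose ends map to different chromosomes, binned by position.

    Returns only the bins supported by at least `min_support` read pairs.
    """
    counts = Counter()
    for p in pairs:
        if p.chrom1 == p.chrom2:
            continue  # same-chromosome pairs are ignored in this simple sketch
        # Order the 2 ends so that 15->17 and 17->15 pairs are counted together.
        end_a, end_b = sorted(
            [(p.chrom1, p.pos1 // bin_size), (p.chrom2, p.pos2 // bin_size)]
        )
        counts[(end_a, end_b)] += 1
    return {key: n for key, n in counts.items() if n >= min_support}


# Toy data: 3 pairs bridging chromosomes 15 and 17, 1 ordinary pair, 1 stray pair.
toy_pairs = [
    ReadPair("chr15", 72_027_100, "chr17", 35_742_400),
    ReadPair("chr17", 35_742_550, "chr15", 72_027_300),
    ReadPair("chr15", 72_027_800, "chr17", 35_742_100),
    ReadPair("chr15", 1_000, "chr15", 1_400),
    ReadPair("chr2", 500, "chr9", 900),
]
clusters = interchromosomal_clusters(toy_pairs)
print(clusters)
```

Requiring more than 1 supporting pair filters out the stray chr2/chr9 pair, which is how a threshold of this kind suppresses isolated mapping artifacts.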
In this report, we describe the use of paired-end–read whole-genome sequencing for real-time oncologic diagnosis and describe the genomic details of an oncogenic fusion gene created by an insertional event. Within 7 weeks, we completed the process of library generation, massively parallel sequencing, analysis, and validation of a novel insertional fusion that created a classic PML-RARA bcr3 variant. These findings altered the medical care of this patient, who received ATRA consolidation instead of an allogeneic stem cell transplant. The patient remains in first remission 15 months after her presentation.
RT-PCR was not performed at the referring institution, and results were negative when evaluated at Washington University when the patient was in remission. We did not initially perform RT-PCR with the RNA generated from the cryopreserved sample, because the RNA was severely degraded. Further, because this patient's complex cytogenetics predicted an unfavorable prognosis, it was essential to determine whether the patient had a recognized PML-RARA fusion gene (because t(15;17) supersedes other cytogenetic findings and predicts a favorable outcome in patients treated with ATRA). Fortunately, the sequencing of the patient's tumor genome resolved the conundrum, allowing the RT-PCR results to serve as confirmatory proof of the fusion event (despite RNA degradation) and providing a novel mechanism for its formation, which assured us that the diagnosis was correct.
Alternative laboratory approaches could have been used to detect potential pathogenic RARA rearrangements (eg, nested PCR from a linker-ligated library, long-distance PCR,21 BAC clone screening, targeted 3730 sequencing of the 48.5 kb RARA locus). However, such techniques are labor intensive and can require large amounts of starting material, personalized design, and iterative troubleshooting. Moreover, many of these techniques have success rates not adequate for clinical practice. In contrast, whole-genome sequencing with paired-end libraries can be accomplished with as little as 10 ng of starting DNA, is amenable to an automated “pipeline” strategy, and can consistently deconvolute SNVs, small insertions and deletions, structural variants, and clonality. Further, this approach requires no custom reagents and no foreknowledge of genomic regions that must be assessed for diagnostic accuracy.
This study also confirms that reciprocal RARA-PML fusions are not required for the development of APL. RARA-PML has been proposed to participate in APL pathogenesis, although it is identified in only approximately 67% of APL cases by RT-PCR.22 Complex rearrangements and deletions can lead to unusual RARA-PML transcripts of uncertain significance.22-25 Furthermore, bcr3 RARA-PML does not independently lead to leukemia in a murine leukemia model.26,27 In this patient, the reciprocal fusion RARA-PML was absent, and the alternative RARA-LOXL1 was fused out of frame.
Whole-genome sequencing resolved not only the PML-RARA insertion event but also all other abnormalities observed during routine cytogenetic testing. Loss of chromosomes 6 and 16 was not detected with whole-genome sequencing; rather, we identified an additional inversion and translocation, inv(6)(p22.3;q14.1) and t(6;16)(q22.31;p13.3). This suggests that genetic information on chromosomes 6 and 16 was actually present within the 2 marker chromosomes but that chromosomal banding patterns were disrupted by a 3-way translocation.
FISH has been used to suggest that insertional translocations may occur in t(15;17)–negative promyelocytic leukemia.2,22,28-30 The commercially available dual-fusion dual-probe strategy (Abbott/Vysis, Abbott Park, Illinois) uses large probes (between 239 and 417 kb, schematically described in eFigure 3A and B). However, the minimal required PML insertion region (the promoter/enhancer and exons 1-3) is nearly one-tenth the size of these probes. The large probes (239+ kb) improve sensitivity for conventional t(15;17) detection, but they make accurate diagnosis of small insertional events difficult or even impossible.2,31 Alternative cosmid-based strategies improve the ability to detect small insertions,2 but the reagents are not widely available. Because of these issues, we designed a new set of fosmid-based probes, which are publicly available. Using these probes, we identified further cases of PML-RARA insertional fusions missed by conventional cytogenetics and FISH.
Insertional translocations are likely to be underdiagnosed because of the technical difficulties associated with FISH using conventional probes (as highlighted by this case). This may be especially true of tumor types interrogated with FISH break-apart strategies and tumors that lack an alternative molecular diagnostic assay (eg, Burkitt lymphoma).32
Diagnostic whole-genome sequencing remains cost-prohibitive for universal application in patients with cancer (approximately $40 000 for each tumor/normal pair at the current time). This price has been decreasing rapidly over the last several years, while the expansion of AML-associated genes and mutations (eg, PML-RARA, ETO-AML1, CBFB-MYH11, BCR-ABL, FLT3, NPM1, KRAS, DNMT3A, TET2, IDH1/2, RUNX1, CEBPA) is increasing the cumulative cost of molecular tests that may be relevant for diagnosis and risk-prediction.
Acute promyelocytic leukemia is likely to be an indicator of the variety of mutations that can present with similar morphology and clinical course. APL may be associated with diverse PML-RARA splice variants, unusual translocations/insertions, multiple RARA fusion partners, and even a RARG fusion partner.1,8-12 The diagnostic strength of whole-genome sequencing is that it is a generic and stable platform for detection of mutations, and special approaches are not required for specific diagnostic settings. All classes of mutations are detected in a totally unbiased fashion, allowing for confirmation of a suspected diagnosis, even if caused by a rare or unusual mutation; these data can be obtained and interpreted in a clinically relevant time frame.
The time required to complete whole-genome sequencing is rapidly decreasing. The sequencing timeline for this patient involved 25 days to generate and analyze whole-genome sequencing and an additional 27 days for orthogonal validation; this makes validation a significant but necessary bottleneck in the overall time to complete clinical-grade sequencing (eFigure 4). As costs continue to decline, increasingly deep coverage will become possible (eg, 60× coverage instead of 30×), allowing for improved variant detection that may reduce the need for validation. With these improvements, clinical-grade whole-genome sequencing should soon be possible within 4 weeks of sample collection.
Meaningful diagnostic time frames are dependent on the cancer type being assessed. Some cancers (eg, follicular lymphoma and myelodysplastic syndrome) have long diagnostic windows before patients need therapy; others (eg, breast, lung, colon) have moderately long diagnostic windows (those in which definitive chemoradiotherapy often is not considered until after surgical resection and appropriate healing), and a few (eg, AML, acute lymphoblastic leukemia [ALL], Burkitt lymphoma, blast-phase chronic myelogenous leukemia, small-cell lung cancer) require urgent chemotherapy. The critical decision in the treatment of AML and ALL is not which induction therapy to use (because uniform approaches to remission-induction are currently used for both diseases) but whether patients should receive consolidation therapy with chemotherapeutic approaches or should receive allogeneic transplantation. This decision is generally made within 6 to 8 weeks of initial presentation; for AML and ALL, a 6-week time frame for whole-genome sequencing is therefore clinically relevant. However, to fully use this potentially transformative technology to make informed clinical decisions, standards will have to be developed that allow for CLIA-CAP certification of whole-genome sequencing and for direct reporting of relevant results to treating physicians.
Corresponding Author: Richard K. Wilson, PhD, Genome Institute, Washington University School of Medicine, 4444 Forest Park Blvd, PO Box 8501, St Louis, MO 63108 (email@example.com).
Author Contributions: Drs Ding and Larson had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Welch, Westervelt, Ding, Larson, Klco, Watson, Tomasson, Link, Graubert, DiPersio, Mardis, Ley, Wilson.
Acquisition of data: Klco, Kulkarni, Fulton, Veizer, Vickery, Heath, Watson, Wilson.
Analysis and interpretation of data: Welch, Westervelt, Ding, Larson, Klco, Kulkarni, Wallis, Chen, Payton, Fulton, Veizer, Schmidt, Vickery, Mardis, Ley, Wilson.
Drafting of the manuscript: Welch, Westervelt, Ley.
Critical revision of the manuscript for important intellectual content: Welch, Westervelt, Ding, Larson, Klco, Kulkarni, Wallis, Chen, Payton, Fulton, Veizer, Schmidt, Vickery, Heath, Watson, Tomasson, Link, Graubert, DiPersio, Mardis, Ley, Wilson.
Statistical analysis: Ding, Larson, Wallis, Chen, Schmidt, Vickery.
Obtained funding: Ley, Wilson.
Administrative, technical, or material support: Welch, Klco, Fulton, Vickery, Heath, Watson, Graubert, Ley, Wilson.
Sequence analysis: Ding, Larson, Wallis, Chen, Schmidt, Vickery.
Drs Welch and Westervelt contributed equally to this work.
Conflict of Interest Disclosures: All authors have completed and submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest. Dr Westervelt reported receiving lecture fees from Celgene and Novartis; Dr DiPersio reported receiving consulting and lecture fees from Genzyme. No other authors reported disclosures.
Funding/Support: This study was supported by the Washington University Cancer Genome Initiative, by National Institutes of Health (NIH) grant P01 CA101937, by Barnes-Jewish Hospital Foundation grant 00335-0505-02 (Dr Ley), and by NIH grant U54 HG003079 (Dr Wilson). Dr Welch is a fellow of the Leukemia & Lymphoma Society.
Role of the Sponsor: The funding organizations had no role in design and conduct of the study; the collection, management, analysis, and interpretation of the data; or the preparation, review, or approval of the manuscript.
Additional Contributions: We thank the Washington University Cancer Genomics Initiative for its support and Charles W. Caldwell, MD, PhD, and Carl Freter, MD (both at University of Missouri-Columbia School of Medicine), for referring the patient and for collecting, storing, and contributing the cryopreserved bone marrow cells used for this study. Neither of these individuals received compensation for their contributions.