Figure 1.
Scatterplot matrix and linear regression analysis for intragroup and intergroup heterogeneity of normal tissue and tumor tissue. Representative scatterplots were generated to graphically display the linearity of pairwise relationships within normal tissue groups (A), tumor tissue groups (B), and normal vs tumor tissues (C) by total gene expression profiles for each tissue set. The Pearson correlation coefficient values are indicated by R².

Figure 2.
Hierarchical clustering of normal and tumor tissue based on the complete panel of 12 000 genes. Data sets are presented in a matrix format: each row represents a single transcript and each column an experimental tissue sample. The upper region of the matrix (left panel) was enlarged for viewing. The dendrogram represents similarities in the expression patterns among the experimental samples. The normal tissue (N) and tumor tissue (T) are followed by their corresponding case numbers.

Figure 3.
Coupling bayesian analysis with hierarchical tissue clustering. The average link hierarchical clustering algorithm was performed on 1239 differentially expressed genes with P<.05. Data are presented in a matrix format: each row represents a single transcript and each column an experimental tissue sample. The upper region of the matrix (left panel) was enlarged for viewing. The normal tissue (N) and tumor tissue (T) are followed by their corresponding case numbers.

Table 1.
Percentage of Genes Detected in Normal and Tumor Tissues
Table 2.
Patient Characteristics
Table 3a.
Gene Descriptions
Table 3b.
Gene Descriptions
Table 3c.
Gene Descriptions
Table 3d.
Gene Descriptions
Table 4.
Gene Expression Levels of Head and Neck Squamous Cell Carcinoma–Specific Genes*
1. Kinzler KW, Vogelstein B. Lessons from hereditary colorectal cancer. Cell. 1996;87:159-170.
2. Vogelstein B, Kinzler KW. The multistep nature of cancer. Trends Genet. 1993;9:138-141.
3. Strausberg RL. The Cancer Genome Anatomy Project: new resources for reading the molecular signatures of cancer. J Pathol. 2001;195:31-40.
4. Strausberg RL, Buetow KH, Emmert-Buck MR, Klausner RD. The Cancer Genome Anatomy Project: building an annotated gene index. Trends Genet. 2000;16:103-106.
5. Liang P, Pardee AB. Differential display of eukaryotic messenger RNA by means of the polymerase chain reaction. Science. 1992;257:967-971.
6. Velculescu VE, Zhang L, Vogelstein B, Kinzler KW. Serial analysis of gene expression. Science. 1995;270:484-487.
7. Lockhart DJ, Dong H, Byrne MC, et al. Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat Biotechnol. 1996;14:1675-1680.
8. Villaret DB, Wang T, Dillon D, et al. Identification of genes overexpressed in head and neck squamous cell carcinoma using a combination of complementary DNA subtraction and microarray analysis. Laryngoscope. 2000;110(3 pt 1):374-381.
9. Leethanakul C, Patel V, Gillespie J, et al. Distinct pattern of expression of differentiation and growth-related genes in squamous cell carcinomas of the head and neck revealed by the use of laser capture microdissection and cDNA arrays. Oncogene. 2000;19:3220-3224.
10. Hanna E, Shrieve DC, Ratanatharathorn V, et al. A novel alternative approach for prediction of radiation response of squamous cell carcinoma of head and neck. Cancer Res. 2001;61:2376-2380.
11. Alevizos I, Mahadevappa M, Zhang X, et al. Oral cancer in vivo gene expression profiling assisted by laser capture microdissection and microarray analysis. Oncogene. 2001;20:6196-6204.
12. Long AD, Mangalam HJ, Chan BY, Tolleri L, Hatfield GW, Baldi P. Improved statistical inference from DNA microarray data using analysis of variance and a Bayesian statistical framework: analysis of global gene expression in Escherichia coli K12. J Biol Chem. 2001;276:19937-19944.
13. Sokal RR, Sneath PHA. Principles of Numerical Taxonomy. San Francisco, Calif: WH Freeman; 1963.
14. Eisen MB, Spellman PT, Brown PO, Botstein D. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A. 1998;95:14863-14868.
15. Eisen M. Cluster version 2.11 and TreeView version 1.50 [computer program]. Available at: http://rana.lbl.gov/EisenSoftware.htm. Accessed February 2, 2002.
16. Golub TR, Slonim DK, Tamayo P, et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science. 1999;286:531-537.
17. Alon U, Barkai N, Notterman DA, Gish K, Mack YD, Levine AJ. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc Natl Acad Sci U S A. 1999;96:6745-6750.
18. Sgroi DC, Teng S, Robinson G, LeVangie R, Hudson JR, Elkahloun AG. In vivo gene expression profile analysis of human breast cancer progression. Cancer Res. 1999;59:5656-5661.
19. Alizadeh AA, Eisen MB, Davis RE, et al. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature. 2000;403:503-511.
20. Perou CM, Sorlie T, Eisen MB, et al. Molecular portraits of human breast tumours. Nature. 2000;406:747-752.
21. van 't Veer LJ, Dai H, van de Vijver MJ, et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature. 2002;415:530-536.
22. Arfin SM, Long AD, Ito ET, et al. Global gene expression profiling in Escherichia coli K12: the effects of integration host factor. J Biol Chem. 2000;275:29672-29684.
23. Hanahan D, Weinberg RA. The hallmarks of cancer. Cell. 2000;100:57-70.
24. Koch WM. Clinical implications of biomarkers in head and neck cancer. Curr Oncol Rep. 1999;1:129-137.
25. Smith BD, Haffty BG, Sasaki CT. Molecular markers in head and neck squamous cell carcinoma: their biological function and prognostic significance. Ann Otol Rhinol Laryngol. 2001;110:221-228.
26. Smith BD, Smith GL, Carter D, et al. Molecular marker expression in oral and oropharyngeal squamous cell carcinoma. Arch Otolaryngol Head Neck Surg. 2001;127:780-785.
27. Helliwell TR. Molecular markers of metastasis in squamous carcinomas. J Pathol. 2001;194:289-293.
28. Forastiere A, Koch W, Trotti A, Sidransky D. Head and neck cancer. N Engl J Med. 2001;345:1890-1900.
29. Zhu H, Cong JP, Mamtora G, Gingeras T, Shenk T. Cellular gene expression altered by human cytomegalovirus: global monitoring with oligonucleotide arrays. Proc Natl Acad Sci U S A. 1998;95:14470-14475.
30. Fischer H, Stenling R, Rubio C, Lindblom A. Colorectal carcinogenesis is associated with stromal expression of COL11A1 and COL5A2. Carcinogenesis. 2001;22:875-878.
31. Nagase T, Seki N, Ishikawa K, Tanaka A, Nomura N. Prediction of the coding sequences of unidentified human genes, V: the coding sequences of 40 new genes (KIAA0161-KIAA0200) deduced by analysis of cDNA clones from human cell line KG-1. DNA Res. 1996;3:17-24.
32. Sugita M, Geraci M, Gao B, et al. Combined use of oligonucleotide and tissue microarrays identifies cancer/testis antigens as biomarkers in lung carcinoma. Cancer Res. 2002;62:3971-3979.
33. Villaret DB, Wang T, Dillon D, et al. Identification of genes overexpressed in HNSCC using a combination of complementary DNA subtraction and microarray analysis. Laryngoscope. 2000;110:374-381.
34. Belbin T, Singh B, Berber I, et al. Molecular classification of HNSCC using cDNA microarrays. Cancer Res. 2002;62:1184-1190.
35. Al Moustafa A, Alaoui-Jamali M, Batist G, et al. Identification of genes associated with head and neck carcinogenesis by cDNA microarray comparison between matched primary normal epithelial and squamous carcinoma cells. Oncogene. 2002;21:2634-2640.
36. Bonner RF, Emmert-Buck M, Cole K, et al. Laser capture microdissection: molecular analysis of tissue. Science. 1997;278:1481, 1483.
37. Sugiyama Y, Sugiyama K, Hirai Y, Akiyama F, Hasumi K. Microdissection is essential for gene expression profiling of clinically resected cancer tissues. Am J Clin Pathol. 2002;117:109-116.
Original Article
July 2003

Tissue-Specific Gene Expression of Head and Neck Squamous Cell Carcinoma In Vivo by Complementary DNA Microarray Analysis

Author Affiliations

From the Division of Head and Neck Surgery and Oncology, Department of Otolaryngology, New York University School of Medicine, New York (Drs Sok, Kuriakose, Pearlman, DeLacure, and Chen); and Department of Ophthalmology, The Jules Stein Eye Institute, University of California, Los Angeles (Dr Mahajan). The authors have no relevant financial interest in this article.

Arch Otolaryngol Head Neck Surg. 2003;129(7):760-770. doi:10.1001/archotol.129.7.760
Abstract

Objectives  To identify distinct gene expression profiles of human head and neck squamous cell carcinomas (HNSCCAs) using complementary DNA (cDNA) microarray analysis and to create a preliminary, comprehensive database of HNSCCA gene expression.

Patients and Methods  Nine patients with histologically confirmed HNSCCAs, staged according to the American Joint Committee on Cancer, were enrolled. The HNSCCA tumor tissue and normal mucosal tissue were harvested at the time of surgery. A cDNA library was constructed from the paired fresh-frozen human surgical specimens of HNSCCAs and nonmalignant epithelial tissues. Biotinylated RNA was transcribed from the cDNA library and hybridized to high-density microarrays containing approximately 12 000 human genes. Altered gene expression of HNSCCAs was identified by comparison to corresponding normal mucosal tissues after a bayesian statistical analysis of variance. Results were analyzed using the gene database of the National Institutes of Health. Hierarchical clustering of the genomic data sets was determined by similarity metrics based on Pearson correlation.

Results  Hierarchical clustering analysis revealed that the gene expression profiles obtained from the nonselected panel of 12 000 genes could distinguish the tumors from nonmalignant tissues. Gene expression changes were reproducibly observed in 227 genes representing previously identified chemokines, tumor suppressors, differentiation markers, matrix molecules, membrane receptors, and transcription factors that correlated with neoplasia, including 46 previously uncharacterized genes. Moreover, significant expression of the collagen type XI α1 gene and a novel gene was reproducibly observed in all 9 tumors, whereas these genes were virtually undetectable in their corresponding, adjacent nonmalignant tissues.

Conclusions  Complementary DNA microarray analysis of human HNSCCAs has produced a preliminary, comprehensive database of tumor-specific gene expression profiles and provided important insights into modeling gene expression changes implicated in carcinogenesis. A large-scale analysis of gene expression carries the future potential of identifying sensitive molecular markers for early tumor detection, prognosis, and novel targets for interceptive therapeutics.

AN OVERWHELMING body of evidence suggests that the phenotypic diversity of head and neck squamous cell carcinomas (HNSCCAs) is preceded or accompanied by corresponding genotypic changes. According to an established model by Kinzler and Vogelstein,1,2 the development of neoplasia is a multistep process involving the accumulation of at least 7 genetic events for its progression into malignancy, primarily affecting various tumor suppressor genes and signaling tyrosine kinases.2 Such modified genomes presumably lead to aberrations in the normal transcriptional program, resulting in the substantial deregulation of a battery of genes. Thus far, the Cancer Genome Anatomy Project index of tumor genes has already classified more than 40 000 genes that are directly or indirectly active in 1 or more cancers.3,4

Nevertheless, systematic investigations into this complex molecular circuitry of carcinogenesis have been hampered by conventional techniques such as RNA Northern blot hybridization and ribonuclease protection assays, which are both tedious and time-consuming and typically screen one gene at a time. Somewhat more sophisticated methods, such as differential display5 and Serial Analysis of Gene Expression,6 have been used to screen larger numbers of expressed genes. However, technical limitations make these techniques unsuitable for a comprehensive genomic survey.

Recently, technological advances have resulted in the emergence of an innovative technique based on the principle of complementary DNA (cDNA) library hybridization to high-density cartesian coordinate arrays containing ordered oligonucleotides of defined genes and expression sequence tags (ESTs), so-called microarrays or gene chips.7 Such an array platform permits simultaneous monitoring of thousands of gene expression events in a single experiment, providing the researcher with a new arsenal to analyze underlying pathomechanisms on a genome-wide scale. Although this approach has been applied to solid tumors of the head and neck, prior studies were limited by the paucity of genes loaded on the array (590-1190 genes),8-10 had technical limitations due to the lower sensitivity of the nylon-based arrays,9,10 or had statistical limitations due to sample size or inter-sample variation.9-11 As such, comprehensive gene expression profiles that offer prognostic value or the identification of tumor marker genes that are exclusively expressed in HNSCCA tissues still remain elusive.

In this study, we demonstrate the application of the powerful 12 000-gene, photolithographic microarray based on gene transcripts hybridized from 9 human surgical specimens of HNSCCA and corresponding nonmalignant epithelial tissues. The resulting data revealed intriguing insights into the biological pathways of upper aerodigestive tract malignancy. Hierarchical clustering analysis revealed that the gene expression profiles obtained from the raw panel of 12 000 genes alone were sufficient to distinguish most tumors from nonmalignant tissues. Moreover, significant expression patterns were observed with 2 genes that are reproducibly expressed in all 9 tumors, whereas these 2 genes were virtually undetectable in normal epithelial tissue. Using novel statistical software, we developed a model for genes involved in tumorigenesis with the hope that this approach may eventually identify specific therapeutic avenues for prevention, prognosis, and treatment of HNSCCAs.

METHODS
TISSUE COLLECTION AND RNA ISOLATION

Surgical biopsy specimens were immediately stabilized by freezing the tissue in liquid nitrogen in the operating room. For RNA extraction, tissues were cut to approximately 1 cm³, suspended in TRIzol reagent (tissue:TRIzol = 1:15, wt/vol) RNA extraction buffer, then immediately homogenized with a Polytron handheld homogenizer (model PT1200C; Brinkmann Instruments, Westbury, NY) at the maximum setting for 30 seconds. Total RNA was purified according to published TRIzol extraction protocols (QIAGEN, Valencia, Calif). After the ethanol precipitation step in the TRIzol extraction procedure, an additional cleanup step using an isolation kit (RN-easy Total RNA; QIAGEN) was incorporated for better yields of labeled complementary RNA (cRNA) during the in vitro transcription-labeling reaction. The RNA concentration was determined by spectroscopy, and the quality was confirmed by agarose gel electrophoresis in 3-(N-morpholino)propanesulfonic acid (MOPS) buffer, according to well-established protocols.

SYNTHESIS OF cDNA FROM TOTAL RNA AND cRNA PREPARATION

Double-stranded cDNA was synthesized from total RNA using a custom synthesis kit (SuperScript II Double-Stranded cDNA Synthesis Kit; Life Technologies, Carlsbad, Calif). Following ethanol extraction, in vitro transcription reactions were performed with a labeling kit (BioArray HighYield RNA Transcript Labeling Kit; Enzo, New York, NY) according to the manufacturer's protocol. Purified, labeled cRNA was quantified by spectrophotometric analysis, qualitatively analyzed for size distribution by gel electrophoresis, and fragmented to 30 to 60 base fragments with Tris-acetate (pH 8.1, 40mM), potassium acetate (100mM), and magnesium acetate (30mM) in a 20-µL volume heated for 35 minutes to 94°C.

OLIGONUCLEOTIDE ARRAY HYBRIDIZATION

U95A human microarrays (GeneChip; Affymetrix, Santa Clara, Calif) were prehybridized with hybridization buffer (1.0M sodium chloride, 100mM 4-morpholinoethane sulfonic acid (MES) [pH 6.6], 20mM EDTA, and 0.01% Tween 20) at 45°C for 10 minutes with rotation. Fragmented cRNA (15 µg for standard GeneChip arrays, 5 µg for test chip), sonicated herring sperm DNA (0.1 mg/mL final), GeneChip Eukaryotic Hybridization Controls (Affymetrix), and acetylated bovine serum albumin (BSA) (0.5 mg/mL) were then added to the hybridization buffer and incubated with the arrays for 16 hours at 45°C after which the probe array was washed (Fluidics station; Affymetrix) and amplified with staining solution (100mM MES, 0.1M sodium chloride, 0.01% Tween 20, and 0.005% antifoam) containing 10 µg/mL of streptavidin-phycoerythrin (Molecular Probes, Eugene, Ore). Acetylated BSA (2 µg/µL final) was applied for 10 minutes at 25°C. After 2 washes, the signal was amplified with a second staining solution (100mM MES, 0.1M sodium chloride, 0.01% Tween 20, and 0.005% antifoam) containing 3 µg/mL of anti–streptavidin biotinylated goat antibody (Vector Laboratories, Burlingame, Calif), normal goat IgG (0.1 mg/mL final), and acetylated BSA (2 µg/µL final) for 10 minutes at 25°C. Additional streptavidin-phycoerythrin staining at 25°C and a final wash were performed.

DATA ANALYSIS

The resulting arrays were scanned twice in the GeneChip system confocal scanner (Hewlett-Packard GeneArray Scanner; Hewlett Packard Co, Palo Alto, Calif). To obtain average difference intensities, data analysis was first performed using computer software (GeneChip Expression Analysis Software version 3.3; Affymetrix). The Cyber-T software package (available from the Internet at http://www.igb.uci.edu/servers/cybert/) was then used to determine average intensity values of genes across 9 independent patient sets and to compare values for control and tumor specimens. This list was further restricted based on whether genes were consistently present or consistently absent in the 9 cases. Using a bayesian statistical analysis, significant changes were ranked by the assignment of P values.12 Genes displaying intensities below 30 were empirically considered absent. Fold changes in gene expression were calculated by comparing average intensities in normal tissue with those in HNSCCA lesions. For genes that were entirely absent in one subclass of specimens but present in the others, relative fold changes were assigned by comparing intensities to a normalized value determined by the Affymetrix software. Hierarchical clustering of the genomic data sets was determined by similarity metrics based on the Pearson correlation equation.13,14 Implementation of the clustering algorithms and dendrogram graphics was performed with the combined Cluster and TreeView15 software package under uncentered correlation and average linkage clustering settings.
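The absent call and fold-change step described above can be illustrated with a minimal Python sketch. The threshold of 30 comes from the text; the function names are hypothetical, and the `floor` argument is only a stand-in for the normalized value assigned by the Affymetrix software, which the text does not specify.

```python
from statistics import mean

ABSENT_THRESHOLD = 30.0  # from the text: intensities below 30 are treated as absent


def consistently_present(intensities, threshold=ABSENT_THRESHOLD):
    """True if the gene is at or above the threshold in every sample."""
    return all(v >= threshold for v in intensities)


def consistently_absent(intensities, threshold=ABSENT_THRESHOLD):
    """True if the gene is below the threshold in every sample."""
    return all(v < threshold for v in intensities)


def fold_change(normal, tumor, floor=ABSENT_THRESHOLD):
    """Ratio of average tumor intensity to average normal intensity.

    Averages below `floor` are raised to `floor` so that an absent gene
    does not produce a division by a near-zero value. Using the absent
    threshold as the floor is an assumption made for illustration only.
    """
    normal_avg = max(mean(normal), floor)
    tumor_avg = max(mean(tumor), floor)
    return tumor_avg / normal_avg
```

For a gene that is absent in normal tissue and strongly expressed in tumor, `fold_change` then reports a bounded ratio rather than an arbitrarily large one.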

RESULTS

The U95A human gene microarray is composed of oligonucleotide probe sets representing approximately 12 000 full-length, partially sequenced EST genes, including transcripts representing constitutively expressed housekeeping genes for internal control and normalization among samples. Of the sequences represented, approximately 5500 genes (45%) were detected in cRNA produced from snap frozen biopsy specimens of both HNSCCA and control epithelial tissues (Table 1), indicating abundant representation.11 The amplified cRNA generated no nonspecific or unusual hybridization patterns and had no appreciable degradation with respect to its 5′ region. The transcripts that represent the internal control genes were detected in abundance, suggesting that the integrity of the tissue RNA was of high quality (data not shown).

ANALYSIS OF TISSUE HETEROGENEITY AND PHYSIOLOGIC VARIATION

In contrast to the clonal nature of tissue culture experiments, investigations of gross biopsy specimens in vivo are often confounded by factors such as patient-to-patient variation, the proximity of "normal" control tissue to disease sites, and overall heterogeneity of cell populations, all of which may affect gene expression profiles. To measure the extent of heterogeneity and physiologic variation among samples, pairwise comparisons of total gene expression levels of all 18 biopsy specimens were performed based on the Pearson correlation coefficient ($R^2 = \frac{[n\sum xy - (\sum x)(\sum y)]^2}{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}$). Scatterplots of normal tissue vs normal tissue, tumor tissue vs tumor tissue, and normal tissue vs tumor tissue displayed overall correlation coefficients of 0.76, 0.68, and 0.64, respectively (Figure 1). Although these values are modest by clonal standards, the regression curves and R² values consistently indicate that gene expression variability is greatest between normal and tumor tissues, followed closely by the overall variability observed within the tumor group, whereas the normal tissues independently displayed the least variability among samples. This intriguing result suggests that the gene expression patterns between normal tissues and tumors are most divergent and that the magnitude of those differences may be sufficient to distinguish tumor-related genes from "biological noise" and the inherent patient-to-patient genetic variability.
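As an illustration of the pairwise comparison, the following Python sketch computes the Pearson correlation coefficient r from the raw sums and squares it to obtain R². The function names are hypothetical; `x` and `y` stand for the expression values of the same genes in two samples.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient computed from raw sums."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_xx = sum(a * a for a in x)
    sum_yy = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    if denominator == 0:
        raise ValueError("correlation is undefined when a sample has zero variance")
    return numerator / denominator


def r_squared(x, y):
    """Squared Pearson correlation, the quantity usually reported as R^2."""
    return pearson_r(x, y) ** 2
```

Because R² discards the sign of r, two samples with perfectly opposite expression patterns score the same as two identical samples; for expression data from related tissues this is rarely a practical concern.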

A natural basis for organizing gene expression data is to group genes with similar patterns of expression and to decipher common distinguishing features of the cancer tissue. The average linkage hierarchical clustering algorithm was performed to group genes on the basis of similarity in the pattern with which their expression varied throughout all samples. The same unbiased approach was used to group the 18 experimental tissues (from 9 tumors and 9 controls) on the basis of overall similarity in their gene expression profiles. We focused first on the nonexcluded panel of the entire 12 000 gene set. The input data were based on the average difference intensity gene expression value as determined by Affymetrix software, and the relationships among samples were summarized in a dendrogram (Figure 2). The results reveal that most normal tissues (7/9) group together and, likewise, most tumor tissues (6/9) cosegregated into a separate cluster, suggesting different gene expression patterns between the 2 major groups. Strikingly, the 2 distinct tumors that clustered furthermost from the common tumor group represented the only 2 cases with a prior history of HNSCCA among this series (Table 2). Although previous microarray studies have reported gene-based tissue segregation that distinguishes various pathophysiologic or histopathologic phenotypes from control tissue,16-21 its discriminatory power was based on a preselected pool of differentially expressed genes. Our study suggests that the analysis of an unbiased panel of 12 000 genes alone may be sufficient for the detection of the major subtypes of HNSCCA tissue.
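The sample-clustering step can be sketched in a few lines of Python. This is a textbook average linkage procedure using one minus the uncentered correlation as the distance; it is not claimed to reproduce the internals of the Cluster software, and the sample labels and expression values below are invented toy data.

```python
from itertools import combinations
from math import sqrt


def uncentered_correlation(x, y):
    """Correlation computed without subtracting the means (cosine similarity)."""
    numerator = sum(a * b for a, b in zip(x, y))
    return numerator / sqrt(sum(a * a for a in x) * sum(b * b for b in y))


def average_linkage(profiles):
    """Agglomerative clustering; returns the merges in the order they occur.

    `profiles` maps a sample label to its list of expression values.
    Each merge is a tuple (members_a, members_b, average_distance).
    """
    labels = list(profiles)
    dist = {
        frozenset((a, b)): 1.0 - uncentered_correlation(profiles[a], profiles[b])
        for a, b in combinations(labels, 2)
    }

    def linkage(pair):
        # Average linkage: mean distance over all cross-cluster sample pairs.
        ca, cb = pair
        total = sum(dist[frozenset((a, b))] for a in ca for b in cb)
        return total / (len(ca) * len(cb))

    clusters = [(label,) for label in labels]
    merges = []
    while len(clusters) > 1:
        ca, cb = min(combinations(clusters, 2), key=linkage)
        merges.append((ca, cb, linkage((ca, cb))))
        clusters = [c for c in clusters if c not in (ca, cb)] + [ca + cb]
    return merges


# Toy data (hypothetical): two normal-like and two tumor-like profiles.
toy = {
    "N1": [10, 1, 1],
    "N2": [9, 1, 2],
    "T1": [1, 10, 8],
    "T2": [2, 9, 9],
}
merges = average_linkage(toy)
```

In the toy data the two normal-like profiles merge first and the two tumor-like profiles second, which is the kind of segregation the dendrogram summarizes. The quadratic search over cluster pairs is adequate for 18 samples but would be too slow for clustering thousands of genes.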

IDENTIFICATION OF TRANSCRIPTS DIFFERENTIALLY EXPRESSED IN HNSCCAs

A common approach to identifying significant gene expression changes is based on selection by fold change relative to the reference tissue. In our study, a pairwise analysis of all 9 tumors with corresponding control tissues revealed an average of 1790 genes that displayed a greater than 3-fold expression change (data not shown). However, many of these observations were changes with high interexperimental variance and likely represented artifacts. An alternative approach is based on applying the t test within a bayesian statistical framework and calculating P values rather than fold changes to rank significant expression changes. This method was preferred, since it more conservatively identifies genes with expression changes and has been shown to reduce false-positive results in microarray studies with a limited number of replicates.12,22 In short, the bayesian algorithm generates rank-ordered lists of the genes that are most reproducibly different among groups.
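The idea behind a regularized t test can be sketched as follows. This is an illustration in the spirit of the bayesian framework cited above, not the Cyber-T implementation: the variance formula, the `background_var` argument (a variance borrowed from other genes), and the `prior_weight` default are all assumptions made for this sketch.

```python
from math import sqrt
from statistics import mean, variance


def regularized_variance(sample_var, n, background_var, prior_weight):
    """Blend a gene's own variance with a background variance.

    Assumed form: (v0 * s0^2 + (n - 1) * s^2) / (v0 + n - 2), where v0 is
    the prior weight and s0^2 the background variance.
    """
    return (prior_weight * background_var + (n - 1) * sample_var) / (prior_weight + n - 2)


def plain_t(normal, tumor):
    """Ordinary unequal-variance t statistic, for comparison."""
    se = sqrt(variance(normal) / len(normal) + variance(tumor) / len(tumor))
    return (mean(tumor) - mean(normal)) / se


def regularized_t(normal, tumor, background_var, prior_weight=10):
    """t statistic using regularized variances (hypothetical default weight)."""
    v_normal = regularized_variance(variance(normal), len(normal), background_var, prior_weight)
    v_tumor = regularized_variance(variance(tumor), len(tumor), background_var, prior_weight)
    se = sqrt(v_normal / len(normal) + v_tumor / len(tumor))
    return (mean(tumor) - mean(normal)) / se
```

With few replicates, a gene whose measurements happen to agree closely gets a very large ordinary t statistic. Pulling its variance toward the background value shrinks that statistic, which is why such an approach tends to rank genes more conservatively.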

After implementation of the combined t test–bayesian statistical analysis, significant changes in messenger RNA levels were detected in 615 genes (P<.01). Further increasing the stringency criteria in the selection of genes to those with P<.001 generated a register of 227 genes (Table 3). Moreover, 30 genes were identified with P<.00001 (Table 3). The molecular taxonomy of the 227 gene data set revealed a substantially enriched ratio of genes encoding extracellular proteins (25.1%), followed by cytosolic genes (22.5%), membrane-associated genes (17.6%), nuclear genes (13.7%), and mitochondrial genes (1.3%). In addition, we identified more than 30 previously characterized oncogenes (Table 3) and 45 HNSCCA-associated sequences (19.8%) that represent novel genes or ESTs.

According to the holistic model of cancer by Hanahan and Weinberg,23 a malignant cell has to essentially acquire 6 biological alterations to dictate pathogenesis, namely self-sufficiency in proliferative growth signals, insensitivity to growth inhibitory signals, evasion of apoptosis, limitless replicative potential, sustained angiogenesis, and the induction of invasion. In support of this model, our study shows that a considerable number of the 227 HNSCCA-related genes can directly or indirectly feed into any number of these pathways (Table 3). Two prominent features of our data were the high representation of extracellular genes, most of which encode matrix proteins (56%), and a global down-regulation of extracellular genes involved in inflammation. Although the pathophysiologic role of such extracellular genes in HNSCCA is largely unknown, it is intriguing to speculate that this cluster of genes may represent novel molecular signatures specific for squamous cancers.24-28

COUPLING BAYESIAN ANALYSIS WITH HIERARCHICAL TISSUE CLUSTERING

To test the efficacy of the bayesian statistical framework, hierarchical clustering was performed based on 1239 differentially expressed genes gated at P<.05 (Figure 3). Rather than merely recapitulating the affirmative trends seen in Figure 2, the combined algorithm resulted in a near-perfect clustering of tumor (8/9) and normal (8/9) tissues. The reproducibility between the 2 experiments was clearly evident. In both experiments, the tumors farthest from the common tumor group represented cases with prior histories of HNSCCA. Altogether, these improved results under bayesian analysis suggest that a highly precise tumor classification system can be generated by genes selected from a pool of less than 25% of the original sequences sampled.

COMMENT

Studies demonstrate that gene detection by microarrays correlates with standard methods such as Northern blot hybridization, reverse transcriptase–polymerase chain reaction, and real-time quantitative polymerase chain reaction.11,29 These quantitative and qualitative aspects of microarrays can be exploited to screen for putative molecular markers of HNSCCA. An ideal candidate for such a marker would be a transcript that is robustly expressed in all HNSCCA tissue and absent from normal tissue. In our study, we identified 2 genes that fulfilled these criteria. For both genes, significant expression was reproducibly observed in all 9 tumors, whereas these genes were virtually undetectable in their corresponding, adjacent nonmalignant tissues (Table 4). The first gene was the human collagen type XI α1 gene (COL11A1), which encodes an essential component of the interstitial extracellular matrix. Consistent with our study, this gene was detected in 27 (79%) of 34 colorectal cancer tissues, whereas it was not present in normal colon tissue.30 Its role in malignancy, however, still remains unclear. The second gene (accession No. D79998) was a cDNA clone identified in the human myeloblast cell line KG-1.31 Interestingly, this gene is unrelated to any previously reported genes. Its 3.6-kilobase sequence reveals only partial homology to human chromosome 17 and incomplete homology with the remainder of the current human genome database. Therefore, since it is highly likely to represent a novel gene, we named this putative gene HSCA-1 (head and neck squamous carcinoma–associated gene). In the future, we will design specific oligonucleotide primer sets for HSCA-1 and COL11A1 to validate these findings by reverse transcriptase–polymerase chain reaction on additional specimens of HNSCCAs.

Microarray analysis has previously been used to determine gene expression in several different cancers, such as colon,17 breast,18,20,21 B-cell lymphoma,19 and, more recently, lung32 and head and neck cancers.33-35 Through the use of this technique, Villaret et al33 identified 13 genes that are overexpressed in HNSCCA when compared with normal tissue, 4 of which were novel. Using cluster analysis, Belbin et al34 found 337 genes that distinguish 2 subgroups of HNSCCAs. They also identified 4 EST segments that clustered to only one subgroup.

Through cDNA microarray analysis and using the combined t test–bayesian statistical analysis, we were able to identify 227 genes, including 46 EST segments, that were either overexpressed or underexpressed in HNSCCA when compared with normal epithelial tissue of the same patient. In a similar study, Al Moustafa et al35 compared HNSCCA samples with normal tissue from the same patient and identified 213 statistically significant genes, 91 overexpressed and 122 underexpressed. Two genes of note that were underexpressed in HNSCCA were Claudin-7 and Connexin 31.1. In our study, we also found Claudin-7 to be underexpressed. However, we identified COL11A1 and the novel HSCA-1 gene as being overexpressed solely in HNSCCAs.

Gene expression profiling of tumors can be performed using either RNA extracted from whole tumor tissue or RNA from isolated tumor cells. Pure populations of tumor cells and normal cells can be isolated using the technique of laser capture microdissection.11,36 This strategy is essential to identifying unique tumor markers, which may serve as potential therapeutic targets of cancer cells.37 However, because the clinical behavior of a particular tumor depends on the characteristics of both tumor cells and stromal tissue, such as density of angiogenesis and lymphocyte infiltration, gene expression profiling of whole tumor tissue is required to elucidate prognostic markers and biologic variations.

The ability to obtain quantitative information from multiple transcriptional profiles represents an exceptionally powerful means to explore cancer biology and to generate gene databases relevant to HNSCCA. Despite the challenges of heterogeneity in patient tissues, genes expected to play a role in tumorigenesis were selected, organized by their predicted protein localization, and clustered based on similarity. Such a model represents an exciting venture for further investigation, yet it is critical to be aware of the limitations imposed by microarray studies. The observed gene expression may reflect either increased transcription or decreased mRNA stability in the reference tissue. Certain transcripts may not necessarily encode bioactive proteins. Moreover, microarray experiments are vulnerable to problems of statistical inference because many genes will show significant changes in gene expression by chance alone. Therefore, to generate biologically meaningful data, it was necessary to use additional statistical algorithms such as the bayesian analysis.12
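
The scale of the chance findings described above can be illustrated with a short calculation. The following Python sketch is not the analysis used in this study; it assumes, for illustration only, that all genes are tested independently, that no gene is truly differentially expressed, and that P values are uniformly distributed under that null hypothesis. The array size (12 000 genes) and threshold (P<.05) are taken from the figure legends; the function names are hypothetical.

```python
import random


def expected_false_positives(n_genes: int, alpha: float) -> float:
    """Expected number of genes passing the threshold if no gene truly differs."""
    return n_genes * alpha


def simulate_null_hits(n_genes: int, alpha: float, seed: int = 0) -> int:
    """Count null genes whose uniformly distributed P value falls below alpha.

    Assumption: genes are independent, which real expression data are not.
    """
    rng = random.Random(seed)
    return sum(rng.random() < alpha for _ in range(n_genes))


N_GENES = 12000  # array size given in the Figure 2 legend
ALPHA = 0.05     # significance threshold given in the Figure 3 legend

expected = expected_false_positives(N_GENES, ALPHA)
observed = simulate_null_hits(N_GENES, ALPHA)

print(f"expected by chance: {expected:.0f}")  # 12000 x 0.05 = 600
print(f"one simulated null experiment: {observed}")
```

Under these simplified assumptions, roughly 600 of 12 000 genes would pass an uncorrected P<.05 threshold even if none were truly altered, which is why an unadjusted threshold alone is insufficient and additional statistical treatment is needed.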

These caveats notwithstanding, DNA microarray analysis was an effective and efficient screening method for identifying global gene expression profiles. Several genes previously known to be associated with cancer were identified, supporting our statistical and experimental approach. In addition, 2 genes exclusively expressed in HNSCCA were identified: an isoform of the collagen gene observed in colon cancers and a putative novel gene, each representing a potential target for further investigation. Taken together, quantitative analysis of gene expression by cDNA microarray analysis of human HNSCCA has produced a preliminary, comprehensive database of tumor-specific gene expression profiles and provides important insights into modeling gene expression changes implicated in carcinogenesis. A large-scale analysis of gene expression carries the future potential of identifying sensitive molecular markers for early tumor detection and novel targets for interceptive therapeutics.

Article Information

Corresponding author and reprints: Fang-An Chen, MD, PhD, Department of Otolaryngology, New York University School of Medicine, Suite 7U, Skirball Building, 530 First Ave, New York, NY 10016 (e-mail: fangan.chen@med.nyu.edu).

Accepted for publication January 15, 2003.

This study was supported in part by the George E. Hall Research Fund. Dr Sok was partially supported by a Master Student Training Program grant from the National Institutes of Health, Bethesda, Md.

This study was presented at the annual meeting of the American Head and Neck Society, Boca Raton, Fla, May 11-13, 2002.

We gratefully acknowledge the assistance of J. Denis Heck, PhD, and Kim Nguyen, PhD, with the microarrays.

References
1. Kinzler KW, Vogelstein B. Lessons from hereditary colorectal cancer. Cell. 1996;87:159-170.
2. Vogelstein B, Kinzler KW. The multistep nature of cancer. Trends Genet. 1993;9:138-141.
3. Strausberg RL. The Cancer Genome Anatomy Project: new resources for reading the molecular signatures of cancer. J Pathol. 2001;195:31-40.
4. Strausberg RL, Buetow KH, Emmert-Buck MR, Klausner RD. The Cancer Genome Anatomy Project: building an annotated gene index. Trends Genet. 2000;16:103-106.
5. Liang P, Pardee AB. Differential display of eukaryotic messenger RNA by means of the polymerase chain reaction. Science. 1992;257:967-971.
6. Velculescu VE, Zhang L, Vogelstein B, Kinzler KW. Serial analysis of gene expression. Science. 1995;270:484-487.
7. Lockhart DJ, Dong H, Byrne MC, et al. Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat Biotechnol. 1996;14:1675-1680.
8. Villaret DB, Wang T, Dillon D, et al. Identification of genes overexpressed in head and neck squamous cell carcinoma using a combination of complementary DNA subtraction and microarray analysis. Laryngoscope. 2000;110(3 pt 1):374-381.
9. Leethanakul C, Patel V, Gillespie J, et al. Distinct pattern of expression of differentiation and growth-related genes in squamous cell carcinomas of the head and neck revealed by the use of laser capture microdissection and cDNA arrays. Oncogene. 2000;19:3220-3224.
10. Hanna E, Shrieve DC, Ratanatharathorn V, et al. A novel alternative approach for prediction of radiation response of squamous cell carcinoma of head and neck. Cancer Res. 2001;61:2376-2380.
11. Alevizos I, Mahadevappa M, Zhang X, et al. Oral cancer in vivo gene expression profiling assisted by laser capture microdissection and microarray analysis. Oncogene. 2001;20:6196-6204.
12. Long AD, Mangalam HJ, Chan BY, Tolleri L, Hatfield GW, Baldi P. Improved statistical inference from DNA microarray data using analysis of variance and a Bayesian statistical framework: analysis of global gene expression in Escherichia coli K12. J Biol Chem. 2001;276:19937-19944.
13. Sokal RR, Sheath PHA. Principles of Numerical Taxonomy. San Francisco, Calif: WH Freeman; 1963.
14. Eisen MB, Spellman PT, Brown PO, Botstein D. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A. 1998;95:14863-14868.
15. Eisen M. Cluster version 2.11 and TreeView version 1.50 [computer program]. Available at: http://rana.lbl.gov/EisenSoftware.htm. Accessed February 2, 2002.
16. Golub TR, Slonim DK, Tamayo P, et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science. 1999;286:531-537.
17. Alon U, Barkai N, Notterman DA, Gish K, Mack YD, Levine AJ. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc Natl Acad Sci U S A. 1999;96:6745-6750.
18. Sgroi DC, Teng S, Robinson G, LeVangie R, Hudson JR, Elkahloun AG. In vivo gene expression profile analysis of human breast cancer progression. Cancer Res. 1999;59:5656-5661.
19. Alizadeh AA, Eisen MB, Davis RE, et al. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature. 2000;403:503-511.
20. Perou CM, Sorlie T, Eisen MB, et al. Molecular portraits of human breast tumours. Nature. 2000;406:747-752.
21. van 't Veer LJ, Dai H, van de Vijver MJ, et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature. 2002;415:530-536.
22. Arfin SM, Long AD, Ito ET, et al. Global gene expression profiling in Escherichia coli K12: the effects of integration host factor. J Biol Chem. 2000;275:29672-29684.
23. Hanahan D, Weinberg RA. The hallmarks of cancer. Cell. 2000;100:57-70.
24. Koch WM. Clinical implications of biomarkers in head and neck cancer. Curr Oncol Rep. 1999;1:129-137.
25. Smith BD, Haffty BG, Sasaki CT. Molecular markers in head and neck squamous cell carcinoma: their biological function and prognostic significance. Ann Otol Rhinol Laryngol. 2001;110:221-228.
26. Smith BD, Smith GL, Carter D, et al. Molecular marker expression in oral and oropharyngeal squamous cell carcinoma. Arch Otolaryngol Head Neck Surg. 2001;127:780-785.
27. Helliwell TR. Molecular markers of metastasis in squamous carcinomas. J Pathol. 2001;194:289-293.
28. Forastiere A, Koch W, Trotti A, Sidransky D. Head and neck cancer. N Engl J Med. 2001;345:1890-1900.
29. Zhu H, Cong JP, Mamtora G, Gingeras T, Shenk T. Cellular gene expression altered by human cytomegalovirus: global monitoring with oligonucleotide arrays. Proc Natl Acad Sci U S A. 1998;95:14470-14475.
30. Fischer H, Stenling R, Rubio C, Lindblom A. Colorectal carcinogenesis is associated with stromal expression of COL11A1 and COL5A2. Carcinogenesis. 2001;22:875-878.
31. Nagase T, Seki N, Ishikawa K, Tanaka A, Nomura N. Prediction of the coding sequences of unidentified human genes, V: the coding sequences of 40 new genes (KIAA0161-KIAA0200) deduced by analysis of cDNA clones from human cell line KG-1. DNA Res. 1996;3:17-24.
32. Sugita M, Geraci M, Gao B, et al. Combined use of oligonucleotide and tissue microarrays identifies cancer/testis antigens as biomarkers in lung carcinoma. Cancer Res. 2002;62:3971-3979.
33. Villaret DB, Wang T, Dillon D, et al. Identification of genes overexpressed in HNSCC using a combination of complementary DNA subtraction and microarray analysis. Laryngoscope. 2000;110:374-381.
34. Belbin T, Singh B, Berber I, et al. Molecular classification of HNSCC using cDNA microarrays. Cancer Res. 2002;62:1184-1190.
35. Al Moustafa A, Alaoui-Jamali M, Batist G, et al. Identification of genes associated with head and neck carcinogenesis by cDNA microarray comparison between matched primary normal epithelial and squamous carcinoma cells. Oncogene. 2002;21:2634-2640.
36. Bonner RF, Emmert-Buck M, Cole K, et al. Laser capture microdissection: molecular analysis of tissue. Science. 1997;278:1481, 1483.
37. Sugiyama Y, Sugiyama K, Hirai Y, Akiyama F, Hasumi K. Microdissection is essential for gene expression profiling of clinically resected cancer tissues. Am J Clin Pathol. 2002;117:109-116.