A functional group analysis of gene expression of human donor corneal endothelium showing the top 100 expressed transcripts. EST indicates expressed sequence tag; tRNA, transfer RNA.
Venn diagram representation of UniGene clusters identified by serial analysis of gene expression (SAGE) and microarray. Numbers in parentheses are the total number identified by each method.
Gottsch JD, Seitzman GD, Margulies EH, Bowers AL, Michels AJ, Saha S, Jun AS, Stark WJ, Liu SH. Gene Expression in Donor Corneal Endothelium. Arch Ophthalmol. 2003;121(2):252-258. doi:10.1001/archopht.121.2.252
EDWIN M. STONE, MD, PhD
From the Center for Corneal Genetics, Cornea and External Disease Service, The Wilmer Eye Institute (Drs Gottsch, Seitzman, Jun, Stark, and Liu and Mss Bowers and Michels), and Johns Hopkins Oncology Center Molecular Genetics Laboratory, The Johns Hopkins School of Medicine (Dr Saha), Baltimore, Md; and Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Md (Dr Margulies).
To report gene expression profiles of normal human corneal endothelium with microarray analysis and serial analysis of gene expression (SAGE).
Corneal endothelium was removed from normal human corneas obtained from eye banks. Total RNA was isolated and SAGE analysis was performed. The same RNA source was used to construct a complementary DNA library that was hybridized to microarrays containing 12 558 transcripts.
A total of 9530 SAGE tags were sequenced, representing 4724 unique tags. Microarray analysis identified 542 distinct transcripts. A database of human corneal endothelial gene expression was compiled. Of the SAGE tags, 1720 matched known genes, 478 corresponded to expressed sequence tags, and 2526 had no known match to public databases. The 5 most abundantly expressed SAGE tags were cytochrome c oxidase subunit II, adenosine triphosphate synthase F0 subunit 6, carbonic anhydrase XII, 12S ribosomal RNA, and ferritin, heavy polypeptide 1. Thirty-four percent of the transcripts (n = 1616) were specific to the corneal endothelium when compared with other publicly available SAGE libraries. The 5 most abundant unique tags were keratin 12, angiopoietinlike factor, annexin A8, and 2 tags with no match to the database. Many endothelial pump function enzymes were confirmed, including several plasma membrane Na+/K+ adenosine triphosphatases and a recently reported bicarbonate transporter.
Corneal endothelial gene expression profiles by the current analysis provide an understanding of endothelial metabolism, structure, and function; enable comparisons to diseased endothelium; and provide baseline data that may lead to the discovery of novel endothelial genes.
THE ENDOTHELIUM is essential for corneal clarity. Fluid transport across this cell layer balances stromal hydration by an active electrolyte pump and the maintenance of a semipermeable membrane by cell-cell adhesion complexes.1 Descemet membrane is derived from the endothelium. Assessing endothelial molecular activity has been technically limited to evaluating only a few proteins or cellular transcripts at one time. Recently, the capability to globally assess gene expression in a given tissue has become feasible. Microarray analysis is a relatively rapid technique that has been useful in providing gene expression profiles for a number of tissues.2,3 A previous work described the use of complementary DNA (cDNA) microarray technology to provide a gene expression profile of the human cornea, characterizing some 1200 genes.4 Microarrays, however, are limited by the finite number of oligonucleotide sequences localized on a chip and are not able to identify the expression of novel genes.
Serial analysis of gene expression (SAGE) is another method that provides quantitative and comprehensive gene expression profiles. SAGE depends on the generation of short sequence tags at a specific location within a transcript.5,6 Through a series of standard enzymatic reactions, 10–base pair SAGE tags, which contain sufficient information to uniquely identify a gene, are generated, concatenated, and sequenced. By identifying genes corresponding to each tag and tabulating the frequency of each tag, the number of genes expressed and their expression level can be estimated. Novel genes can be suspected in a given tissue when tags cannot be matched to publicly available sequences. SAGE has the additional advantage of providing quantitative information about gene expression. In the present study, we sought to expand the gene expression profile of the corneal endothelium. In contrast to our previous report of a microarray analysis of the total cornea, herein we report the use of an expanded microarray chip and, for the first time, performance of SAGE analysis of normal donor corneal endothelium.
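The tag-and-tabulate idea described above can be illustrated with a minimal Python sketch. It assumes the conventional SAGE tag position, namely the 10 bases immediately 3′ of the 3′-most NlaIII site (CATG) in a transcript. The text says only that tags come from "a specific location", so that position, the function names, and the toy sequence are illustrative assumptions rather than the authors' software.

```python
from collections import Counter

ANCHOR = "CATG"   # NlaIII recognition site (assumed anchoring position)
TAG_LENGTH = 10   # tag length stated in the text

def extract_tag(transcript):
    """Return the 10-base tag following the 3'-most CATG, or None if absent."""
    seq = transcript.upper()
    pos = seq.rfind(ANCHOR)          # 3'-most anchoring site
    if pos == -1:
        return None
    start = pos + len(ANCHOR)
    tag = seq[start:start + TAG_LENGTH]
    # Discard truncated tags when the site lies too close to the 3' end.
    return tag if len(tag) == TAG_LENGTH else None

def tabulate(tags):
    """Return (tag, count, percent of all tags), most abundant first."""
    counts = Counter(tags)
    total = sum(counts.values())
    return [(tag, n, 100.0 * n / total) for tag, n in counts.most_common()]

# Hypothetical transcript with two CATG sites; only the 3'-most one is used.
example = "AAACATGTTTTTCATGACGTACGTACGG"
print(extract_tag(example))
```

In a table built this way, a tag's percentage of the total is the abundance estimate for its transcript, and tags that match no known sequence are the candidates for novel genes.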
A cDNA library was constructed by standard methods from the endothelium of 15 pairs of intact donor corneas (mean ± SD age, 52 ± 12 years) from the Maryland Eye Bank, Baltimore. The endothelium was stripped from the stroma and stored in liquid nitrogen. Total RNA was isolated from corneal endothelium using monophasic phenol/guanidine isothiocyanate solution (TRIzol Reagent; Invitrogen, Carlsbad, Calif). Double-stranded cDNA was synthesized from 15 µg of messenger RNA according to the manufacturer's protocol (Stratagene, Cedar Creek, Tex). XhoI and EcoRI linker-primer adapters were incorporated into the cDNA to create the restriction sites at the 5′ and 3′ ends of the cDNA. The cDNA was size selected (>1 kb) by gel filtration, ligated into the Uni-ZAP XR vector (Stratagene), and packaged with the use of packaging extract (Gigapack II; Stratagene). The packaged cDNA was titered, and the number of clones contained in the primary cDNA library was 1.0 × 10⁶ plaque-forming units per milliliter.
Standard methods were used to recover phagemids by mass excision protocol (Stratagene). Approximately 1.6 × 10⁶ plasmids were excised. The ratio of clones excised to the number of independent clones in the library was 1.6:1. Excised clones were used to transfect a large-volume bacterial SOLR cell culture (Stratagene), and plasmid preparations were performed by standard methods (QIAGEN, Valencia, Calif). Plasmids were digested by means of NotI restriction endonuclease (Gibco BRL Life Technologies), extracted with phenol-chloroform, and precipitated with ethanol.
Biotin-labeled cRNAs were produced by in vitro transcription (Enzo Diagnostics, Farmingdale, NY), digested with DNase I (Gibco BRL Life Technologies), and purified by means of RNeasy spin columns (QIAGEN). Analysis of biotin-labeled cRNAs by means of the HU95a microarray (Affymetrix, Santa Clara, Calif) was performed in duplicate (Research Genetics, Huntsville, Ala) by hybridizing the same labeled cRNA sample to 2 identical microarrays within a 6-week period.
Three pairs of intact donor corneas (from a 79-year-old woman and 58- and 66-year-old men) were obtained from the Maryland Eye Bank. The corneal endothelium of each donor eye was examined and photographed. All donor corneas were found to have normal endothelia and were handled in an identical manner. The endothelium was stripped from the stroma and immediately stored in liquid nitrogen until use. Normal endothelial SAGE libraries were constructed according to the SAGE protocol5,6 (http://www.sagenet.org/sage_protocol.htm). Total RNA from normal endothelium was isolated by direct lysis using monophasic phenol/guanidine isothiocyanate solution (Invitrogen). Messenger RNA was isolated from total RNA by standard methods (FastTrack 2.0; Invitrogen). Two micrograms of messenger RNA was reverse-transcribed to double-stranded cDNA (Superscript Choice Synthesis cDNA synthesis kit; Invitrogen) with a 5′-biotinylated oligo(dT)18 primer (Integrated DNA Technologies, Coralville, Iowa). Double-stranded cDNA was digested with NlaIII (New England BioLabs, Beverly, Mass); 3′ cDNAs were purified with magnetic beads and split into 2 equal pools, and biotinylated SAGE linkers 1 and 2 (Integrated DNA Technologies) were ligated to pools 1 and 2, respectively. The SAGE tags were released with the tagging enzyme BsmFI (New England BioLabs), and blunt ends were synthesized using Klenow polymerase fragment. The tags from pools 1 and 2 were ligated to each other overnight at 16°C. A 1:200 dilution of the ligation product was amplified with 35 cycles of polymerase chain reaction (PCR). Precipitated PCR products were separated on a 12% polyacrylamide gel, and only the 102-bp band containing ditags was isolated.
The ditags were released from the linkers by digestion with NlaIII and purified by means of streptavidin magnetic beads.7 The products of the digestion were separated on a 12% polyacrylamide gel, and the 24- to 26-bp bands containing ditags were purified and used for self-ligation overnight at 16°C. Concatamers were run on an 8% polyacrylamide gel, and a fraction from 500 bp to 1 kb was isolated and cloned into pZERO vector (Invitrogen) digested with SphI. Ligation mixtures were electroporated into Escherichia coli strain TOP10 F′ (Invitrogen), and colonies were screened for inserts larger than 500 bp by means of colony PCR with M13 forward and reverse primers. The PCR products from selected clones were sequenced by means of an automated sequencer (ABI 3700; Applied Biosystems, Foster City, Calif).
By means of the statistical methods used by Affymetrix, individual probe sets were scored as absent, marginal, or present in the endothelial sample. Probe sets were considered to be represented in the sample if they were marginal or present in both replicate microarray experiments. Probe sets were mapped to UniGene clusters by parsing the description line with a Perl script. Updated probe set descriptions were obtained at http://www.NetAffx.com. Fifteen probe sets represented in the endothelial sample had no UniGene cluster match and could therefore not be included in the analysis with the SAGE data.
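The replicate rule described above can be sketched in a few lines of Python. The sketch assumes one reading of the rule: a probe set counts as represented when its call is "present" or "marginal" on each of the 2 arrays. The probe set identifiers and function names are hypothetical, and the detection calls themselves would come from the Affymetrix software, which is not reproduced here.

```python
ACCEPTED_CALLS = {"present", "marginal"}

def is_represented(call_1, call_2):
    """True when both replicate calls are 'present' or 'marginal'."""
    return call_1 in ACCEPTED_CALLS and call_2 in ACCEPTED_CALLS

def represented_probe_sets(replicate_1, replicate_2):
    """Return sorted probe set IDs that pass the rule in both replicates.

    Each argument maps a probe set ID to its detection call.
    """
    shared = replicate_1.keys() & replicate_2.keys()
    return sorted(
        probe for probe in shared
        if is_represented(replicate_1[probe], replicate_2[probe])
    )

# Hypothetical calls for four probe sets on two arrays.
array_1 = {"ps1": "present", "ps2": "marginal", "ps3": "absent", "ps4": "present"}
array_2 = {"ps1": "present", "ps2": "present", "ps3": "present", "ps4": "absent"}
print(represented_probe_sets(array_1, array_2))
```

Requiring agreement between replicates trades sensitivity for specificity: a transcript called absent on either array is excluded even if it is called present on the other.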
The ehm tag mapping method8 (http://genome.nhgri.nih.gov/ehmTagMapping) was used to match SAGE tags with specific UniGene clusters. This method is implemented through the use of several Perl scripts designed to extract tag-to-UniGene cluster information from the UniGene flatfiles available at the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/UniGene). Briefly, the ehm tag mapping method extracts a SAGE tag from each sequence in a UniGene cluster only if the orientation and 3′ end of the sequence can be confirmed by identifying poly(A) signals and/or tails. To minimize the extraction of SAGE tags from entries with potential sequencing errors, SAGE tags not representing at least 20% of all tags extracted from a given UniGene cluster are removed from the final ehm tag mapping flatfile. For the purpose of comparing SAGE data with Affymetrix data, low-complexity (NAAAAAAAAA) and unreliable tag-to-UniGene cluster matches (tag matches only observed once) were removed from the final ehm tag mapping. Occasionally, one tag will reliably match more than one UniGene cluster. Using the UniGene and LocusLink databases at the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov), functional annotations were assigned to SAGE tags with the use of the associated gene ontology category (http://www.geneontology.org/) as a guide.
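The filtering rules in this paragraph can be illustrated with a short Python sketch. It is not the ehm tag mapping Perl code. It assumes the tags for each cluster have already been extracted, reads "NAAAAAAAAA" as any base followed by 9 adenines, and treats a match as unreliable when the tag was observed only once in a cluster. All names and the toy data are hypothetical.

```python
from collections import Counter

def is_low_complexity(tag):
    """True for tags of the form NAAAAAAAAA (any base, then nine A's)."""
    return len(tag) == 10 and tag[1:] == "A" * 9

def reliable_tag_map(cluster_tags, min_fraction=0.20, min_observations=2):
    """Filter tag-to-cluster matches.

    cluster_tags maps a cluster ID to the list of tags extracted from
    its member sequences (one tag per sequence).
    """
    mapping = {}
    for cluster, tags in cluster_tags.items():
        counts = Counter(tags)
        total = sum(counts.values())
        kept = {
            tag for tag, n in counts.items()
            if n / total >= min_fraction      # at least 20% of the cluster's tags
            and n >= min_observations         # observed more than once
            and not is_low_complexity(tag)    # drop poly(A)-like tags
        }
        if kept:
            mapping[cluster] = kept
    return mapping

# Hypothetical cluster: one dominant tag, one rare tag, one poly(A)-like tag.
toy = {"Hs.0001": ["ACGTACGTAC"] * 8 + ["TTTTTTTTTT"] + ["GAAAAAAAAA"] * 3}
print(reliable_tag_map(toy))
```

The mapping keeps a set of tags per cluster and does not force a one-to-one relation, so a tag that reliably matches more than one cluster, as the text notes can happen, simply appears under each of them.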
A total of 9530 SAGE tags were sequenced from the normal corneal endothelium (Table 1), representing 4724 unique transcripts. Microarray analysis identified the expression of 542 distinct transcripts among 12 558 total transcripts represented on the chip. A database of human corneal endothelial gene expression was compiled. Of the SAGE tags, 1720 (36%) matched known genes, 478 (10%) matched expressed sequence tags, and 2526 (54%) had no known match to publicly available databases. Of the 4724 unique tags, 3843 (81%) were represented by single copies, 774 (16%) by 2 to 10 copies, 88 (2%) by 11 to 46 copies, 14 (0.3%) by 47 to 79 copies, and 5 (0.1%) by greater than 79 copies (Table 2).
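The copy-number breakdown reported above amounts to binning each unique tag by its count. A minimal Python sketch of that binning follows, using the bin boundaries from the text. The function name and the toy counts are hypothetical, and the sketch is not the authors' analysis code.

```python
from collections import Counter

# (label, lowest copy number, highest copy number or None for open-ended)
BINS = [
    ("1", 1, 1),
    ("2-10", 2, 10),
    ("11-46", 11, 46),
    ("47-79", 47, 79),
    (">79", 80, None),
]

def bin_unique_tags(tag_counts):
    """Count how many unique tags fall into each copy-number bin."""
    summary = Counter()
    for copies in tag_counts.values():
        for label, low, high in BINS:
            if copies >= low and (high is None or copies <= high):
                summary[label] += 1
                break
    return summary

# Hypothetical counts for five unique tags.
toy_counts = {"tagA": 1, "tagB": 1, "tagC": 7, "tagD": 50, "tagE": 120}
print(bin_unique_tags(toy_counts))

# The bin sizes reported in the text add up to the number of unique tags.
reported = [3843, 774, 88, 14, 5]
print(sum(reported))
```

The large single-copy bin is typical of a library of this depth: most unique tags are seen once, so their abundance estimates carry little precision.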
The 100 most abundant SAGE tags of human normal corneal endothelium are represented in Table 3. The percentage abundance for each of the genes ranged from 0.13% to 1.96%. The top 100 expressed transcripts were subdivided by cellular function. Ribosomal protein/RNA binding, transfer RNA, DNA binding, and gene regulation represented 28% of expressed tags; cellular metabolism and mitochondrial transcripts, about 18%; cell communication, cell growth/regulation, and cytoskeletal, 18%; expressed sequence tags, 8%; and genes with unknown functions, 7% (Figure 1). No-match tags composed 21% of the total expressed tags of the 100 most abundantly expressed genes.
To determine which genes were specific to the cornea, we performed a virtual subtraction between our SAGE and microarray data and 8 other publicly available SAGE libraries (see the "SAGE Tag Matching" subsection in the "Methods" section). Thirty-four percent of the transcripts (n = 1616) were identified as being specific to the corneal endothelium. The 5 most abundant cornea-specific tags were keratin 12, angiopoietinlike factor, annexin A8, and 2 tags with no match to the database.
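A virtual subtraction of this kind reduces to a set difference: tags seen in the endothelial library but in none of the comparison libraries. The Python sketch below illustrates the idea with toy data. The library contents and function name are hypothetical, and the authors' actual matching procedure may have applied further criteria.

```python
def virtual_subtraction(library_tags, other_libraries):
    """Return tags in library_tags that occur in none of other_libraries."""
    seen_elsewhere = set().union(*other_libraries)
    return set(library_tags) - seen_elsewhere

# Hypothetical tag sets: one library of interest and two comparison libraries.
endothelium = {"ACGTACGTAC", "GGGGCCCCAA", "TTTTGGGGCC", "CACACACACA"}
library_a = {"ACGTACGTAC", "AAAACCCCGG"}
library_b = {"TTTTGGGGCC"}

specific = virtual_subtraction(endothelium, [library_a, library_b])
print(sorted(specific))
print(f"{100 * len(specific) / len(endothelium):.0f}% of tags are library specific")
```

Specificity defined this way is relative to the libraries available for comparison, so a tag counted as specific may later be found in another tissue as more libraries are sequenced.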
Genes identified by SAGE that represent previously recognized endothelial functions are listed in Table 4. Pump function, cytoskeletal, and cell adhesion proteins are listed under fluid transport. Collagens, glycosaminoglycans, and proteoglycans are listed under basement membrane proteins. The complete SAGE and microarray endothelial databases can be accessed at http://www.CorneaNet.net.
Microarray detected 507 genes (Figure 2). Of these, 145 were also identified by SAGE. Three hundred sixty-two were detected by microarray only and 2164 transcripts were detected by SAGE only. Of the 100 most abundantly expressed transcripts detected by SAGE, 28 were also detected by microarray (Table 3).
In this study, SAGE and microarrays were used as complementary methods for assaying corneal endothelial gene expression. Since SAGE tags and Affymetrix probe sets could be mapped to corresponding UniGene clusters, we were able to compare the genes identified by each method. Because of the increased sensitivity of the SAGE method, greater than 4 times the number of expressed transcripts were detected when compared with microarray (2209 vs 507 UniGene clusters). There were 2164 UniGene clusters detected by SAGE that were not detected by microarray. One explanation for this bias is that the detection of a transcript by microarray relies on the statistical computation of data obtained from multiple, complex, and unpredictable in vitro hybridization assays, making it likely that numerous transcripts went undetected. Conversely, there were 362 UniGene clusters detected by microarray that were not detected by SAGE. Since the sensitivity of the SAGE method can be increased simply by sequencing additional SAGE tags, it is likely that the number of UniGene clusters detected only by microarray will decrease as the number of total SAGE tags is increased.
Microarray was able to detect 25% of the 100 most abundantly expressed transcripts identified by SAGE. Some of these frequently expressed SAGE tags had no matches and may represent novel genes that would not have been available for detection on a microarray chip. Many of the 100 most abundantly expressed genes (18%) in the endothelium are those concerned with energy production including mitochondrial gene expression and glycolysis.
Corneal clarity is dependent on endothelial regulation of stromal hydration. The endothelium is a semipermeable membrane that regulates hydration by providing a barrier to the leakage of fluid into the stroma and by active ion transport across endothelial membranes.1,9-16 The fluid transport across the endothelium is believed to be mediated by multiple distinct pumps, one involving Na+/K+ adenosine triphosphatase and another, carbonic anhydrase. A sodium ion concentration gradient is established by basolaterally located Na+/K+ adenosine triphosphatase.1,9-13 Carbon dioxide is thought to diffuse across the endothelial membrane and with water is converted by carbonic anhydrase to bicarbonate and H+.16,17 Protons may leave the cell via a Na+/H+ exchanger, and bicarbonate efflux has been proposed to be driven by a Cl−/HCO3− exchanger.11,12 Transcripts for several of these pump proteins were detected by SAGE, including different Na+/K+ adenosine triphosphatases and carbonic anhydrases (Table 4). A recently discovered bicarbonate transporter that has been found in kidney, salivary glands, and other tissues16 was identified by SAGE. A Na+/H+ lysosomal proton transporter was identified in the corneal endothelium, but a plasma membrane-bound proton exchanger was not identified with this SAGE library of approximately 10 000 tags. A SAGE library sequenced to a greater depth may identify other pump function transcripts.
Another mechanism regulating fluid transport in the cornea is the endothelial barrier to fluid and electrolyte transport into the stroma.17 This endothelial barrier is a function of cytoskeletal proteins and intercellular adhesion structures. Protein components of this barrier were identified by SAGE and include tight and gap junctions (actins and connexin), cytoplasmic intermediate filaments (keratins and vimentin), and Ca2+-dependent cell-cell adhesion proteins (cadherin and catenin) (Table 4).
A notable function of the endothelium is its contribution to a basal lamina and to Descemet membrane. Transcripts that may code for proteins contributing to these structures include those for fibrous proteins such as collagens and elastin (Table 4). Adhesive protein transcripts detected were those for laminin, fibronectin, and fibrillin. Glycosaminoglycans noted were hyaluronan, chondroitin sulfate, heparan, and keratan. Proteoglycans identified were lumican, decorin, and betaglycan (transforming growth factor β receptor III). Integrins, linker proteins whose external domains bind to the extracellular matrix, were also detected by SAGE.
This study is the first, to our knowledge, to apply SAGE and microarray techniques to comprehensively assess gene expression in the human corneal endothelium. The results expand our understanding of endothelial metabolism, structure, and function. The full SAGE dataset for normal corneal endothelium can be accessed at the National Center for Biotechnology Information Gene Expression Omnibus repository under accession number GSM1652. We have also added these gene expression profiles to an online database, http://www.CorneaNet.net, so that theoretic comparisons with other SAGE analyses in corneal diseases such as corneal dystrophies and graft rejection can be performed, possibly suggesting further areas of laboratory study.
Corresponding author: John D. Gottsch, MD, The Wilmer Eye Institute, The Johns Hopkins Hospital, 600 N Wolfe St, Baltimore, MD 21287 (e-mail: email@example.com).
Submitted for publication April 4, 2002; final revision received August 30, 2002; accepted October 23, 2002.
This study was supported in part by a grant from Research to Prevent Blindness Inc, New York, NY (Dr Gottsch) and by the Helen and Raymond Kwok Research Fund (Dr Stark).