Cluster plot of gene expression (y-axis) in peripheral blood for 15 patients with ischemic stroke (3 samples for each patient) compared with 8 control subjects (2 samples for each control) (x-axis). Patients with stroke were entered in the CLEAR Trial. For them, the first whole-blood sample was drawn into PAXgene tubes (a new type of vacutainer tube; Qiagen, Studio City, Calif) from 1 to 3 hours after the onset of ischemic stroke (labeled 3). A second blood sample was drawn at 5 hours (labeled 5), and a third blood sample was drawn at 24 hours (labeled 24). Control patients (right side) had 2 blood samples drawn 1 day apart, except for control patient 6 (only 1 sample drawn). Patient numbers and the times the blood samples were drawn are shown on the x-axis. More than 1300 genes were regulated in the blood of patients with stroke compared with controls. Prediction analysis of microarrays software determined that the probe sets shown, which represent 18 genes, best predicted stroke compared with control samples. The probe sets for the Affymetrix arrays (Affymetrix Inc, Santa Clara, Calif) are listed on the y-axis. A 5-fold increase in gene expression is shown in bright red; a 5-fold decrease, in deep green. A indicates patients taking aspirin before the ischemic stroke; B, African American patients. The red squares above the graph indicate samples that were not significantly different from controls.
Gene expression in individual cell types (x-axis) for the clinical disorders ischemic stroke, Tourette syndrome (TS), and migraine (y-axis). The genes most regulated in ischemic stroke, TS, and migraine were identified from our previous studies of gene regulation in blood. These genes were then compared with our library of genes that are expressed in individual cell types in blood. The plot shows that neutrophils and monocytes account for the genes most regulated in ischemic stroke. Genes expressed in natural killer (NK) cells and cytotoxic CD8 T cells are highly expressed in patients with TS. Patients with migraine have high expression of genes found in platelets. Increased expression is in bright red. PMN indicates polymorphonuclear leukocyte.
Gene expression in whole blood (y-axis) of children with Duchenne muscular dystrophy (DMD) and spinal muscular atrophy type II (SMA II). Forty children with DMD and 8 children with SMA II had blood drawn into PAXgene tubes (a new type of vacutainer tube; Qiagen, Studio City, Calif) and gene expression was assessed on Affymetrix (Affymetrix Inc, Santa Clara, Calif) human microarrays. Comparison of gene expression in DMD vs SMA II showed 62 regulated genes as determined by a false discovery rate with P<.05 and a greater than 1.5-fold change. These 62 genes were then subjected to the cluster analysis shown in this figure for all of the children with DMD and SMA II. There are roughly 3 groups as manifested by differences of gene expression. Two of the groups consist of children with DMD only, and these 2 DMD groups have different patterns of gene expression in peripheral blood. The third group is a mixture of children with SMA II (red color in bar below graph) and DMD (purple color) who have similar patterns of gene expression in their peripheral blood. In the graphs, the color key extremes are represented by red (indicating 5-fold up-regulation) and blue (5-fold down-regulation).
Sharp FR, Xu H, Lit L, Walker W, Apperson M, Gilbert DL, Glauser TA, Wong B, Hershey A, Liu D, Pinter J, Zhan X, Liu X, Ran R. The Future of Genomic Profiling of Neurological Diseases Using Blood. Arch Neurol. 2006;63(11):1529-1536. doi:10.1001/archneur.63.11.1529
Sequencing of the human genome and new microarray technology make it possible to assess all genes on a single chip or array. Recent studies show different patterns of gene expression related to different tissues and diseases, and these patterns of gene expression are beginning to be used for diagnosis and treatment decisions in various types of lymphoid and solid malignancies. Because of obvious problems obtaining brain tissue, progress in genomics of neurological diseases has been slow. To address this, we demonstrated that different types of acute injury in rodent brain produced different patterns of gene expression in peripheral blood. These animal studies have now been extended to human studies. Two groups have shown that there are specific genomic profiles in the blood of patients after ischemic stroke that are highly sensitive and specific for predicting stroke. Other recent studies demonstrate specific genomic profiles in the blood of patients with Down syndrome, neurofibromatosis, tuberous sclerosis, Huntington disease, multiple sclerosis, Tourette syndrome, and others. In addition, data demonstrate specific profiles of gene expression in the blood related to different drugs, toxins, and infections. Although all of these studies are still preliminary basic scientific endeavors, they suggest that this approach will have clinical applications to neurological diseases in humans.
The invention of microarrays made it possible to assess all of the genes expressed in an organism or the tissue from a single sample of RNA. This technology was initially used to assess genome-wide expression in yeast and other model organisms.1 However, it was soon applied to humans with leukemia, lymphoma, and various types of cancers because these tissues were readily available and RNA could be readily extracted from them.2 It was soon realized that there were specific patterns of gene expression—the so-called profiles or signatures—that characterized different types of cancers and could also be used to predict the response of a malignancy to specific types of therapy.2
What is a gene profile? The clinician deals with profiles every day. Increases of troponin and creatine kinase levels in the blood are usually associated with a heart attack. In this case, the profile includes just 2 proteins. In the case of whole-genome studies, the diseased tissue is compared with healthy tissue, and the genes that are different are detected. The regulated genes in a profile can number as few as 10 or as many as thousands. Generally, if one examines the expression of these genes in the same disease and same tissue again and again, the same or similar genes are regulated in a given disease. In addition, there are different sets of genes for different diseases. These sets of genes, which are usually specific for a disease, a certain tissue, a time of development, a drug, or any other factor, are said to produce a profile or a signature for the factor. Hence, rather than having a creatine kinase profile for heart attack, there could be a panel of 15 to 25 genes for stroke. Rather than having just the cholesterol and low- and high-density lipoprotein profile for heart attack or stroke risk, there could be a panel of 50 to 100 genes that predict risk.
Because obtaining brain tissue from humans is difficult, we realized that sampling brain tissue would not be practical for most neurological diseases. Therefore, our laboratory3 performed a study aimed at answering the question of whether gene expression would change in peripheral white blood cells after acute brain injuries. We showed that there are changes of gene expression in peripheral blood monocytes at 24 hours after ischemic stroke, intracerebral hemorrhage, kainate-induced status epilepticus, hypoxia, and insulin-induced hypoglycemia in adult rats. A large number of genes regulated in peripheral blood were common to all of the types of injury, and these were proposed to be related to stress. More important, sets of genes were specific to each type of brain injury, but no single gene in the peripheral blood predicted the type of injury in the brain. Only when several hundred genes in peripheral blood were examined was it possible to distinguish ischemic stroke from intracerebral hemorrhage and hypoglycemia-induced brain injury.3 These studies provided a first proof of principle that different diseases in the brain (or any organ) are associated with specific profiles in the peripheral blood. The same will be true for all other organs and probably most other diseases.
Rather than reviewing all human genomic studies of all tissues and all diseases to date, this short review will focus on gene expression in the blood of humans with neurological diseases. Few studies of this kind have been published, and most of the published studies, including those from our own group, have flaws.
The studies4-7 began with whole blood from which RNA was isolated within 10 to 30 minutes of the blood draw. Other studies have used a similar approach.8,9 However, 2 new types of vacutainer tubes (PAXgene [Qiagen, Studio City, Calif] and TEMPUS [Applied Biosystems, Foster City, Calif]) permit immediate stabilization of the RNA as soon as the blood is drawn into the tubes. Thus, the practicality of performing short-term studies has markedly improved10 and clinical studies in humans will soon be fairly easy and routine.
Whitney et al,8 of Stanford University, Stanford, Calif, were the first to use standard blood and RNA isolation techniques in 75 healthy volunteers to show that there were specific profiles in human blood that correlated with age and sex. Our group5 also examined the effect of sex on gene expression as one of the first and simplest assays of the technology and methods being used. We showed that blood samples from men and women had different patterns of gene expression, and that the expression of 2 genes on the Y chromosome was sufficient to differentiate donor sex in blood samples from more than 100 patients. In addition, we demonstrated that age had a profound effect on gene expression in peripheral blood, with children showing higher expression of genes associated with immunoglobulins.5
After the study in animals,3 Moore and colleagues9 at the National Institutes of Health, Bethesda, Md, performed a study of humans with acute ischemic stroke. In that study, peripheral blood monocytes (the same cell types as those isolated in the previous rat studies) were isolated from the blood of patients at various times after ischemic stroke. Moore and colleagues found that a panel of 22 genes using prediction analysis for microarrays classified stroke in the validation cohort with a sensitivity of 78% and a specificity of 80%. They also showed that reverse-transcriptase polymerase chain reaction analysis confirmed up-regulation in 9 of 9 genes in the stroke cohort.9
Our group10 has just published a similar study with some differences. Blood samples were collected from patients at 2 to 3, 5, and 24 hours after an ischemic stroke as part of the CLEAR Trial at the University of Cincinnati, Cincinnati, Ohio (principal investigator, Joe Broderick, MD). Blood was collected into PAXgene tubes to stabilize the RNA immediately. The tubes were left at room temperature for 2 hours and then stored frozen for a year until analyzed. The data showed that more than 1000 genes (of the approximately 33 000 possible) are regulated in the blood after ischemic stroke in humans.10 Of these genes, a panel of 21 probe sets representing 18 genes can correctly predict all control samples and can correctly predict 10 of 15 patient samples at 2 to 3 hours, 14 of 15 patient samples at 5 hours, and 15 of 15 patient samples at 24 hours after acute ischemic stroke.10 Some genes are regulated as a function of the National Institutes of Health Stroke Scale, previous aspirin use, race (African American vs white), recanalization, and other factors.10 Not reported in that study was the finding that the profile for patients with cardioembolic stroke appeared to be different from that for the patients with atherosclerotic stroke (H.X. and F.R.S., unpublished data, August 2006). Although this finding is still being analyzed, if confirmed, it provides a novel and sophisticated approach to delineating the cause and/or pathogenesis of stroke.
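The classification counts quoted above translate directly into a sensitivity at each time point. The following minimal sketch only restates that arithmetic; the variable and function names are illustrative assumptions and are not taken from the study.

```python
# Correctly classified stroke samples at each time point, as quoted in the text.
# The dictionary keys and all names here are illustrative, not from the study.
correct_by_time = {"2-3 h": 10, "5 h": 14, "24 h": 15}
n_patients = 15  # one sample per patient at each time point


def sensitivity(true_positives, total_positives):
    """Fraction of true stroke samples that were classified as stroke."""
    return true_positives / total_positives


sens = {t: sensitivity(c, n_patients) for t, c in correct_by_time.items()}

for time_point, value in sens.items():
    print(f"{time_point}: {value:.1%}")
```

Because every control sample was classified correctly, the specificity of the panel in this data set is 100% at all time points; the sensitivity rises from about two thirds at 2 to 3 hours to 100% at 24 hours.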
Figure 1 shows the data from the patients with ischemic stroke and the control subjects in that study.10 The x-axis shows patient samples (at 3, 5, and 24 hours after stroke) and 2 samples each for controls. The y-axis shows the probe sets for the 18 genes that best predicted stroke in the study.10 For many of the stroke patients, gene expression changed over time, ie, from 3 to 5 to 24 hours.
Although there were many overlapping genes in the study by Moore et al9 and in our stroke study,10 there were several differences as well. In part, this could have been because of differences in the microarray used and differences in the time after stroke in the 2 studies. However, we thought it was most likely because of differences in cell types: Moore et al9 studied peripheral blood monocytes and we examined whole blood.10 To determine which cell types accounted for the changes of gene expression in our study, we developed a library of gene expression of different cell types in the peripheral blood of healthy subjects.11 With this library, we showed that the genes regulated in the whole blood of patients with acute stroke that were most predictive of stroke were almost all regulated in polymorphonuclear cells (neutrophils), but some genes were regulated in monocytes.10,11
Figure 2 shows a plot for the cell types vs gene expression for 3 diseases, including stroke. In each case, the genes most regulated in ischemic stroke,10 Tourette syndrome (TS),6 and migraine12 are plotted against the cell types in which those genes are expressed. The cell types for each disorder differ, with neutrophils and monocytes being the cells most represented in acute stroke.
It is likely that a blood test for stroke will be developed and that the result will indicate whether the stroke was cardioembolic, atherosclerotic, or lacunar or had some other cause. It is likely that reverse-transcriptase polymerase chain reaction analysis or perhaps a multiprotein enzyme-linked immunosorbent assay will be performed rather than a microarray run, although techniques for rapid microarray processing have not even been considered until now. The profiles in blood could also be used to determine which patients with stroke or transient ischemic attack are most likely to have another stroke; the profiles might be used to predict overall risk even before a stroke or to tailor acute therapies. For example, a blood genomic profile might indicate that higher or lower doses of tissue plasminogen activator could be used or that the time window could be extended for longer periods. There are as many potential uses as there are questions.
Many studies, most of which predate the stroke studies, have shown genome-wide changes in the blood of patients with multiple sclerosis.13-22 The results are summarized here without detail. A signature or profile for the blood of patients with multiple sclerosis clearly differentiates them from controls, and a signature correlates with disease activity.13,16 In addition, the drugs used to treat multiple sclerosis produce reproducible genomic profiles in the blood.21 More important, gene profiles in blood correlate with or predict response to the drugs used to treat multiple sclerosis.14,20 All of these studies are of great interest because there appear to be unique profiles of gene expression in the blood of most patients with autoimmune diseases, including rheumatoid arthritis, systemic sclerosis, systemic lupus erythematosus, and probably all others.23
Gene profiling can also be used to study diseases with complex or multiple etiologies like TS, autism, and many others. We recently reported the findings of 16 patients with familial TS compared with 16 age-, sex-, and race-matched controls.6 We found a number of genes regulated in the blood of patients with TS. On reanalysis of these genes, we found a subgroup of patients with TS who express genes that are generally found in natural killer cells and cytotoxic CD8 T cells.6 Figure 2 shows that the previously identified TS genes overexpressed in the blood of a subset of patients with TS are in natural killer and CD8 cells. In addition, several patients overexpress genes found in B cells (L.L., D.L.G., and F.R.S., unpublished data, August 2006). The importance of these data is that they show that blood genomic profiling will be useful for subgrouping patients with complex disorders like TS that likely have important genetic and environmental factors, including infections, that affect whether individuals manifest the disease.
We were the first, to our knowledge, to report that a genetic disease can produce a specific profile of gene expression in the peripheral blood.5,24 We examined patients with neurofibromatosis 1 and tuberous sclerosis type 2 because both diseases are easy to diagnose, neither has a routine genetic test, and both have genes expressed in peripheral blood. We also examined patients with Down syndrome to determine whether a trisomy would result in a reproducible genomic profile in peripheral blood.24 This study showed that there were specific profiles for the autosomal dominant diseases neurofibromatosis 1 and tuberous sclerosis type 2. This was not a surprise. However, we also found that patients with Down syndrome have a specific profile. This was an important finding because it suggested that any polygenic disease might well have a specific signature or genomic profile. This study did not determine whether diseases in which the genetic abnormality was confined to the brain would have a genomic profile in the peripheral blood.
Our study, however, did point out one additional important finding: different phenotypes of a given genetic disease appeared to have different blood genomic profiles. For example, individuals with Down syndrome who had congenital heart disease had a blood genomic profile different from those without heart disease. Similarly, individuals with tuberous sclerosis type 2 who also had autism had different blood genomic profiles compared with those without the autism. Thus, although the causative genes were presumably the same in these individuals, the other interacting genes in the genome resulted in different blood genomic profiles in individuals with different phenotypes of the same genetic abnormality. This points to the power of genomics to sort out which interacting genes might result in these different phenotypes.
A similar and much more comprehensive study was just published for a large number of patients with Huntington disease (HD).25 Blood genomic profiles were able to distinguish controls, patients with presymptomatic HD, and patients with symptomatic HD.25 Histone deacetylase inhibitor sodium phenylbutyrate treatment in patients with HD decreased their elevated messenger RNA levels. Finally, the genes up-regulated in peripheral blood were also up-regulated in the postmortem HD caudate, suggesting shared signaling mechanisms in the blood and brain at least in this disease.25 This is an impressive study because it used several different array platforms, confirmed gene expression, and correlated the gene changes with a number of important clinical variables.
Any drug or toxin may produce a characteristic change of gene expression in blood. In a study of children with epilepsy,4 we showed that there were characteristic changes of gene expression in the blood of children treated with carbamazepine or valproic acid. In addition, children who responded to valproic acid, ie, those who had no generalized seizures, had a different blood genomic profile than children who continued to have seizures.4
Drugs used to treat multiple sclerosis produce characteristic gene profiles in blood. The antidepressant drug venlafaxine hydrochloride changes gene expression in human blood that can be detected using microarrays and confirmed using reverse-transcriptase polymerase chain reaction analysis.26 Glucocorticoid therapy changes gene expression in blood,27 and a specific profile predicts glucocorticoid sensitivity in patients with asthma.28 Exposure to morphine,29 immunosuppressive drugs,30 benzene,31 and tobacco smoke32,33 is reported to produce characteristic profiles of gene expression in human blood. Patients with subacute sclerosing panencephalitis have gene expression profiles that are detectable in peripheral blood.34
We have reported a set of unique expression profiles in blood of children with migraine.12 A group of platelet-related genes were up-regulated in children with acute and chronic migraine. A specific group of mitochondrial-related genes were up-regulated in children with chronic migraine. The changes in blood of patients with migraine are not surprising because this is a complex genetic disorder.
Genomic profiling has the potential for providing biological markers for many of the still poorly defined and debated clinical entities. For example, Segman et al35 reported that gene expression profiles in peripheral blood identify an emergent posttraumatic stress disorder among trauma survivors. Gene expression changes may also be evident in the blood of patients with chronic fatigue syndrome.36
Almost any variable could and probably would affect gene expression in blood, including patient age, race, sex, diet, genetics, time to last meal, time of day, any medications used, and the like.5,8,37 For example, prolonged exercise has a profound effect on gene expression in human blood.38 Thus, any study must carefully match disease and control patients or include sufficient numbers so that these factors do not confound the results.
One problem with some previous studies is that the RNA was isolated at various times after the blood was drawn. This has been solved recently by the availability of commercial vacutainer tubes that stabilize RNA immediately when the blood is drawn into them. However, different commercial tubes (PAXgene and TEMPUS) have different stabilizing reagents and are likely to produce different results. Thus, the same tubes and conditions must be used throughout a given study, and the same tubes must be used to compare between studies.
So much individual variation exists in blood genomic profiles that some might think these sorts of studies are virtually impossible. However, with great care to detail, reliable results can be obtained. A major problem with the field—not only for blood but also for genomic studies of all tissues—is that samples run at different times have a batch effect. That is, different batches of samples, even those run at the same facility with the same arrays, show systematic differences in gene expression. Batch effect is even greater when the samples are run at different facilities. One way to improve results is to run a concurrent sample of RNA from human blood and to run this batch control for every set of samples in a given study. This should be done even when all of the control and experimental samples are run as an entire batch over several weeks or months. In addition, it is best to interleave control and experimental samples, particularly if they are paired in the experimental design. A batch normalization can then be performed using current Affymetrix software (Affymetrix Inc, Santa Clara, Calif). In addition, we have developed a batch normalization method that is coarse but effective (Brenda Wong, MD, W.W., D.L.G., and F.R.S., unpublished data, August 2006).
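The idea of running the same reference RNA with every batch can be illustrated with a simple per-gene offset correction. The sketch below is an assumption-laden illustration only: it is neither the Affymetrix software procedure nor the unpublished method mentioned above, and the data layout, function name, and toy values are hypothetical. It assumes log2 expression values and that the reference sample should read the same in every batch.

```python
from statistics import mean


def normalize_by_reference(batches):
    """Shift each batch so its reference sample matches the cross-batch mean.

    batches: dict of batch_id -> {"reference": [log2 value per gene],
                                  "samples": {sample_id: [log2 value per gene]}}
    Returns a dict of sample_id -> corrected log2 values.
    This layout is hypothetical and chosen only for illustration.
    """
    n_genes = len(next(iter(batches.values()))["reference"])
    # Per-gene mean of the reference sample across all batches.
    grand_ref = [
        mean(b["reference"][g] for b in batches.values()) for g in range(n_genes)
    ]
    corrected = {}
    for b in batches.values():
        # Per-gene batch offset: how far this batch's reference run deviates.
        offset = [b["reference"][g] - grand_ref[g] for g in range(n_genes)]
        for sample_id, values in b["samples"].items():
            corrected[sample_id] = [v - o for v, o in zip(values, offset)]
    return corrected


# Toy data: batch2 reads 1 log2 unit higher than batch1 for the first gene.
batches = {
    "batch1": {
        "reference": [8.0, 5.0],
        "samples": {"stroke_01": [9.0, 5.5], "control_01": [8.0, 5.0]},
    },
    "batch2": {
        "reference": [9.0, 5.0],
        "samples": {"stroke_02": [10.0, 5.5], "control_02": [9.0, 5.0]},
    },
}
corrected = normalize_by_reference(batches)
```

In this toy case the two stroke samples, which differed only by the batch shift, end up with identical corrected values. A per-gene additive offset is the coarsest possible correction; it does not address batch differences in variance or nonlinear effects.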
It is also crucial to use the same arrays for a given study and, if possible, even the same lot of arrays. In addition, one should process all of the samples together at the end of a study. The 2 current major array platforms are Affymetrix and Agilent (Agilent Technologies Inc, Palo Alto, Calif), with most reliable studies performed on the former. Price is a current drawback, but as competition continues, prices will steadily fall and arrays will become affordable for most investigators.
The greatest problems we have faced, and continue to face, are the small changes of gene expression in the blood and the relative inability to compare one study with another because of batch effect differences. Isolation of individual cell types may provide more detectable changes of gene expression and may also provide more consistent results for blood genomic studies in the future.11 Methods for rapidly and reliably isolating specific cell types, however, are still inadequate. The methods are fraught with technical problems11 and interpretation problems.
The inabilities to compare across different batches and across studies are still major obstacles to this technology being used routinely in clinical work. The current methods will allow for scientific investigation but are not yet robust enough for routine clinical use. An analytical breakthrough is needed so that different studies and different arrays can be compared directly.
As the numbers of studies using blood increase, it is important to emphasize that various internal validations must be performed, even within a given study. For example, in the studies of genetic diseases and of stroke, it was important to not only describe the genes regulated for a given disease but also use the prediction analysis of microarrays software to determine the minimum number of genes that predicted the disease samples compared with the control samples.9,10,25 This approach simply shows that the genes differentially expressed between the groups can be used to predict the classes of the same samples.
An additional validation is to take the results of the gene expression changes in one cohort of samples and use those genes to predict the disease or control samples in a second cohort. This validation was performed by Moore et al9 for their stroke study and by Borovecki et al25 for their HD study. These validation studies require a minimum of 40 to 60 subjects to ensure a minimum of 10 patients with the disease and 10 controls in the first comparison to derive the genes regulated between disease and control samples. These genes would then be used to predict the class of the 10 disease and 10 control samples in the second cohort. Hence, any future human study probably should not have fewer than 10 subjects per group or fewer than a total of 40 to 60 subjects, even if there are only 2 comparisons between patients with disease and controls matched for age, sex, and race.
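The two-cohort validation described above can be sketched with a plain nearest-centroid classifier. This is a deliberate simplification: the prediction analysis of microarrays software uses nearest shrunken centroids, and no shrinkage or gene selection is performed here. All names and the toy expression values are hypothetical.

```python
from statistics import mean


def centroid(vectors):
    """Per-gene mean expression across a list of samples."""
    return [mean(column) for column in zip(*vectors)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit(training_cohort):
    """training_cohort: dict of class label -> list of expression vectors."""
    return {label: centroid(vs) for label, vs in training_cohort.items()}


def predict(centroids, vector):
    """Assign the class whose centroid is closest to the sample."""
    return min(centroids, key=lambda label: squared_distance(centroids[label], vector))


# Toy first cohort: expression of 2 selected genes (log ratios, made up).
cohort1 = {
    "disease": [[2.0, -1.0], [1.8, -1.2], [2.2, -0.8]],
    "control": [[0.1, 0.0], [-0.1, 0.2], [0.0, -0.2]],
}
# Toy second cohort, never seen during training: (true label, expression).
cohort2 = [
    ("disease", [1.9, -0.9]),
    ("control", [0.2, 0.1]),
    ("disease", [1.5, -1.1]),
    ("control", [-0.2, 0.0]),
]

centroids = fit(cohort1)
predictions = [predict(centroids, vector) for _, vector in cohort2]
accuracy = mean(
    1.0 if predicted == truth else 0.0
    for predicted, (truth, _) in zip(predictions, cohort2)
)
```

The essential point is that the genes and centroids are derived from the first cohort only, so the accuracy measured on the second cohort is not inflated by reuse of the same samples.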
Gene expression changes in the blood of patients with stroke are sufficiently robust that it seems likely they can form the basis for the development of a stroke blood test usable for diagnosis and prediction of the source or cause of the stroke, treatment, future stroke, and similar variables.
Many unforeseen findings are still likely. For example, in collaboration with Drs Wong and Gilbert at the University of Cincinnati, we have examined gene expression in the blood of 40 children with Duchenne muscular dystrophy (DMD), 8 children with spinal muscular atrophy type II, and 28 teenage controls. By comparing the genes regulated in DMD vs spinal muscular atrophy type II, we found 62 regulated genes. When these genes were used to cluster the patients with DMD and spinal muscular atrophy type II, 3 groups were found (Figure 3). Two of the 3 groups were composed of patients with DMD only, but these 2 DMD groups had different patterns of gene expression in the blood (Figure 3). Surprisingly, a third group of patients with DMD intermixed with the patients with spinal muscular atrophy type II (Figure 3). Exactly what these 3 groups mean in terms of clinical phenotype is uncertain, but they are clearly related to different immune responses in the blood of these children. The future challenge will be to determine whether this suggests different immune responses in these subgroups and whether this could be translated into different immune treatments for each of the 3 groups.
Correspondence: Frank R. Sharp, MD, MIND Institute, University of California–Davis Medical Center, 2805 50th St, Room 2416, Sacramento, CA 95817 (firstname.lastname@example.org or email@example.com).
Accepted for Publication: June 9, 2006.
Author Contributions: Study concept and design: Sharp, Xu, Lit, Walker, Apperson, Gilbert, Glauser, Wong, Hershey, D.-Z. Liu, Pinter, Zhan, X. Liu, and Ran. Drafting of the manuscript: Sharp, Xu, Lit, Walker, Apperson, Gilbert, D.-Z. Liu, Pinter, Zhan, X. Liu, and Ran. Critical revision of the manuscript for important intellectual content: Glauser, Wong, and Hershey. Administrative, technical, and material support: Sharp, Xu, Lit, Walker, Apperson, Gilbert, Glauser, Wong, Hershey, D.-Z. Liu, Pinter, Zhan, X. Liu, and Ran.
Financial Disclosure: None reported.
Funding/Support: The study was supported in large part by start-up funds from the University of Cincinnati and the University of California–Davis; by grants from the National Institutes of Health (Drs Sharp, Glauser, and Broderick); by a Bugher Award from the American Heart Association (Dr Sharp); and by grants from the Tourette's Syndrome Association (Drs Gilbert and Sharp).