Background: Comprehensive resequencing of the causative and disease-related genes of neurodegenerative diseases is expected to enable (1) comprehensive mutational analysis of familial cases, (2) identification of sporadic cases with de novo or low-penetrant mutations, (3) identification of rare variants conferring disease susceptibility, and ultimately (4) better understanding of the molecular basis of these diseases.
Objective: To develop a microarray-based high-throughput resequencing system for the causative and disease-related genes of amyotrophic lateral sclerosis (ALS) and other neurodegenerative diseases.
Design: Validation of the system was conducted in terms of the signal-to-noise ratio, accuracy, and throughput. Comprehensive gene analysis was then applied to patients with ALS.
Patients: Ten patients with familial ALS, 35 patients with sporadic ALS, and 238 controls.
Results: The system detected point mutations with 100% accuracy and completed the resequencing of 270 kilobase pairs in 3 working days with greater than 99.9% accuracy of base calls (the determination of the base or bases at each position). Analysis of patients with familial ALS revealed 2 SOD1 mutations. Analysis of the 35 patients with sporadic ALS revealed a previously known SOD1 mutation, S134N; a novel putative pathogenic DCTN1 mutation, R997W; and 9 novel variants, including 4 nonsynonymous heterozygous variants (2 in ALS2, 1 in ANG, and 1 in VEGF) that were not found in the controls.
Conclusions: The DNA microarray–based resequencing system is a powerful tool for high-throughput comprehensive analysis of causative and disease-related genes. It can be used to detect mutations in familial and sporadic cases and to identify numerous novel variants potentially associated with genetic risks.
With recent progress in human molecular genetics, many causative genes of inherited neurological diseases have been identified. In 2007, 667 neurological diseases were registered in the Online Mendelian Inheritance in Man database (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM) as diseases with identified causative genes. It should be noted that there are substantial nonallelic genetic heterogeneities in hereditary neurodegenerative diseases, including amyotrophic lateral sclerosis (ALS), Parkinson disease, Alzheimer disease, and hereditary spastic paraplegia. Thus, there is a strong demand for comprehensive mutational analysis of multiple causative genes in daily clinical practice.
Most neurodegenerative diseases are sporadic and their molecular etiologies remain unknown. Although genome-wide association studies (GWAS) using common variants of single nucleotide polymorphisms have been undertaken to identify the loci of disease-susceptibility genes, genetic risks associated with rare variants may not be captured by GWAS.1 Identification of multiple rare variants, however, would need comprehensive resequencing of candidate genes. Furthermore, sporadic diseases may be caused by de novo mutations or low-penetrant mutations in the causative genes. Taken together, development of a comprehensive resequencing system of causative genes will be indispensable, not only to provide mutational analyses of multiple causative genes for familial diseases, but also to explore the molecular basis of sporadic diseases.
A DNA microarray–based resequencing method has been invented to enable rapid and accurate nucleotide sequence analysis of multiple genes spanning 30 to 300 kilobase pairs.2,3 We used this method to develop a comprehensive high-throughput resequencing system focusing on ALS as well as other neurodegenerative diseases. We herein describe the development of the microarray-based comprehensive resequencing system and its application to ALS genetics to validate the above-described concepts. We also discuss the implications of comprehensive resequencing for the molecular dissection of neurological diseases.
We have designed a microarray, TKYALS01, that primarily focuses on the causative genes of and genes related to ALS (Table 1). The sequences tiled on the microarray included the sequences of all of the exons and 12 flanking base pairs (bp) of the splice junctions. Promoter sequences were also included in the tiled sequences for genes whose expression levels were presumed to modify the disease processes.4,5 In addition, another microarray, TKYPD01, was designed to focus on genes relevant to Parkinson disease, autosomal-dominant hereditary spastic paraplegias, and adrenoleukodystrophy (data not shown).
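The tiling rule described above (every exon plus 12 flanking bp at the splice junctions) can be illustrated with a minimal sketch. The function name `tiled_intervals`, the 1-based inclusive coordinate convention, and the merging of overlapping or adjacent intervals are assumptions made for illustration; they are not taken from the actual design of TKYALS01.

```python
def tiled_intervals(exons, flank=12):
    """Extend each exon by `flank` bp on both sides and merge overlaps.

    `exons` is a list of (start, end) pairs, 1-based and inclusive
    (an assumed convention). Returns the merged intervals to be tiled.
    """
    # Extend every exon, clamping the start at position 1, then sort.
    extended = sorted((max(1, s - flank), e + flank) for s, e in exons)
    merged = []
    for s, e in extended:
        # Merge when the interval overlaps or directly abuts the previous one.
        if merged and s <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged


def tiled_length(intervals):
    """Total number of bases covered by inclusive intervals."""
    return sum(e - s + 1 for s, e in intervals)
```

Summing `tiled_length` over all genes gives the total number of bases that a design of this kind would place on the array.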
Because the principle of the resequencing microarray is based on sequencing by hybridization (SBH), it is crucially important to avoid cross-hybridization to increase the accuracy of resequencing. For this purpose, we conducted an in silico screening in which the tiled sequences were compared with one another using a sliding 25-nucleotide window to detect segments sharing an identity exceeding 22 bases. The design of the microarrays and of the polymerase chain reaction (PCR) primers was then optimized accordingly.
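A screening of this kind can be sketched as follows. This is an illustrative brute-force version, not the procedure actually used: it compares every pair of 25-nucleotide windows position by position, reads "identity exceeding 22 bases" as at least 23 matching positions, and ignores reverse-complement matches. The function name and the input format (a dictionary of sequence name to sequence) are assumptions.

```python
from itertools import combinations


def near_identical_windows(seqs, window=25, min_identity=23):
    """Report window pairs that share at least `min_identity` bases.

    Returns tuples (name_a, pos_a, name_b, pos_b, matches) with
    0-based window start positions.
    """
    # Collect every window from every tiled sequence.
    windows = []
    for name, seq in seqs.items():
        seq = seq.upper()
        for i in range(len(seq) - window + 1):
            windows.append((name, i, seq[i:i + window]))

    hits = []
    for (na, ia, wa), (nb, ib, wb) in combinations(windows, 2):
        # Overlapping windows of one sequence are trivially similar; skip them.
        if na == nb and abs(ia - ib) < window:
            continue
        matches = sum(x == y for x, y in zip(wa, wb))
        if matches >= min_identity:
            hits.append((na, ia, nb, ib, matches))
    return hits
```

The pairwise comparison grows quadratically with the number of windows, so for hundreds of kilobases an indexed or alignment-based search would be needed in practice. The sketch is meant only to make the identity criterion concrete.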
Thirty-five patients with sporadic ALS and 10 patients with familial ALS, 7 with autosomal dominant mode of inheritance and 3 with affected siblings, were enrolled in this study. The diagnosis of ALS was based on El Escorial and the revised Airlie House diagnostic criteria. A total of 238 control genomic DNA samples were also used.
Thirty-six genomic DNA samples with previously determined mutations of SOD1 (OMIM 147450), the causative gene of familial ALS,6-9 or those of ABCD1 (OMIM 300371), the causative gene of adrenoleukodystrophy,10-13 were anonymized and subjected to analysis without prior information on the mutations.
All of the genomic DNA samples were obtained with written informed consent, and this research was approved by the institutional review board of the University of Tokyo.
Specific PCR primers were designed using the Primer3 Web site (http://frodo.wi.mit.edu/cgi-bin/primer3/primer3_www.cgi) (eTable). Touch-down PCR protocols were used to enhance the specificity of PCR amplification (eTable). Each PCR product was quantified using PicoGreen (Molecular Probes, Eugene, Oregon), pooled equimolarly into 1 tube using a robotic system, BioMek FX (Beckman Coulter, Fullerton, California), and subjected to SBH according to the manufacturer's instructions (Affymetrix, Santa Clara, California) (Figure 1). The undetermined base calls were further analyzed by manual inspection of the signals. The resequencing of ANG (OMIM 105850) and the confirmation of all of the sequence variants determined by SBH were conducted by direct nucleotide sequence analysis using an automated DNA sequencer and BigDye Terminator version 3.1 (Applied Biosystems, Foster City, California). Analyses of the frequency of variants in the controls were conducted by denaturing high-performance liquid chromatography (Transgenomics, Omaha, Nebraska).
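Equimolar pooling requires converting each amplicon's mass concentration into a molar concentration, because longer amplicons contain fewer molecules per nanogram. The sketch below shows one way the arithmetic can be done. The average mass of 660 g/mol per base pair is a standard approximation for double-stranded DNA, whereas the function names, the input format, and the target amount per amplicon are hypothetical and do not reflect the actual robotic protocol.

```python
AVG_BP_MASS = 660.0  # approximate g/mol per base pair of double-stranded DNA


def nanomolar(conc_ng_per_ul, length_bp):
    """Convert a dsDNA mass concentration (ng/uL) to a molar one (nM)."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * length_bp)


def pooling_volumes(amplicons, fmol_each):
    """Volume (uL) of each amplicon that contributes `fmol_each` femtomoles.

    `amplicons` maps a name to (concentration in ng/uL, length in bp).
    """
    volumes = {}
    for name, (conc, length) in amplicons.items():
        # 1 nM equals 1 fmol/uL, so volume = target amount / molar concentration.
        volumes[name] = fmol_each / nanomolar(conc, length)
    return volumes
```

For example, under these assumptions a 1000-bp amplicon at 66 ng/uL is at 100 nM, so 2 uL of it would contribute 200 fmol to the pool.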
Establishment of high-throughput comprehensive resequencing system
To evaluate the signal-to-noise ratio, all of the PCR amplicons for TKYALS01 except those for SNCG were subjected to hybridization to TKYALS01 and scanning. Simultaneous hybridization of the mixed PCR amplicons did not interfere with the signals and SOD1 mutations were unambiguously identified (Figure 2). Furthermore, the areas where the probes for SNCG were tiled did not show any detectable signals, indicating that cross-hybridization was negligible.
As shown in Figure 3A, all of the point mutations were correctly identified, confirming the accuracy of SBH for detection of point mutations. The locations of the hemizygous ABCD1 insertion/deletion mutations were also easily identified because the signals of the insertion/deletion sites and surrounding probes were undetectable (Figure 3B). Determination of the exact base changes required direct nucleotide sequence analysis. In contrast, none of the 4 heterozygous insertion/deletion mutations of SOD1 were unambiguously detected without prior information on the mutations. Only the SOD1 heterozygous deletion mutation del429TT was detectable by carefully evaluating the signal intensities (Figure 3C) because the signal intensities were moderately decreased at the deletion sites and the 12 flanking bases (Figure 3D).
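The observation that a deletion lowers the signal across a stretch of neighboring probes suggests a simple way to flag candidate sites, sketched below. It assumes a hypothetical input: one signal ratio per tiled position, computed as sample intensity divided by the intensity of a reference sample. The threshold of 0.7 and the minimum run length of 20 positions are arbitrary illustrative values, not parameters used or validated in this study.

```python
def low_signal_runs(ratios, threshold=0.7, min_run=20):
    """Find runs of consecutive positions whose signal ratio is reduced.

    Returns (start, end) index pairs, with `end` exclusive, for every run
    of at least `min_run` positions where the ratio is below `threshold`.
    """
    runs = []
    start = None
    for i, r in enumerate(ratios):
        if r < threshold:
            # Open a new run if none is in progress.
            if start is None:
                start = i
        else:
            # Close the current run and keep it only if it is long enough.
            if start is not None and i - start >= min_run:
                runs.append((start, i))
            start = None
    # Handle a run that extends to the end of the tiled sequence.
    if start is not None and len(ratios) - start >= min_run:
        runs.append((start, len(ratios)))
    return runs
```

Requiring a long run rather than a single low position reflects the pattern described above, in which the deletion site and its flanking bases are affected together. Isolated dips are more likely to be noise or point variants. Any site flagged in this way would still need confirmation by direct nucleotide sequence analysis.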
By employing robotics to manipulate numerous PCR reactions, the resequencing of as many as 271 625 bp was easily accomplished in 3 working days with a total of 271 445 bp (99.93%) correctly called, confirming the high throughput of this system.
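The reported accuracy follows directly from the two base counts, as the short calculation below shows. The function name is illustrative, and the remainder is described only as bases "not correctly called" because the text does not separate miscalls from undetermined calls.

```python
def call_accuracy(correct_bp, total_bp):
    """Percentage of tiled bases that were correctly called."""
    return 100.0 * correct_bp / total_bp


total_bp = 271_625    # bases resequenced
correct_bp = 271_445  # bases correctly called
accuracy = call_accuracy(correct_bp, total_bp)
not_correct = total_bp - correct_bp

# Prints: 99.93% correct, 180 bp not correctly called
print(f"{accuracy:.2f}% correct, {not_correct} bp not correctly called")
```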
Comprehensive resequencing of genes relevant to ALS
The molecular diagnosis of 10 patients with familial ALS using this system revealed 2 SOD1 mutations, including 1 novel mutation, K3E (Figure 4A), and 1 previously identified mutation, I106V. The novel mutation was not identified in the 238 controls (476 chromosomes). The novel SOD1 mutation was found in a 70-year-old man presenting with progressive distal-dominant muscle atrophy, weakness in all extremities, and positive Babinski signs.
In the 35 patients with sporadic ALS, we identified a previously known SOD1 mutation, S134N, and a novel putative pathogenic DCTN1 (OMIM 601143) mutation, R997W (Figure 4B). These mutations were not present in the 238 controls (476 chromosomes). The amino acid residue R997 of DCTN1 is located in a region conserved among different animal species (Figure 4C). The patient with the DCTN1 mutation was a 68-year-old man presenting with progressive muscle atrophy, weakness in all of his extremities, and postural tremor in the upper extremities, with onset at the age of 67 years. Neurological examination on admission at 68 years of age revealed diffuse muscle atrophy, weakness, fasciculation, and hyporeflexia in all extremities. Weakness of neck flexion was also noted. His intelligence was normal. Neither bulbar nor pyramidal signs were observed. Electromyography showed diffuse active neurogenic changes compatible with progressive lower motor neuron degeneration. His parents remained healthy beyond 80 years of age.
The comprehensive analysis of the 35 patients with sporadic ALS also revealed 31 sequence alterations in addition to the 2 mutations described above (Table 2 and Table 3). Nine of the 31 variants (29%) were novel (Figure 5A), including 4 nonsynonymous heterozygous variants consisting of 2 in ALS2 (OMIM 606352), 1 in ANG, and 1 in VEGF (OMIM 192240) (Figure 5B) that were present in the ALS patients but not in 238 controls (476 chromosomes).
The effect of the microarray-based high-throughput resequencing system is 3-fold. First, it enables comprehensive mutational analyses of multiple causative genes for the diagnosis of familial cases. Because of nonallelic genetic heterogeneities and clinical variability, it is often difficult to focus on particular genes depending solely on the phenotypes. In this situation, the comprehensive analysis of causative genes is often superior to categorical approaches based on clinical information. The second effect is the identification of mutations in causative genes in sporadic cases (Figure 4). Thus, comprehensive resequencing of the causative genes may reveal mutations with reduced penetrance or de novo mutations in a portion of patients with sporadic ALS. The system has a great advantage in screening numerous genes in many patients with sporadic ALS.
The third effect is the discovery of rare variants potentially involved in disease susceptibility. The current approaches for identifying genetic risks of ALS are mainly based on GWAS employing common single-nucleotide polymorphisms, which generally provide relatively low odds ratios.14,15 The extensive resequencing of relevant genes is expected to complement GWAS by identifying rare variants that contribute to the development of diseases with substantially high odds ratios.16-19 Large-scale resequencing projects to uncover functional and regulatory variants are currently in progress, identifying numerous novel variants.20 Indeed, nonsynonymous heterozygous variants in ALS2, ANG, and VEGF are overrepresented in patients with ALS (Figure 5B). To confirm the significance of these rare variants in disease pathogenesis, large-scale case-control studies and functional analyses of individual mutant proteins will be required.
The advantage of SBH lies in resequencing particular sets of genes. Once the microarrays are designed, the sequencing is inexpensive and the system can be efficiently used for the repetitive interrogation of the same genome region. To further enhance the throughput of the resequencing system based on SBH, improvement in the detection capability for heterozygous insertion/deletion mutations is required. It seems theoretically possible to overcome this issue by optimizing hybridization conditions and detecting changes in the signal intensity patterns.21
The DNA microarray–based high-throughput resequencing system for comprehensive analysis of causative and disease-related genes contributes to the identification of causative mutations not only in familial ALS cases but also in some sporadic cases with low-penetrant mutations or de novo mutations, and to the identification of numerous rare variations potentially associated with diseases. This system serves as a milestone for translating the technological innovation of high-throughput resequencing directly into clinical practice.
Correspondence: Shoji Tsuji, MD, PhD, Department of Neurology, Graduate School of Medicine, University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo 113-8655, Japan (firstname.lastname@example.org).
Accepted for Publication: March 21, 2008.
Author Contributions: Dr Tsuji had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. Study concept and design: Murayama, Itoyama, Goto, and Tsuji. Acquisition of data: Takahashi, Seki, Matsukawa, Kishino, Aoki, Shimozawa, Murayama, Suzuki, Sobue, Nishizawa, and Goto. Analysis and interpretation of data: Takahashi, Ishiura, Mitsui, Goto, and Tsuji. Drafting of the manuscript: Takahashi, Seki, Matsukawa, Kishino, Aoki, Itoyama, Suzuki, Sobue, Nishizawa, Goto, and Tsuji. Critical revision of the manuscript for important intellectual content: Takahashi, Ishiura, Mitsui, Onodera, Shimozawa, Murayama, Goto, and Tsuji. Obtained funding: Tsuji. Administrative, technical, and material support: Takahashi, Seki, Ishiura, Kishino, Aoki, Shimozawa, Murayama, Suzuki, Sobue, Nishizawa, and Tsuji. Study supervision: Itoyama, Goto, and Tsuji.
Financial Disclosure: None reported.
Funding/Support: This study was supported in part by KAKENHI (Grant-in-Aid for Scientific Research) on Priority Areas; Applied Genomics, the 21st Century Center of Excellence Program, Center for Integrated Brain Medical Science, and Scientific Research (A) from the Ministry of Education, Culture, Sports, Science, and Technology of Japan; a Grant-in-Aid for the Research Committee for Ataxic Diseases of the Research on Measures for Intractable Diseases from the Ministry of Health, Labour, and Welfare, Japan; and a grant from the Takeda Foundation.
SR Most rare missense alleles are deleterious in humans: implications for complex disease and association studies. Am J Hum Genet 727-739
et al. High-throughput variation detection and genotyping using microarrays. Genome Res 1913-1925
et al. New developments in high-throughput resequencing and variation detection using high density microarrays. Hum Mutat 402-409
et al. alpha-Synuclein promoter RsaI T-to-C polymorphism and the risk of Parkinson's disease. J Neural Transm 1425-1433
et al. Multiple regions of alpha-synuclein are associated with Parkinson's disease. Ann Neurol 535-541
DR Mutations in Cu/Zn superoxide dismutase gene are associated with familial amyotrophic lateral sclerosis. Nature
et al. A two basepair deletion in the SOD1 gene causes familial amyotrophic lateral sclerosis. Hum Mol Genet 2061-2062
et al. A missense mutation in the SOD1 gene in patients with amyotrophic lateral sclerosis from the Kii Peninsula and its vicinity, Japan. Neurogenetics 113-115
et al. Marked reduction of the Cu/Zn superoxide dismutase polypeptide in a case of familial amyotrophic lateral sclerosis with the homozygous mutation. Neurosci Lett 165-168
et al. Putative X-linked adrenoleukodystrophy gene shares unexpected homology with ABC transporters. Nature 726-730
et al. Prenatal diagnosis of adrenoleukodystrophy by means of mutation analysis. Prenat Diagn 259-261
et al. Two novel missense mutations in the ATP-binding domain of the adrenoleukodystrophy gene: immunoblotting and immunocytological study of two patients. Clin Genet 322-325
S Mutational analysis and genotype-phenotype correlation of 29 unrelated Japanese patients with x-linked adrenoleukodystrophy. Arch Neurol 295-300
et al. Genome-wide genotyping in amyotrophic lateral sclerosis and neurologically normal controls: first stage analysis and public release of data. Lancet Neurol 322-328
et al. Genetic variation in DPP6 is associated with susceptibility to amyotrophic lateral sclerosis. Nat Genet 29-31
DH Technology insight: querying the genome with microarrays—progress and hope for neurological disease. Nat Clin Pract Neurol 147-158
AP Resequencing of the characterised CTGF gene to identify novel or known variants, and analysis of their association with diabetic nephropathy. J Hum Genet 383-386
et al. Resequencing the G6PT1 gene reveals a novel splicing mutation in a patient with glycogen storage disease type 1b. Clin Chim Acta 147-148
JG Comparisons of substitution, insertion and deletion probes for resequencing and mutational analysis using oligonucleotide microarrays. Nucleic Acids Res