The ACA haplotype is associated with 72% of the expanded repeats in Machado-Joseph disease (MJD) worldwide and has been traced to a Portuguese ancestry. It is present in only 5% of the normal chromosomes in the Portuguese population.
To trace the origin of expanded alleles of MJD in the Indian population.
We performed CAG repeat size determination and haplotype analysis for 9 families with MJD and 263 unrelated chromosomes with unexpanded CAG sequences from the Indian population.
All the expanded alleles were exclusively associated with the ACA haplotype in the Indian population. Interestingly, this haplotype was very common in normal alleles (40%) as compared with the Portuguese population (5%) and was significantly associated with large normal alleles (Pearson χ²₁ = 87.1, P < .001) in the Indian population. We also observed a rare intermediate allele of MJD with the ACA haplotype but with a CAG variant instead of CAA at the sixth position in the repeat tract.
Overrepresentation of the ACA haplotype in large normal alleles in India as compared with the Portuguese population suggests that the expansion-prone large normal alleles with the ACA haplotype may have been introduced in the Portuguese population through admixture with South Asian populations. Detailed haplotype analysis of a CAG variant within the repeat tract in an intermediate allele of MJD suggests a mechanism of gene conversion in the expansion.
Machado-Joseph disease (MJD)/spinocerebellar ataxia type 3 (SCA3) (MIM 109150) is the most common autosomal dominant cerebellar ataxia caused by CAG repeat expansion.1 Normal individuals have 12 to 40 CAG repeats, which expand to 61 to 86 repeats in affected individuals.1-3 There is a wide gap between the repeat lengths of the normal and expanded alleles. Intermediate alleles of MJD in the range of 46 to 56 CAG repeats have been reported in only 3 rare cases.4-6
Worldwide, 72% of families with MJD have an ACA haplotype that has been traced to Portuguese-Azorean ancestry.7 This haplotype, however, is not significantly associated with large normal alleles at the MJD locus in the Portuguese population (P = .30).8 The role of various genetic factors, such as repeat length and haplotype background, in predisposing the repeats toward instability as well as the mechanism involved in large jumps from normal to expanded alleles at the MJD locus is still not well established.
In the present study, we analyzed intragenic single nucleotide polymorphisms (SNPs) in the Indian population to identify an association between haplotypes and repeat length at the MJD locus. All the affected chromosomes had the ACA haplotype, which was significantly associated with large normal alleles (Pearson χ²₁ = 87.1, P < .001), suggesting that the expanded chromosomes arose from a pool of large normal alleles. We propose the involvement of gene conversion as a possible mechanism in the origin of a rare intermediate allele at the MJD locus. We also provide historical evidence for the possible introduction of this haplotype in the Portuguese population through admixture with South Asian populations.
This study was undertaken in 9 families with MJD, 83 unrelated normal individuals, and 24 normal families of Indian origin. Informed consent was obtained from all subjects before blood samples were collected. Patients were diagnosed for ataxia at the Neuroscience Centre, All India Institute of Medical Sciences, New Delhi. Genomic DNA was isolated from peripheral blood leukocytes using the salting-out procedure.9
Estimation of the CAG repeat length
Repeat sizes were estimated at the MJD/SCA3 locus by polymerase chain reaction with primers MJD52 and MJD70,1 and the products were sized using GeneScan software (Applied Biosystems, Foster City, Calif) on an ABI Prism 377 automated DNA sequencer (Applied Biosystems).
SNP genotyping and haplotype analysis
Haplotype analysis at the MJD locus was carried out using 3 closely linked intragenic SNPs, A669TG/G669TG, C987GG/G987GG, and TAA1118/TAC1118 (Figure 1).7,8 The first SNP was typed by sequencing, and the other two were typed using amplification refractory mutation system polymerase chain reaction (ARMS-PCR)8 followed by sizing the allele-specific fragments on the ABI Prism 377 automated DNA sequencer. In the case of A669TG/G669TG heterozygotes, for which phase could not be determined directly, the polymorphism was typed through pedigree studies. The CAA/CAG polymorphism at the sixth triplet of the repeat tract (Figure 1) was typed in 88 individuals by sequencing using the primers MJD52 and MJD70 on an ABI Prism 3100 automated genetic analyzer (Applied Biosystems). We used Lewontin's normalized measure of linkage disequilibrium, $D' = D/D_{\max}$, where $D = p_{AB} - p_A p_B$, $D_{\max} = \min(p_A p_b,\ p_a p_B)$, $p$ denotes frequency, and $A$, $a$ and $B$, $b$ are the two alternate alleles of each SNP.
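The normalized linkage disequilibrium measure defined above can be illustrated with a minimal Python sketch. The function name and the example frequencies are hypothetical and are not taken from the study data. The text gives Dmax only for the case D > 0; the branch for D < 0 follows the usual convention for Lewontin's measure and is an assumption here.

```python
def lewontin_d_prime(p_AB, p_A, p_B):
    """Return (D, D') for two biallelic SNPs.

    p_AB: frequency of the haplotype carrying allele A and allele B
    p_A, p_B: frequencies of allele A (first SNP) and allele B (second SNP)
    """
    p_a, p_b = 1.0 - p_A, 1.0 - p_B
    D = p_AB - p_A * p_B
    if D >= 0:
        # Form given in the text: Dmax = min(pA*pb, pa*pB)
        d_max = min(p_A * p_b, p_a * p_B)
    else:
        # Assumed standard convention for negative D (not stated in the text)
        d_max = min(p_A * p_B, p_a * p_b)
    if d_max == 0:
        # At least one SNP is monomorphic; D' is undefined, report 0.0
        return D, 0.0
    return D, D / d_max


# Hypothetical frequencies, for illustration only
D, d_prime = lewontin_d_prime(p_AB=0.40, p_A=0.40, p_B=0.45)
print(f"D = {D:.3f}, D' = {d_prime:.3f}")
```

In this hypothetical case every chromosome carrying allele A also carries allele B, so D′ is at its maximum; a value above 0.9 indicates that the two alleles are almost always found together.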
We found only 9 families with MJD in a total of 335 families with ataxia. Analysis of 263 unrelated chromosomes from the Indian population revealed 16 different alleles ranging from 14 to 37 CAG repeats; the repeat distribution was trimodal, with distinct peaks at 14, 23, and 27.10
CAG repeat and intragenic polymorphisms
All 9 affected families were exclusively associated with the ACA haplotype. Analysis of 3 intragenic SNPs (Figure 1) in the normal alleles showed a significant association (Pearson χ²₁ = 67.7, 114.4, and 73.0; P < .001) with respect to repeat length. The ACA and GGC alleles of the 3 SNPs were in linkage disequilibrium (D′ > 0.9) with respect to each other. The ACA and GGC haplotypes account for more than 80% of the haplotypes observed in the Indian population (Figure 1), with an extremely skewed distribution between the larger and smaller normal alleles. Alleles with repeat sizes smaller than 26 were significantly associated with the GGC haplotype (Pearson χ²₁ = 42.1, P < .001), whereas large normal alleles with a repeat size of 26 or more were significantly associated with the ACA haplotype (Pearson χ²₁ = 87.1, P < .001).
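As a rough illustration of the kind of test reported above, the following Python sketch computes a Pearson χ² statistic with 1 degree of freedom for a 2 × 2 table of haplotype (ACA vs non-ACA) by allele size class (26 or more repeats vs fewer than 26). The counts are hypothetical placeholders, not the study's data, and the sketch applies no continuity correction; whether the authors used one is not stated.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table
        [[a, b],
         [c, d]]
    Returns (chi2, p_value)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 df, the upper-tail probability is erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts (rows: ACA, non-ACA; columns: >=26 repeats, <26 repeats)
chi2, p = pearson_chi2_2x2(60, 45, 10, 148)
print(f"chi2 = {chi2:.1f}, P = {p:.2g}")
```

A large statistic with a small P value, as in the results above, indicates that haplotype and allele size class are not independent.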
Association of haplotypes with CAG repeat substructure
More than 94% (144/153) of the normal chromosomes had a CAA allele at the sixth position, while 6% (9/153) had a CAG allele; the CAG allele is prominently associated with 19 repeats (Figure 2), very similar to the frequency reported earlier.10,11 This CAG variant was absent in the large normal alleles and had a flanking non-ACA haplotype.
Interestingly, of the 9 affected chromosomes, only the intermediate allele (45 repeats) had the CAG variant, while the remaining 8 expanded alleles had the CAA variant.
Low prevalence of MJD in the Indian population
The prevalence of MJD among families with ataxia in India (<3%; 9 of 335 families) is much lower than that reported in other Asian populations (Korea, 29%12; Japan, 43%13) and the Portuguese population (84%).14 The frequency of large normal alleles with more than 31 CAG repeats (1%) was significantly lower in the present study than that reported in a comparable study in the Japanese population,13 which might explain the lower prevalence of MJD in India.
Origin of expanded chromosomes from large normal alleles
The occurrence of the ACA haplotype in 40% of the normal chromosomes in the present study, the highest among studied populations, indicates the antiquity of this haplotype in the Indian population. Moreover, the association of the ACA haplotype with large normal alleles suggests that the expanded alleles originated from the pool of large normal alleles.
Mechanism of origin of the intermediate allele
Although the intermediate allele had the same flanking ACA haplotype, it had a CAG variant at the sixth position that was not present in the large normal and expanded alleles. This suggests that the intermediate allele might have arisen through an interallelic gene conversion event, with an ACA allele with a CAA variant at the sixth position as the recipient and a smaller allele with a CAG variant acting as the donor (all CAG variants are present in alleles with fewer than 26 repeats) (Figure 3). Since this event would involve a smaller and a larger normal allele, an intermediate length could be the result. By an analogous mechanism, large expanded alleles could be generated by an event involving two ACA alleles, but this needs to be validated in other populations. The possibility of recombination cannot be determined, as the flanking haplotype of the donor and the recipient is the same.
Because the mechanism of expansion involves larger normal alleles, the haplotype associated with the expanded allele might depend on the frequency of the haplotypes associated with large normal alleles in a given population. This might also explain why different haplotypes are associated with expanded alleles in other populations (Table).8
Introduction of the founder haplotype for MJD into the Azores
Comparison of 16 non-Portuguese families with MJD from across the world revealed the ACA haplotype in the vast majority, suggesting that this is the ancestral haplotype associated with the affected chromosomes.7 However, the ACA and GGC haplotypes occur with almost equal frequency in expanded chromosomes in the Azores, despite the rarity of the ACA haplotype in the normal population (Table). In contrast, the ACA haplotype is very common in India, especially in large normal alleles (≥26 repeats). This indicates that large normal alleles with the ACA haplotype may have been introduced in the Azores through admixture between the Indian and Portuguese populations, which could have served as founders of the MJD mutation in the Azorean population. This is corroborated by well-documented historical evidence related to the Moorish sea trade and to maritime links between Portugal and South Asia.15 The Portuguese had extensive settlements in India, and though their population was limited, extensive interaction with the local communities was encouraged. By 1815, when the population was formally recorded, there were several families of known mixed descent in Goa and several thousand individuals.16 In addition, slaves were traded from Goa, and a significant number of sailors on Portuguese ships were from the Indian and African coasts. Indian sailors and soldiers were also used extensively in the Portuguese territories further East.15 The maritime importance of the Azores was critical, and soldiers, sailors, and slaves of all regions passed through the islands. Thus, it is reasonable to suggest that a chromosome with the ACA haplotype may have been introduced into Portugal and the Azores by admixture with the South Asian population, and that a subsequent mutation in this haplotype background could have spread worldwide through Portuguese activity.
However, this theory needs more careful validation, as maritime links between the Indian subcontinent, Southeast Asia, China, and Africa predate the Portuguese. Data on the ACA haplotype in the normal populations of these regions are not available.
Given the wide expanse of Portuguese movements, it would be worthwhile to compare haplotypes along these maritime routes rather than based on geographical proximity.17 It would also be useful to study the gene structures of these populations to understand the differences in disease and symptoms.
We were not able to determine whether a single founder MJD mutant allele rose, through drift or a population bottleneck, to such a high proportion of the affected alleles in the Portuguese population. It is also possible that there have been recurrent mutations in a predisposed background. In addition, since other haplotypes are also associated with expanded alleles in the Azorean population, these possibilities could coexist. Moreover, since there has been admixture from other populations with the Portuguese,18 this could also explain the presence of multiple founders.
In conclusion, our study suggests that the expansion-prone large normal alleles with the ACA haplotype could have been introduced into the Portuguese population through admixture with South Asian populations. Our analysis also underscores the importance of population history in defining the genomic structure of a locus to understand the molecular mechanism of repeat instability.
Correspondence: Mitali Mukerji, PhD, Institute of Genomics and Integrative Biology, Council for Scientific and Industrial Research, Mall Road, Delhi 110007, India (email@example.com).
Accepted for Publication: May 26, 2004.
Author Contributions: Study concept and design: Mittal, Srivastava, Satish Jain, and Mukerji. Acquisition of data: Mittal, Srivastava, Satish Jain, and Mukerji. Analysis and interpretation of data: Mittal, Satish Jain, Sanjeev Jain, and Mukerji. Drafting of the manuscript: Mittal, Srivastava, Sanjeev Jain, and Mukerji. Critical revision of the manuscript for important intellectual content: Mittal, Srivastava, Satish Jain, Sanjeev Jain, and Mukerji. Administrative, technical, and material support: Sanjeev Jain. Study supervision: Satish Jain, Sanjeev Jain, and Mukerji.
Funding/Support: This work was supported by grants from the Department of Biotechnology, Government of India, to the Program on Functional Genomics, Institute of Genomics and Integrative Biology, New Delhi, and from the Council for Scientific and Industrial Research.
Acknowledgments: We thank Samir K. Brahmachari, PhD, for providing intellectual support during the course of this investigation; Vani Brahmachari, PhD, for critical evaluation of the manuscript; and Sangeeta Sharma, BSc, for help with the GeneScan and sequence analysis.
1. et al. CAG expansions in a novel gene for Machado-Joseph disease at chromosome 14q32.1. Nat Genet 1994;8:221-228
2. et al. Correlation between CAG repeat length and clinical features in Machado-Joseph disease. Am J Hum Genet 1995;57:54-61
3. et al. Molecular features of the CAG repeats and clinical manifestation of Machado-Joseph disease. Hum Mol Genet 1995;4:807-812
4. M Machado-Joseph disease: cerebellar ataxia and autonomic dysfunction in a patient with the shortest known expanded allele (56 CAG repeat units) of the MJD1 gene. Neurology 1997;49:604-606
5. et al. Intermediate CAG repeat lengths (53,54) for MJD/SCA3 are associated with an abnormal phenotype. Ann Neurol 2001;49:805-807
6. SK Identification of a novel 45 repeat unstable allele associated with a disease phenotype at the MJD1/SCA3 locus. Am J Med Genet B Neuropsychiatr Genet 2005;133:124-126
7. et al. Ancestral origins of the Machado-Joseph disease mutation: a worldwide haplotype study. Am J Hum Genet 2001;68:523-528
8. et al. Study of three intragenic polymorphisms in the Machado-Joseph disease gene (MJD1) in relation to genetic instability of the (CAG)n tract. Eur J Hum Genet 1999;7:147-156
9. HF A simple salting out procedure for extracting DNA from human nucleated cells. Nucleic Acids Res
10. et al. Analysis of CAG repeat of the Machado-Joseph gene in human, chimpanzee and monkey populations: a variant nucleotide is associated with the number of CAG repeats. Hum Mol Genet 1996;5:207-213
11. et al. Intergenerational instability of the CAG repeat of the gene for Machado-Joseph disease (MJD1) is affected by the genotype of the normal chromosome: implications for the molecular mechanisms of the instability of the CAG repeat. Hum Mol Genet 1996;5:923-932
12. et al. Frequency analysis and clinical characterization of spinocerebellar ataxia types 1, 2, 3, 6, and 7 in Korean patients. Arch Neurol 2003;60:858-863
13. et al. Close associations between prevalences of dominantly inherited spinocerebellar ataxias with CAG-repeat expansions and frequencies of large normal CAG alleles in Japanese and Caucasian populations. Am J Hum Genet 1998;63:1060-1066
14. et al. Frequency of spinocerebellar ataxia type 1, dentatorubropallidoluysian atrophy, and Machado-Joseph disease mutations in a large group of spinocerebellar ataxia patients. Neurology 1996;46:214-218
15. S The Portuguese Empire in Asia, 1500-1700: A Political and Economic History. London, England: Longman Group Ltd; 1993
16. A General Statistical and Historical Report on Portuguese India, to Which Is Appended an Account of Contents Suppressed in 1835, Extracted From Official Documents in 1850 by Capt. Joaquim Jose Cicilia K.C; Chief Secretary, Portuguese India. Microfilm I 1253/Zug 1984, Index 590. Located at: British Library, London, England
17. et al. Genetic structure and origin of peopling in the Azores islands (Portugal): the view from mtDNA. Ann Hum Genet 2003;67:433-456
18. et al. Portuguese families with dentatorubropallidoluysian atrophy (DRPLA) share a common haplotype of Asian origin. Eur J Hum Genet 2003;11:808-811