Association of intragenic haplotypes with CAG repeat lengths at the MJD locus. Top, Schematic representation of relative locations of intragenic polymorphisms and CAG repeats in Machado-Joseph disease complementary DNA. The CAA/CAG variant at the sixth position in the repeat tract is shown. Bottom, Distribution of intragenic haplotypes according to CAG repeat number in a normal population (n = 161) and in families (n = 9) of Indian origin with Machado-Joseph disease.
Distribution of the CAA/CAG variant at the sixth position of the CAG repeat stretch with respect to intragenic haplotypes in the normal (n = 153), intermediate, and expanded Machado-Joseph disease alleles.
The proposed model for the mechanism of gene conversion involved in the origin of the intermediate allele at the MJD locus.
Mittal U, Srivastava AK, Jain S, Jain S, Mukerji M. Founder Haplotype for Machado-Joseph Disease in the Indian Population: Novel Insights From History and Polymorphism Studies. Arch Neurol. 2005;62(4):637–640. doi:10.1001/archneur.62.4.637
The ACA haplotype is associated with 72% of the expanded repeats in Machado-Joseph disease (MJD) worldwide and has been traced to a Portuguese ancestry. It is present in only 5% of the normal chromosomes in the Portuguese population.
To trace the origin of expanded alleles of MJD in the Indian population.
We performed CAG repeat size determination and haplotype analysis for 9 families with MJD and 263 unrelated chromosomes with unexpanded CAG sequences from the Indian population.
All the expanded alleles were exclusively associated with the ACA haplotype in the Indian population. Interestingly, this haplotype was very common in normal alleles (40%) as compared with the Portuguese population (5%) and was significantly associated with large normal alleles (Pearson χ²₁ = 87.1, P<.001) in the Indian population. We also observed a rare intermediate allele of MJD with the ACA haplotype but with a CAG variant instead of CAA at the sixth position in the repeat tract.
Overrepresentation of the ACA haplotype in large normal alleles in India as compared with the Portuguese population suggests that the expansion-prone large normal alleles with the ACA haplotype may have been introduced in the Portuguese population through admixture with South Asian populations. Detailed haplotype analysis of a CAG variant within the repeat tract in an intermediate allele of MJD suggests a mechanism of gene conversion in the expansion.
Machado-Joseph disease (MJD)/spinocerebellar ataxia type 3 (SCA3) (MIM 109150) is the most common autosomal dominant cerebellar ataxia caused by CAG repeat expansion.1 Normal individuals have 12 to 40 CAG repeats, which expand to 61 to 86 repeats in affected individuals.1-3 There is a wide gap between the repeat lengths of the normal and expanded alleles. Intermediate alleles of MJD in the range of 46 to 56 CAG repeats have been reported in only 3 rare cases.4-6
Worldwide, 72% of families with MJD have an ACA haplotype that has been traced to Portuguese-Azorean ancestry.7 This haplotype, however, is not significantly associated with large normal alleles at the MJD locus in the Portuguese population (P = .30).8 The role of genetic factors such as repeat length and haplotype background in predisposing the repeats toward instability, and the mechanism involved in large jumps from normal to expanded alleles at the MJD locus, are still not well established.
In the present study, we analyzed intragenic single nucleotide polymorphisms (SNPs) in the Indian population to identify an association between haplotypes and repeat length at the MJD locus. All the affected chromosomes had the ACA haplotype, which was significantly associated with large normal alleles (Pearson χ²₁ = 87.1, P<.001), suggesting that the expanded chromosomes arose from a pool of large normal alleles. We propose the involvement of gene conversion as a possible mechanism in the origin of a rare intermediate allele at the MJD locus. We also provide historical evidence for the possible introduction of this haplotype in the Portuguese population through admixture with South Asian populations.
This study was undertaken in 9 families with MJD, 83 unrelated normal individuals, and 24 normal families of Indian origin. Informed consent was obtained from all subjects before blood samples were collected. Patients were diagnosed for ataxia at the Neuroscience Centre, All India Institute of Medical Sciences, New Delhi. Genomic DNA was isolated from peripheral blood leukocytes using the salting-out procedure.9
Repeat sizes were estimated at the MJD/SCA3 locus by polymerase chain reaction with primers MJD52 and MJD70 (reference 1) using GeneScan software (Applied Biosystems, Foster City, Calif) on an ABI Prism 377 automated DNA sequencer (Applied Biosystems).
Haplotype analysis at the MJD locus was carried out using 3 closely linked intragenic SNPs, A669TG/G669TG, C987GG/G987GG, and TAA1118/TAC1118 (Figure 1).7,8 The first SNP was typed by sequencing, and the other two were typed using amplification refractory mutation system polymerase chain reaction (ARMS-PCR)8 followed by sizing of the allele-specific fragments on the ABI Prism 377 automated DNA sequencer. For A669TG/G669TG heterozygotes in which phase could not be determined directly, phase was resolved through pedigree studies. The CAA/CAG polymorphism at the sixth triplet of the repeat tract (Figure 1) was typed in 88 individuals by sequencing using the primers MJD52 and MJD70 on an ABI Prism 3100 automated genetic analyzer (Applied Biosystems). We used Lewontin's normalized measure of linkage disequilibrium, $D' = D/D_{\max}$, where $D = p_{AB} - p_A p_B$, $D_{\max} = \min(p_A p_b,\ p_a p_B)$, $p$ denotes frequency, and $A$ and $a$ and $B$ and $b$ are the two alternate alleles of each SNP.
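The normalized linkage disequilibrium measure defined above can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The function name and the example frequencies are hypothetical, and the branch for negative D uses the standard Lewontin convention, which the text does not spell out.

```python
def lewontin_d_prime(p_ab: float, p_a: float, p_b: float) -> float:
    """Return Lewontin's normalized D' for two biallelic SNPs.

    p_ab -- frequency of the haplotype carrying allele A at the first SNP
            and allele B at the second SNP
    p_a  -- frequency of allele A (the alternate allele a has 1 - p_a)
    p_b  -- frequency of allele B (the alternate allele b has 1 - p_b)
    """
    d = p_ab - p_a * p_b  # raw disequilibrium, D = pAB - pA*pB
    if d >= 0:
        # Form given in the text: Dmax = min(pA*pb, pa*pB)
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        # Assumption: standard convention for negative D (not in the text)
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        raise ValueError("D' is undefined when either SNP is monomorphic")
    return d / d_max


# Hypothetical frequencies, chosen only to illustrate the calculation.
example = lewontin_d_prime(p_ab=0.38, p_a=0.40, p_b=0.40)
print(round(example, 3))
```

With this sign convention a value near 1 means the two alleles are almost always found on the same chromosome, and a value near -1 means they are almost never found together.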
We found only 9 families with MJD in a total of 335 families with ataxia. Analysis of 263 unrelated chromosomes from the Indian population revealed 16 different alleles ranging from 14 to 37 CAG repeats; the repeat distribution was trimodal, with distinct peaks at 14, 23, and 27.10
All 9 affected families were exclusively associated with the ACA haplotype. Analysis of 3 intragenic SNPs (Figure 1) in the normal alleles showed a significant association (Pearson χ²₁ = 67.7, 114.4, and 73.0; P<.001) with respect to repeat length. The ACA and GGC alleles of the 3 SNPs were in linkage disequilibrium (D′ > 0.9) with respect to each other. The ACA and GGC haplotypes account for more than 80% of the haplotypes observed in the Indian population (Figure 1), with an extremely skewed distribution between the larger and smaller normal alleles. Alleles with repeat sizes smaller than 26 were significantly associated with the GGC haplotype (Pearson χ²₁ = 42.1, P<.001), whereas large normal alleles with a repeat size of 26 or more were significantly associated with the ACA haplotype (Pearson χ²₁ = 87.1, P<.001).
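The kind of association test reported above, a Pearson χ² statistic with 1 degree of freedom, can be sketched for a 2 × 2 table of haplotype (ACA vs non-ACA) against repeat size (26 or more vs fewer than 26). This is an illustrative sketch only. The counts below are hypothetical and are not the study data, the function name is an assumption, and no continuity correction is applied because the text does not say whether one was used.

```python
import math


def pearson_chi2_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (1 df, no continuity correction) for a 2x2 table.

    Table layout:          large allele   small allele
        ACA haplotype           a              b
        other haplotype         c              d
    Returns (chi-square statistic, two-sided P value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain observations")
    # Shortcut formula, equivalent to summing (observed - expected)^2 / expected
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 df, chi-square is a squared standard normal: P = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, chosen only to show the calculation.
chi2, p_value = pearson_chi2_2x2(a=30, b=10, c=10, d=30)
print(f"chi2 = {chi2:.1f}, P = {p_value:.2g}")
```

Using only the standard library keeps the sketch self-contained. A statistics package would normally be preferred for real data, particularly when expected cell counts are small.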
More than 94% (144/153) of the normal chromosomes had a CAA allele at the sixth position, while 6% (9/153) had a CAG allele; the CAG allele is prominently associated with 19 repeats (Figure 2), very similar to the frequency reported earlier.10,11 This CAG variant was absent in the large normal alleles and had a flanking non-ACA haplotype.
Interestingly, of the 9 affected chromosomes, only the intermediate allele (45 repeats) had the CAG variant, while the remaining 8 expanded alleles had the CAA variant.
The prevalence of MJD in India (<3%) is much lower than that reported in other Asian populations (Korea, 29%12; Japan, 43%13) and the Portuguese population (84%).14 The frequency of large normal alleles with more than 31 CAG repeats (1%) was significantly lower in the present study than that reported in a comparable study in the Japanese population,13 which might explain the lower prevalence of MJD in India.
The occurrence of the ACA haplotype in 40% of the normal chromosomes in the present study, the highest among studied populations, indicates the antiquity of this haplotype in the Indian population. Moreover, the association of the ACA haplotype with large normal alleles suggests that the expanded alleles originated from the pool of large normal alleles.
Although the intermediate allele had the same flanking ACA haplotype, it had a CAG variant at the sixth position that was not present in the large normal and expanded alleles. This suggests that the intermediate allele might have arisen through an interallelic gene conversion event, with an ACA allele with a CAA variant at the sixth position as the recipient and a smaller allele with a CAG variant acting as the donor (all CAG variants are present in alleles with fewer than 26 repeats) (Figure 3). Since this event would involve a smaller and a larger normal allele, an intermediate length could be the result. By an analogous mechanism, large expanded alleles could be generated by an event involving two ACA alleles, but this needs to be validated in other populations. The possibility of recombination cannot be determined, as the flanking haplotype of the donor and the recipient is the same.
Because the mechanism of expansion involves larger normal alleles, the haplotype associated with the expanded allele might depend on the frequency of the haplotypes associated with large normal alleles in a given population. This might also explain why different haplotypes are associated with expanded alleles in other populations (Table).8
Comparison of 16 non-Portuguese families with MJD from across the world revealed the ACA haplotype in the vast majority, suggesting that this is the ancestral haplotype associated with the affected chromosomes.7 However, the ACA and GGC haplotypes occur with almost equal frequency in expanded chromosomes in the Azores, despite the rarity of the ACA haplotype in the normal population (Table). In contrast, the ACA haplotype is very common in India, especially in large normal alleles (>26 repeats). This indicates that large normal alleles with the ACA haplotype may have been introduced in the Azores through admixture between the Indian and Portuguese populations, which could have served as founders of the MJD mutation in the Azorean population. This is corroborated by well-documented historical evidence related to the Moorish sea trade and to maritime links between Portugal and South Asia.15 The Portuguese had extensive settlements in India, and though their population was limited, extensive interaction with the local communities was encouraged. By 1815, when the population was formally recorded, there were several families of known mixed descent in Goa and several thousand individuals.16 In addition, slaves were traded from Goa, and a significant number of sailors on Portuguese ships were from the Indian and African coasts. Indian sailors and soldiers were also used extensively in the Portuguese territories further East.15 The maritime importance of the Azores was critical, and soldiers, sailors, and slaves of all regions passed through the islands. Thus, it is reasonable to suggest that a chromosome with the ACA haplotype may have been introduced into Portugal and the Azores by admixture with the South Asian population, and that a subsequent mutation in this haplotype background could have spread worldwide through Portuguese activity.
However, this theory needs more careful validation, as maritime links between the Indian subcontinent, Southeast Asia, China, and Africa predate the Portuguese. Data on the ACA haplotype in the normal populations of these regions are not available.
Given the wide expanse of Portuguese movements, it would be worthwhile to compare haplotypes along these maritime routes rather than based on geographical proximity.17 It would also be useful to study the gene structures of these populations to understand the differences in disease and symptoms.
We were not able to determine whether a single founder MJD mutant allele rose, through drift or a population bottleneck, to such a high proportion of the affected alleles in the Portuguese population. It is also possible that there have been recurrent mutations in a predisposed background. In addition, since other haplotypes are also associated with expanded alleles in the Azorean population, the two possibilities could coexist. Moreover, since there has been admixture from other populations with the Portuguese,18 this could also explain the presence of multiple founders.
In conclusion, our study suggests that the expansion-prone large normal alleles with the ACA haplotype could have been introduced into the Portuguese population through admixture with South Asian populations. Our analysis also underscores the importance of population history in defining the genomic structure of a locus to understand the molecular mechanism of repeat instability.
Correspondence: Mitali Mukerji, PhD, Institute of Genomics and Integrative Biology, Council for Scientific and Industrial Research, Mall Road, Delhi 110007, India (firstname.lastname@example.org).
Accepted for Publication: May 26, 2004.
Author Contributions: Study concept and design: Mittal, Srivastava, Satish Jain, and Mukerji. Acquisition of data: Mittal, Srivastava, Satish Jain, and Mukerji. Analysis and interpretation of data: Mittal, Satish Jain, Sanjeev Jain, and Mukerji. Drafting of the manuscript: Mittal, Srivastava, Sanjeev Jain, and Mukerji. Critical revision of the manuscript for important intellectual content: Mittal, Srivastava, Satish Jain, Sanjeev Jain, and Mukerji. Administrative, technical, and material support: Sanjeev Jain. Study supervision: Satish Jain, Sanjeev Jain, and Mukerji.
Funding/Support: This work was supported by grants from the Department of Biotechnology, Government of India, to the Program on Functional Genomics, Institute of Genomics and Integrative Biology, New Delhi, and from the Council for Scientific and Industrial Research.
Acknowledgments: We thank Samir K. Brahmachari, PhD, for providing intellectual support during the course of this investigation; Vani Brahmachari, PhD, for critical evaluation of the manuscript; and Sangeeta Sharma, BSc, for help with the GeneScan and sequence analysis.