Figure 1. Results of published studies of the association between the factor V Leiden mutation and ischemic stroke. Odds ratios for the outcome compared carriers of the Gln506 allele vs wild type (Arg/Arg). CI indicates confidence interval. The size of the box is proportional to the weight of the study.
Figure 2. Results of published studies of the association between the methylenetetrahydrofolate reductase C677T polymorphism and ischemic stroke. Odds ratios for the outcome compared individuals homozygous for the T allele (T/T) with heterozygous (C/T) plus wild-type (C/C) individuals. CI indicates confidence interval. The size of the box is proportional to the weight of the study.
Figure 3. Results of published studies of the association between the prothrombin G20210A polymorphism and ischemic stroke. Odds ratios for the outcome compared carriers of the A allele with those with wild type (G/G). CI indicates confidence interval. The size of the box is proportional to the weight of the study.
Figure 4. Results of published studies of the association between the ACE I/D polymorphism and ischemic stroke. Odds ratios for the outcome compared individuals homozygous for the D allele with heterozygous (D/I) plus wild-type (I/I) individuals. CI indicates confidence interval. The size of the box is proportional to the weight of the study.
Figure 5. Results of published studies of the association between the apolipoprotein E polymorphism and ischemic stroke. Odds ratios for the outcome compared carriers of the ε4 allele with those with the ε3 and ε2 alleles. CI indicates confidence interval. The size of the box is proportional to the weight of the study.
Figure 6. Results of published studies of the association between the factor XIII polymorphism and ischemic stroke. Odds ratios for the outcome compared individuals homozygous for the Leu34 allele (Leu/Leu) with heterozygous (Val/Leu) plus wild-type (Val/Val) individuals. CI indicates confidence interval. The size of the box is proportional to the weight of the study.
Figure 7. Results of published studies of the association between the glycoprotein IIIa polymorphism and ischemic stroke. Odds ratios for the outcome compared individuals homozygous for the Pro allele with heterozygous (Leu/Pro) plus wild-type (Leu/Leu) individuals. CI indicates confidence interval. The size of the box is proportional to the weight of the study.
Casas JP, Hingorani AD, Bautista LE, Sharma P. Meta-analysis of Genetic Studies in Ischemic Stroke: Thirty-two Genes Involving Approximately 18 000 Cases and 58 000 Controls. Arch Neurol. 2004;61(11):1652–1661. doi:10.1001/archneur.61.11.1652
Ischemic stroke is thought to have a polygenic basis, but identification of stroke susceptibility genes and quantification of associated risks have been hampered by conflicting results from underpowered case-control studies. We performed a meta-analysis of all candidate gene association studies in ischemic stroke. Electronic databases were searched up until January 2003 for all case-control and nested case–control studies in English-language journals relating to the investigation of any candidate gene for ischemic stroke in humans. Cases were required to have neuroimaging evidence of the diagnosis. To maintain genetic homogeneity, only studies in white adults were included. Studies that evaluated quantitative traits or intermediate phenotypes were excluded. Data from 120 case-control studies were included. Pooled odds ratios (ORs) with 95% confidence intervals (CIs) from random- and fixed-effects models were calculated. Of 32 genes studied, 15 polymorphisms were identified for which at least 3 studies had been conducted. Statistically significant associations with ischemic stroke were identified for factor V Leiden Arg506Gln (OR, 1.33; 95% CI, 1.12-1.58), methylenetetrahydrofolate reductase C677T (OR, 1.24; 95% CI, 1.08-1.42), prothrombin G20210A (OR, 1.44; 95% CI, 1.11-1.86), and angiotensin-converting enzyme insertion/deletion (OR, 1.21; 95% CI, 1.08-1.35). These were also the most investigated candidate genes, including 4588, 3387, 3028, and 2990 cases, respectively. No statistically significant association with ischemic stroke was detected for the 3 next most investigated genes (factor XIII, apolipoprotein E, and human platelet antigen type 1). There is a genetic component to common stroke. No single gene with major effect was identified; rather, common variants in several genes, each exerting a modest effect, contribute to the risk of stroke. 
These findings have important implications for the design of future genetic studies and for predictive genetic testing for stroke and other multifactorial diseases.
According to the World Health Organization, stroke is the third most common cause of death in developed countries.1 In the United States there are more than 700 000 incident strokes annually and 4.4 million stroke survivors every year.2 The economic burden of stroke has been estimated to be $51.2 billion annually.3 Because treatments for stroke are limited, the best approach to reducing the burden of disease is primary prevention through modification of acquired risk factors (diabetes mellitus, smoking, high blood pressure, and atrial fibrillation),4 particularly in persons at elevated risk. Stroke cases cluster in families,5 and there is a nearly 5-fold difference in stroke prevalence among monozygotic vs dizygotic twins.6 Epidemiologic studies suggest a polygenic basis for stroke,7-9 and the favored model for the pathogenesis of stroke is an interaction between genetic and acquired risk factors.10
In theory, identification of stroke susceptibility genes might enhance prediction of disease risk. However, the lack of reproducibility of genetic case-control studies has led to uncertainty about the nature and number of genes contributing to stroke risk. There is concern, on one hand, that positive associations might be spurious and, on the other hand, that the negative findings from some studies might be a consequence of inadequate statistical power.
With a case-control design, sample sizes of thousands are required to have adequate power to detect genes of small to moderate effect whose allele frequencies range from 5% to 10%. Few individual studies conducted to date have been of this size. By using all available published data to increase statistical power, meta-analysis might allow plausible candidate genes to be excluded, causative genes to be identified with reliability, and genetic risks to be quantified with more precision. Therefore, we undertook a comprehensive meta-analysis of all genetic case-control studies in ischemic stroke to date.
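The sample-size claim above can be checked with a rough calculation. The following is a minimal sketch, not the authors' method: it uses the standard normal-approximation formula for comparing two proportions, assumes one control per case, a two-sided significance level of .05, and 80% power, and treats the quoted 5% to 10% frequency as the prevalence of the risk genotype among controls. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def exposure_in_cases(p_controls: float, odds_ratio: float) -> float:
    """Exposure prevalence among cases implied by the control prevalence and the OR."""
    return odds_ratio * p_controls / (1 + p_controls * (odds_ratio - 1))


def cases_needed(p_controls: float, odds_ratio: float,
                 alpha: float = 0.05, power: float = 0.80) -> int:
    """Cases required (assuming an equal number of controls) to detect the OR."""
    p_cases = exposure_in_cases(p_controls, odds_ratio)
    p_bar = (p_controls + p_cases) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    pooled_term = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    separate_term = z_beta * math.sqrt(
        p_controls * (1 - p_controls) + p_cases * (1 - p_cases)
    )
    return math.ceil((pooled_term + separate_term) ** 2 / (p_cases - p_controls) ** 2)


if __name__ == "__main__":
    # Illustrative grid only; the prevalences and ORs are assumed values.
    for prevalence in (0.05, 0.10):
        for odds_ratio in (1.2, 1.3, 1.5):
            n = cases_needed(prevalence, odds_ratio)
            print(f"prevalence={prevalence:.2f} OR={odds_ratio}: {n} cases and {n} controls")
```

Under these assumptions, an OR of 1.3 at 5% prevalence calls for several thousand cases, and the requirement falls as either the prevalence or the OR rises.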
Electronic databases (MEDLINE, EMBASE, and BIDS [Bath Information and Data Services]) were searched up until January 2003 for all case-control studies evaluating any candidate gene and stroke in humans. Letters and abstracts were included in the meta-analysis. The Medical Subject Headings terms and text words used for the search were cerebrovascular disease, stroke, brain infarction, and cerebral ischemia in combination with genetic, polymorphism(s), mutation, genotype, or genes. The search results were limited to studies in humans. All languages were searched initially, but only English-language articles were selected. The references of all computer-identified publications were searched for any additional studies, and the MEDLINE "related articles" option was used for all the relevant articles. In addition, a search to identify previous genetic meta-analyses in stroke was performed.
Studies were selected if neuroimaging (magnetic resonance imaging or computed tomography) had been used to confirm the diagnosis of ischemic stroke, and, to maintain homogeneity of genetic background, only studies of white patients were included. Studies were excluded if (1) the patients were children (aged <18 years), (2) quantitative traits or intermediate phenotypes were being investigated, or (3) genotype frequency was not reported. For duplicate publications, the smaller data set was discarded.
The primary search generated 155 potentially relevant articles, of which 120 met the inclusion criteria. Data for analysis were extracted independently and entered into separate databases by 2 of us (J.P.C. and P.S.). The results were compared, and disagreements were resolved by consensus.
Data were analyzed using software for preparing and maintaining Cochrane reviews (Review Manager, version 4.1; Cochrane Collaboration, Syracuse, NY) and statistical analysis software (Stata 8.0; Stata Corp, College Station, Tex). For each genetic marker (polymorphism) for which data were available for at least 3 studies, a meta-analysis was carried out. For each gene variant, a pooled odds ratio (OR) was calculated using fixed- and random-effects models, along with the 95% confidence interval (CI) to measure the strength of the genetic association. Fixed-effects summary ORs were calculated using the Mantel-Haenszel method,11,12 and the DerSimonian and Laird method was used to calculate random-effects summary ORs.13
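The two pooling methods named above can be illustrated with a short sketch. This is a textbook version written for illustration, not a reproduction of the Review Manager or Stata routines the authors used: it gives the Mantel-Haenszel point estimate only (no CI), applies no continuity correction for zero cells, and needs at least 2 studies for the random-effects step. The 2×2 tables in the example are invented.

```python
import math

# Each study is a 2x2 table:
# (exposed cases, unexposed cases, exposed controls, unexposed controls)
Study = tuple[int, int, int, int]


def mantel_haenszel_or(studies: list[Study]) -> float:
    """Fixed-effects summary OR (Mantel-Haenszel point estimate)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in studies)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in studies)
    return numerator / denominator


def dersimonian_laird_or(studies: list[Study]) -> tuple[float, float, float, float]:
    """Random-effects summary OR; returns (OR, lower 95% limit, upper 95% limit, tau^2)."""
    log_ors = [math.log(a * d / (b * c)) for a, b, c, d in studies]
    variances = [1 / a + 1 / b + 1 / c + 1 / d for a, b, c, d in studies]
    weights = [1 / v for v in variances]

    # Inverse-variance fixed-effects estimate and Cochran Q statistic.
    fixed = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_ors))

    # Method-of-moments estimate of the between-study variance, floored at 0.
    scale = sum(weights) - sum(w * w for w in weights) / sum(weights)
    tau2 = max(0.0, (q - (len(studies) - 1)) / scale)

    # Re-weight each study by its within-study plus between-study variance.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, log_ors)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return (math.exp(pooled), math.exp(pooled - 1.96 * se),
            math.exp(pooled + 1.96 * se), tau2)


if __name__ == "__main__":
    # Invented tables for illustration; these are not data from the included studies.
    example = [(12, 188, 20, 480), (8, 142, 15, 385), (30, 470, 25, 575)]
    print("Mantel-Haenszel OR:", round(mantel_haenszel_or(example), 2))
    print("DerSimonian-Laird (OR, lower, upper, tau^2):", dersimonian_laird_or(example))
```

When the estimated between-study variance is 0, the random-effects weights reduce to the inverse-variance fixed-effects weights, so the two models differ mainly when studies disagree.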
Tests for heterogeneity were performed for each meta-analysis (with significance set at P < .05).14 For assessment of publication bias, we used the funnel plot and the Egger regression asymmetry test.15 In addition, the effect of individual studies on the summary OR was evaluated by reestimating and plotting the summary OR in the absence of each study.
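The three diagnostics described above can be sketched as follows. This is a simplified illustration rather than the authors' implementation: it computes the Cochran Q statistic (to be compared against a χ2 distribution with k − 1 degrees of freedom), the intercept of the Egger regression of the standardized log OR on precision, and leave-one-out Mantel-Haenszel summary ORs. P values are omitted because they need distribution functions outside the Python standard library, and all function names are hypothetical.

```python
import math

# Each study is a 2x2 table:
# (exposed cases, unexposed cases, exposed controls, unexposed controls)
Study = tuple[int, int, int, int]


def log_or_and_se(study: Study) -> tuple[float, float]:
    """Log odds ratio of one study and its standard error."""
    a, b, c, d = study
    return math.log(a * d / (b * c)), math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)


def cochran_q(studies: list[Study]) -> tuple[float, int]:
    """Return (Q, degrees of freedom) using inverse-variance weights."""
    effects = [log_or_and_se(s) for s in studies]
    weights = [1 / se ** 2 for _, se in effects]
    pooled = sum(w * y for w, (y, _) in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, effects))
    return q, len(studies) - 1


def egger_intercept(studies: list[Study]) -> float:
    """Least-squares intercept of (log OR / SE) regressed on (1 / SE)."""
    effects = [log_or_and_se(s) for s in studies]
    xs = [1 / se for _, se in effects]
    ys = [y / se for y, se in effects]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x


def mantel_haenszel_or(studies: list[Study]) -> float:
    """Fixed-effects summary OR (Mantel-Haenszel point estimate)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in studies)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in studies)
    return numerator / denominator


def leave_one_out(studies: list[Study]) -> list[float]:
    """Summary OR re-estimated with each study omitted in turn."""
    return [mantel_haenszel_or(studies[:i] + studies[i + 1:])
            for i in range(len(studies))]


if __name__ == "__main__":
    # Invented tables for illustration; these are not data from the included studies.
    example = [(12, 188, 20, 480), (8, 142, 15, 385), (30, 470, 25, 575), (30, 70, 10, 90)]
    print("Q, df:", cochran_q(example))
    print("Egger intercept:", egger_intercept(example))
    print("Leave-one-out ORs:", [round(x, 2) for x in leave_one_out(example)])
```

An Egger intercept near 0 corresponds to a symmetrical funnel plot, and a leave-one-out OR that shifts markedly when one study is dropped flags that study as influential.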
The proportion of stroke cases in the population that could be attributed to a particular genetic variant (population-attributable risk [PAR]) was estimated as follows:
PAR = 100 × [Prevalence × (OR − 1)] / [Prevalence × (OR − 1) + 1].
For this calculation, we used the fixed-effects model, and we estimated the prevalence of exposure as the genotype frequency among control subjects.
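As a worked illustration of the PAR calculation, the sketch below reads the formula as 100 × P(OR − 1) divided by [P(OR − 1) + 1], where P is the genotype frequency among controls. The function name is hypothetical, and the prevalence in the example is an assumed value, because the genotype frequencies from the Table are not reproduced in this text.

```python
def population_attributable_risk(prevalence: float, odds_ratio: float) -> float:
    """PAR as a percentage, given exposure prevalence (0 to 1) and an odds ratio."""
    excess = prevalence * (odds_ratio - 1)
    return 100 * excess / (excess + 1)


if __name__ == "__main__":
    # Assumed prevalence of 5% combined with the factor V Leiden summary OR of 1.33.
    print(f"{population_attributable_risk(0.05, 1.33):.2f}%")
```

The PAR is 0 when the OR is 1 and increases with both the prevalence of the genotype and the size of the OR, which is why a common variant with a modest OR can account for more cases than a rare variant with a larger one.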
One hundred twenty candidate gene case-control studies in which the presence or absence of stroke was analyzed in a dichotomous manner were identified. In total, 51 polymorphisms in 32 genes were identified. Of these, data were available from at least 3 studies for 15 polymorphisms in 13 genes. For another 6 polymorphisms, 2 studies per genetic marker were identified, and, in the case of 30 polymorphisms, only 1 study per genetic marker was identified that met the selection criteria. From the 15 polymorphisms analyzed in detail (representing 18 123 cases and 57 579 controls), the mean number of studies per candidate gene was 9 (95% CI, 4.3-12.8). Eight (53%) of the 15 meta-analyses had more than 1000 cases, and 7 (47%) had at least 1 study with a total sample size greater than 1000 (Table).
The Table shows the genotypic ORs for the 15 polymorphisms evaluated. Of those candidate genes with statistically significant associations, the summary ORs varied from 1.21 (95% CI, 1.08-1.35) for angiotensin-converting enzyme (ACE) insertion/deletion (I/D) polymorphism to 1.88 (95% CI, 1.28-2.76) for the polymorphism of the glycoprotein Ib-α (GPIBA) Kozak sequence (Table).
The factor V Leiden mutation has been by far the most investigated, with 26 studies16-41 that included 4588 cases and 13 798 controls. Carriers of the factor V Gln506 allele were 1.33 times more likely to develop stroke (95% CI, 1.12-1.58; P = .001) (Table and Figure 1). However, significant interstudy OR heterogeneity was observed (χ2 = 39.78; P for heterogeneity [PHet] = .03). A sensitivity analysis revealed that the study by Margaglione et al30 was mainly responsible for the heterogeneity observed. After excluding this study from the analysis, the heterogeneity was no longer significant (χ2 = 18.87; PHet = .76), but the OR was attenuated and of marginal significance (OR, 1.18; 95% CI, 0.98-1.42; P = .08). Nevertheless, a random-effects model that takes into account the intrastudy and interstudy variability resulted in a similar overall estimate (OR, 1.31; 95% CI, 1.02-1.68; P = .03), although the 95% CIs are wide, leading to some uncertainty about the size of the effect. The distribution of the OR in relation to its standard deviation in the funnel plot was symmetrical, and the Egger test result was not significant (P = .89), suggesting a low probability of publication bias.
A total of 22 studies20,28-30,34,38,40,42-56 (3387 cases and 4597 controls) were identified that evaluated the polymorphism in the gene encoding methylenetetrahydrofolate reductase where cytosine is replaced by thymidine at base position 677 of the gene (MTHFR C677T). A summary OR, under the fixed-effects model, of 1.24 (95% CI, 1.08-1.42; P = .002) was observed for individuals homozygous for the T allele compared with C allele carriers (C/T plus C/C) (Figure 2). The funnel plot distribution was symmetrical and the Egger test was not significant (P = .08), indicating a low probability of publication bias. No significant interstudy heterogeneity was observed (χ2 = 25.64; PHet = .22), and no individual study had an undue effect on the summary OR.
The prothrombin G20210A mutation was evaluated in 19 studies,17,20,22,23,26,28-30,32,34,38,40,57-63 with a total of 3028 cases and 7131 controls. The summary OR under a fixed-effects model showed that carriers of the mutation were 1.44 times more likely to develop stroke (95% CI, 1.11-1.86; P = .006) (Figure 3). No significant interstudy heterogeneity was observed (χ2 = 10.59; PHet = .91). The distribution of the ORs from individual studies in relation to their respective standard deviations (funnel plot) was symmetrical, and the Egger test result suggested a low probability of publication bias (P = .13). Again, no individual study had an undue effect on the summary OR.
The ACE I/D polymorphism was evaluated in 11 studies38,64-73 (2990 cases and 11 305 controls), and a summary OR of 1.21 (95% CI, 1.08-1.35; P<.001), under a fixed-effects model, was observed for individuals homozygous for the D allele compared with heterozygous (D/I) and homozygous (I/I) individuals combined (Figure 4). The funnel plot showed a symmetrical distribution of the OR in relation to its standard deviation, and the Egger test result did not suggest the presence of publication bias (P = .22). No significant interstudy heterogeneity was observed (χ2 = 9.71; PHet = .47), and as for MTHFR C677T and prothrombin G20210A, no individual study had an undue effect on the summary OR.
The PARs for the 4 positive and most investigated candidates—ACE I/D, MTHFR C677T, factor V Leiden, and prothrombin G20210A—following the models of inheritance shown in the Table were 4.54%, 3.31%, 2.16%, and 1.30%, respectively.
Other genetic markers associated with an increase in the risk of stroke but for which the data set was much smaller were glycoprotein Ib-α Thr→Met or human platelet antigen (HPA) type 2 (HPA2) (4 studies88,91,102,104 with 564 cases; OR, 1.55; 95% CI, 1.14-2.11; P = .006), and plasminogen activator inhibitor 1 (PAI1) promoter 4G/5G I/D (4 studies74,99-101 with 842 cases; OR, 1.47; 95% CI, 1.13-1.92; P = .004), with no evidence for heterogeneity in either meta-analysis. Meta-analysis of studies of GPIBA Kozak sequence was positive (3 studies102,107,108 with 350 cases; OR, 1.88; 95% CI, 1.28-2.76; P = .001) (Table), but studies were highly heterogeneous.
Of the remaining 8 polymorphisms studied, no significant associations were observed for 3 genes with large data sets: apolipoprotein E ε4, ε3, ε2 (10 studies50,56,80-87 and 1805 cases; OR, 0.96; 95% CI, 0.84-1.11; P = .60) (Figure 5), factor XIII Val→Leu (6 studies74-79 with 2166 cases; OR, 0.97; 95% CI, 0.75-1.25; P = .80) (Figure 6), and glycoprotein IIIa Leu33Pro or HPA1 (9 studies23,88-95 with 1467 cases; OR, 1.11; 95% CI, 0.95-1.28; P = .20) (Figure 7) polymorphisms (Table). Five of the remaining negative meta-analyses each had a small sample size (endothelial nitric oxide synthase [eNOS] Glu298Asp, 1086 cases; GPIBA variable number tandem repeat [VNTR], 816 cases; glycoprotein IIb Ile→Ser, 770 cases; factor VII A1/A2, 545 cases; and lipoprotein lipase [LPL] Asn291Ser, 452 cases). Overall, these 8 negative meta-analyses included fewer cases than the 7 meta-analyses in which significant associations were detected (mean number of cases: 1138 [95% CI, 621-1655] vs 2250 [95% CI, 723-3776]; P value for difference = .046).
In this comprehensive meta-analysis, 7 (47%) of the 15 candidate polymorphisms analyzed significantly increased the risk of stroke among individuals of European ancestry. In 4 of these meta-analyses (ACE I/D, factor V Leiden, MTHFR C677T, and prothrombin G20210A), the mean number of cases included per gene was more than 3000, allowing more precise estimates to be made of the effect of these genes than from any single study. However, the individual risk provided by any one of these candidate genes was moderate (OR, 1.21-1.44). This is in agreement with previous studies112-114 in other complex diseases, such as ischemic heart disease.
Most candidate genes assessed in stroke thus far have been evaluated initially for their potential role in ischemic heart disease. Therefore, up to now, most genetic studies (73% of the meta-analyses described in this article) have focused on genes involved in thrombosis and coagulation, whereas genes regulating other well-established risk factors for stroke (eg, hypertension, diabetes mellitus, and hyperlipidemia) have received relatively limited attention. Thus, it is possible that several additional genes with similar risks may exist but have yet to be evaluated.
For the genes with positive associations and large data sets, mechanistic studies have indicated the processes by which risk alleles might alter the expression or activity of the encoded protein and contribute to disease pathogenesis. The factor V Leiden mutation causes activated protein C resistance.115 Activated protein C limits clot formation by proteolytic inactivation of factors Va and VIIIa, and the single point mutation in the gene for factor V (1691G→A) studied predicts replacement of arginine by glutamine at position 506 in the activated protein C cleavage site. After activation, the mutated factor V is less efficiently degraded by activated protein C than normal factor V, resulting in increased thrombin generation and a hypercoagulable state, which may explain the increased risk of stroke in carriers of this mutation observed in this study.116 A sequence variation in the 3′-untranslated region of the prothrombin gene (G20210A), which alters messenger RNA stability, is associated with elevated prothrombin levels117,118 and thrombin formation117 and may similarly lead to a procoagulant state.
Plasma and intracellular levels of ACE have been shown to be partly determined by the presence of the ACE I/D polymorphism in healthy individuals and in patients with stroke.119,120 Individuals homozygous for the D allele have a 56% increase in ACE activity compared with I allele homozygotes.121 Angiotensin-converting enzyme converts angiotensin I to angiotensin II, which is known to be involved in vascular hypertrophy, vasoconstriction, and atherosclerotic processes.122 Also, ACE is responsible for degradation of bradykinin, a vasoactive peptide that has been suggested to stimulate vasodilator nitric oxide production.122
Long-term differences of 5 μmol/L in the serum concentration of homocysteine are associated with a 59% increase in the risk of stroke.112 The C677T mutation in the MTHFR gene, which encodes an amino acid substitution (A222V), renders the enzyme thermolabile and reduces metabolism of homocysteine.123 A recent meta-analysis124 in coronary heart disease showed that, on average, patients homozygous for the T allele had a 2.2-μmol/L higher serum level of homocysteine than patients with the C/C genotype and have a 1.16-fold increased risk of developing coronary heart disease. Findings from the present meta-analysis suggest that this variant is associated with a similar increase in the risk of stroke.
Taken together, the evidence from these meta-analyses, the molecular studies, and the effects of these genes on other cardiovascular phenotypes supports a role for variants in factor V, prothrombin, MTHFR, and ACE genes in susceptibility to stroke, but verification will be required from larger studies.
The PARs for these polymorphisms ranged from 1.30% for prothrombin G20210A to 4.54% for ACE I/D, values that are far lower than those reported for well-established acquired risk factors for ischemic stroke (eg, hypertension, smoking, and diabetes mellitus).4 This low level of PAR is not surprising, because the genetic contribution of any single gene toward a complex disease is unlikely to act in a simple mendelian fashion but rather with epistatic (gene-gene or gene-environmental interaction) effects. Nevertheless, given the high incidence of stroke, if these estimates are correct, they suggest that variants in 4 common genes may contribute to 9000 to 32 000 strokes in the United States each year.
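The range of 9000 to 32 000 strokes appears consistent with applying the reported PARs to the figure of roughly 700 000 incident strokes per year cited in the introduction. The sketch below reproduces that arithmetic. It assumes that this is how the range was obtained (the text does not spell out the calculation) and uses 700 000 as a round value for "more than 700 000".

```python
# Approximate annual incident strokes in the United States, as cited in the introduction.
ANNUAL_INCIDENT_STROKES_US = 700_000

# PARs (percentages) reported for the 4 most investigated positive candidates.
PAR_PERCENT = {
    "ACE I/D": 4.54,
    "MTHFR C677T": 3.31,
    "factor V Leiden": 2.16,
    "prothrombin G20210A": 1.30,
}


def attributable_strokes(par_percent: float,
                         annual_strokes: int = ANNUAL_INCIDENT_STROKES_US) -> float:
    """Number of strokes per year attributable to a variant with the given PAR."""
    return annual_strokes * par_percent / 100


if __name__ == "__main__":
    for variant, par in PAR_PERCENT.items():
        print(f"{variant}: about {attributable_strokes(par):,.0f} strokes per year")
```

On this reading, the lower bound corresponds to prothrombin G20210A and the upper bound to ACE I/D; the range describes individual variants, not the sum across all 4.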
Meta-analyses of 3 gene variants (apolipoprotein E ε4, ε3, ε2 [1805 cases]; factor XIII Val→Leu [2166 cases]; and glycoprotein IIIa Leu33Pro [1467 cases]) have so far failed to provide evidence of increased stroke susceptibility. The sample sizes of these meta-analyses allowed exclusion of ORs as low as 1.14, 1.20, and 1.35, respectively, with 80% power at P = .05. It seems unlikely, therefore, that carriers of the apolipoprotein E ε4 allele, which affects serum cholesterol, and which has been associated with a moderate increase in the risk of coronary heart disease,114 are at a substantially higher risk of stroke. Of the remaining 8 meta-analyses with relatively small data sets, 3 (PAI1 4G/5G [842 cases], HPA2 Thr→Met [564 cases], and GPIBA Kozak sequence [350 cases]) identified significant associations. However, additional larger studies are required to confirm or refute these findings.
The interpretation of any meta-analysis must be made within the context of its limitations, including study selection, publication bias, and variability in the methodological quality of the included studies. The present meta-analyses were restricted to studies published in the English language, but our overall computer search identified only a few non-English studies.125-129 Although publication bias cannot be excluded, this is an unlikely explanation for our findings. Many of the individual studies included in our meta-analysis were not statistically significant and were interpreted by their authors as negative studies. In addition, the Egger asymmetry test and the funnel plot showed no substantial evidence of publication bias in the 7 largest (N>3000) meta-analyses (ACE I/D; factor V Leiden; MTHFR C677T; prothrombin G20210A; apolipoprotein E ε4, ε3, ε2; factor XIII Val→Leu; and glycoprotein IIIa Leu33Pro). Moreover, rigorous selection criteria (neuroimaging and ethnic homogeneity) enriched the meta-analyses for studies with comparable selection of participants. Thus, lack of specificity, by the inclusion of individuals with hemorrhagic stroke or those with a clinical diagnosis of stroke but without neuroimaging evidence, was avoided.
Although it is not possible to exclude the future identification of 1 or more genes with a more substantial effect on stroke risk, our findings suggest that several genes, each with a small to moderate effect, are likely to act individually, together, or in combination with environmental determinants to cause stroke. One implication of these findings is that predictive genetic tests that use any single variant are unlikely individually to have much value. However, tests that combine genotyping for 1 or more risk alleles and that integrate the results with established risk prediction tools based on acquired risk factors (eg, the Framingham risk equation) may have greater utility.130 Another important consequence for future research is that very large case-control studies with several thousand participants will be required to detect new risk alleles with small to moderate effects of the size identified in our review, and to confirm or refute our findings. Because recruitment of data sets of this size may be difficult for a single medical center, a complementary approach has been suggested that involves the recruitment and genotyping of fewer patients and controls from many centers according to uniform criteria and submission of the data (whether nominally positive or negative) to a common Web-based repository for online, continuously updated and cumulative meta-analysis,131 which reduces the potential for publication bias.
In summary, our study confirms the existence of a genetic cause for common stroke but with no single common “stroke gene” exerting a major effect. Instead, several stroke susceptibility alleles are likely to act individually, together, or in combination with environmental determinants to cause stroke.
Correspondence: Pankaj Sharma, MD, PhD, Hammersmith Hospitals Acute Stroke Unit, Imperial College, Fulham Palace Road, London W6 8RF, England (email@example.com).
Accepted for Publication: March 23, 2004.
Author Contributions: Study concept and design: Casas, Hingorani, and Sharma. Acquisition of data: Casas, Hingorani, and Sharma. Analysis and interpretation of data: Casas, Hingorani, Bautista, and Sharma. Drafting of the manuscript: Casas, Hingorani, Bautista, and Sharma. Critical revision of the manuscript for important intellectual content: Hingorani, Bautista, and Sharma. Statistical analysis: Casas, Hingorani, Bautista, and Sharma. Obtained funding: Hingorani and Sharma. Administrative, technical, and material support: Hingorani and Sharma. Study supervision: Sharma.
Funding/Support: Dr Hingorani holds a Senior Fellowship from the British Heart Foundation, London.