Plot of dizygotic (DZ) vs monozygotic (MZ) tetrachoric correlation coefficients (circles) and the meta-analytic summary estimate (×) for 12 twin studies of schizophrenia. Tetrachoric correlations were estimated by Mx (a statistical data modeling software).36 The dashed line indicates the expected correlations if the familial resemblance is entirely due to common environmental effects; the solid line, the expected correlations if the familial resemblance is entirely due to additive genetic effects.
Graphical depiction of variance component estimates for 12 twin studies of schizophrenia and the meta-analytic summary. A, Additive genetic variance. B, Common environmental variance. The estimates and their 95% confidence regions were estimated by Mx (a statistical data modeling software),36 accounting for ascertainment probabilities. Horizontal bars indicate 95% confidence intervals; vertical bars, parameter estimates.
Sullivan PF, Kendler KS, Neale MC. Schizophrenia as a Complex Trait: Evidence From a Meta-analysis of Twin Studies. Arch Gen Psychiatry. 2003;60(12):1187-1192. doi:10.1001/archpsyc.60.12.1187
Copyright 2003 American Medical Association. All Rights Reserved. Applicable FARS/DFARS Restrictions Apply to Government Use.
There are many published twin studies of schizophrenia. Although these studies have been reviewed previously, to our knowledge, no review has provided quantitative summary estimates of the impact of genes and environment on liability to schizophrenia that also accounted for the different ascertainment strategies used.
Our objective was to calculate meta-analytic estimates of heritability in liability and of shared and individual-specific environmental effects from the pooled twin data.
We used a structured literature search to identify all published twin studies of schizophrenia, including MEDLINE, dissertation, and books-in-print searches.
Of the 14 identified studies, 12 met the minimal inclusion criteria of systematic ascertainment.
By using a multigroup twin model, we found evidence for substantial additive genetic effects—the point estimate of heritability in liability to schizophrenia was 81% (95% confidence interval, 73%-90%). Notably, there was consistent evidence across these studies for common or shared environmental influences on liability to schizophrenia—joint estimate, 11% (95% confidence interval, 3%-19%).
Despite evidence of heterogeneity across studies, these meta-analytic results from 12 published twin studies of schizophrenia are consistent with a view of schizophrenia as a complex trait that results from genetic and environmental etiological influences. These results are informative only in a broad sense: they provide no information about the specific identity of these etiological influences, but they do provide a component of a unifying empirical basis supporting the rationality of searches for underlying genetic and common environmental etiological factors.
Genetic epidemiological studies of schizophrenia have had a guiding influence on schizophrenia research. In particular, twin and adoption studies that suggested substantial genetic influences on the liability to schizophrenia1 helped create the empirical rationale for numerous ongoing searches to identify the predisposing genetic loci.2,3 Recently, several groups4-6 have presented evidence for genes that may be involved in the etiology of schizophrenia.
Although the primary twin studies of schizophrenia have been reviewed extensively7,8 and have been quantitatively summarized,9-12 we are aware of no meta-analysis of the primary studies that incorporated ascertainment corrections. Failure to correct for ascertainment (the mixtures of singly and doubly ascertained twin pairs found in most of these studies) could bias the results.
The goal of this report was to conduct a quantitative meta-analysis of the published twin studies of schizophrenia. A key advantage of synthetic meta-analytic vs traditional literature reviews is the potential to yield less biased quantitative summaries of the findings of many primary studies.13-16 We were also interested in an additional benefit of meta-analysis: its capacity to detect subtle effects for which individual studies may possess insufficient statistical power. Twin studies of uncommon discrete traits, like those for schizophrenia, possess limited statistical power in many circumstances (unless the sample or effect sizes are large).17 Only when the results of several studies are analyzed jointly in a meta-analysis can subtle effects be assessed with a reasonable degree of confidence.
To identify all relevant primary studies, we performed computerized PubMed searches for an inclusive list of descriptors and searched the reference lists of prior reviews of schizophrenia to identify any reports not retrieved in the PubMed search. We identified 14 published studies18-31 of schizophrenia from independent samples in 6 European countries, Japan, and the United States (Table 1). If there were multiple publications from the same sample, only the most recent was included.
The next step in a meta-analysis is usually the application of a set of inclusion criteria. For twin studies, these include systematic recruitment, blinding to co-twin diagnostic status and zygosity, and use of systematic data collection and diagnostic procedures. Many of these studies were conducted before these criteria were common in psychiatric research (7 of the 14 studies18-24 were published before 1970). Only 4 studies27-29,31 met all the inclusion criteria, and all were published after 1990. We considered studies that relied solely on hospitalization records as not being blind because the clinicians making the diagnosis may have been influenced by the co-twin history.
We chose to relax our a priori inclusion criteria for 4 reasons. First, the exclusion of most studies is not consistent with our desire to obtain meta-analytic estimates of what might be subtle statistical effects. Second, we retained the capacity to analyze the methodologically superior vs inferior studies. Third, we wanted to avoid a bias of modernity by including older studies. Many of the older studies were performed by prominent and highly respected researchers who conducted the studies in rigorous accordance with the accepted research practices of their era, and contain information pertinent to our research question. Finally, we were influenced by tradition: most prior reviews7,11 of twin studies of schizophrenia have included most of these studies. Therefore, our final inclusion criterion was that these studies ascertain subjects in a systematic manner, which led to the exclusion of 2 studies and the inclusion of 12 studies, as shown in Table 1. Nearly all of these primary studies were believed to support the strong role of genetic factors in the etiology of schizophrenia.
A model for the patterns of familial resemblance was used to predict the observed concordant and discordant pair frequencies. The model included parameters for additive genetic (a²), common environmental (c²), and individual-specific environmental (e²) components of variance.32 Additive genetic influences are shared completely by monozygotic (MZ) twins and correlate 0.5 between dizygotic (DZ) twins. Common environmental influences are shared completely by the members of a twin pair regardless of zygosity, and account for DZ correlations that are greater than half of the MZ correlation. Individual-specific environmental components contribute separately to each twin and, therefore, account for less than a perfect resemblance between MZ twins. These parameters were estimated by maximum likelihood.
The likelihood of the observed pairs of twins was computed using a bivariate normal threshold model of liability.33 This model specifies that individuals in the population have a frequency distribution of liability to schizophrenia that is described by a normal distribution. On this liability continuum, there exists a threshold t such that individuals above the threshold have schizophrenia and those below do not. The distribution of twin pair liabilities is bivariate normal, with unit variances and correlations as predicted by the following statistical models: rMZ = a² + c² and rDZ = 0.5a² + c². Although it is usual to fit submodels that eliminate specific sources of variance, this practice is inappropriate in a meta-analysis, and a common approach to selecting parsimonious models (via the Akaike information criterion) can select incorrect models.34
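The threshold model can be illustrated numerically. The following is a minimal sketch, not the Mx script used in the study: it assumes a population prevalence and a twin correlation are given, derives the threshold t from the prevalence, and integrates the standard bivariate normal to obtain the 4 cell probabilities of the twin 1 × twin 2 table. The function names and the Simpson's rule integration are illustrative choices made here.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal liability distribution


def threshold_from_prevalence(prevalence: float) -> float:
    """Threshold t such that P(liability > t) equals the prevalence."""
    return STD_NORMAL.inv_cdf(1.0 - prevalence)


def prob_both_affected(t: float, r: float, steps: int = 4000) -> float:
    """P(x1 > t, x2 > t) for a standard bivariate normal with correlation r.

    Integrates pdf(x1) * P(x2 > t | x1) over x1 in [t, t + 10] with
    Simpson's rule; the tail beyond t + 10 is negligible. Requires |r| < 1
    and an even number of steps.
    """
    upper = t + 10.0
    h = (upper - t) / steps
    sd_cond = math.sqrt(1.0 - r * r)  # SD of x2 given x1

    def integrand(x: float) -> float:
        return STD_NORMAL.pdf(x) * (1.0 - STD_NORMAL.cdf((t - r * x) / sd_cond))

    total = integrand(t) + integrand(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(t + k * h)
    return total * h / 3.0


def cell_probabilities(prevalence: float, r: float) -> dict:
    """Cell probabilities of the 2 x 2 table (1 = affected, 0 = unaffected)."""
    t = threshold_from_prevalence(prevalence)
    p11 = prob_both_affected(t, r)
    p10 = prevalence - p11          # twin 1 affected, twin 2 unaffected
    p01 = p10                       # same threshold for both twins
    p00 = 1.0 - p10 - p01 - p11
    return {"p00": p00, "p10": p10, "p01": p01, "p11": p11}


if __name__ == "__main__":
    # Hypothetical inputs: 1% prevalence, illustrative MZ and DZ correlations.
    for label, r in (("MZ", 0.92), ("DZ", 0.52)):
        print(label, cell_probabilities(0.01, r))
```

With the same prevalence, a higher correlation moves probability from the discordant cells into the concordant affected cell, which is how the model distinguishes MZ from DZ resemblance.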
Four different types of subject ascertainment were used in the 12 twin studies in Table 1. Each type of ascertainment is associated with different types of data. First, random population ascertainment yields a full 2 × 2 contingency table (diagnosis of twin 1 × diagnosis of twin 2). In this instance, all 4 types of twin pairs are observed directly (ie, concordant unaffected twin pairs, the 2 discordant cells, and concordant affected twin pairs), as in 2 twin studies26,30 of schizophrenia. Second, complete ascertainment is when concordant unaffected twin pairs are the only pairs not observed, so that concordant and discordant affected twins are ascertained, as in 4 twin studies23-25,27 of schizophrenia. Third, 2 twin studies29,35 of schizophrenia used single ascertainment, in which pairs with 2 probands are not observed and only 1 of the 2 possible discordant cells is observed together with concordant affected pairs. Finally, incomplete ascertainment is intermediate between complete and single ascertainment. The key quantity is π, the probability of being ascertained given that one is affected. The π value is 1 for complete ascertainment, approaches 0 for single ascertainment, and lies strictly between 0 and 1 (0 < π < 1) for incomplete ascertainment. The π value can be estimated as 2D/(2D + S), where D is the number of doubly ascertained pairs and S is the number of singly ascertained pairs. Four twin studies20,21,28,31 of schizophrenia had incomplete ascertainment.
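The estimator π = 2D/(2D + S) is simple enough to state directly in code. The sketch below is illustrative only; the function name and the pair counts in the example are hypothetical and do not come from any of the 12 studies.

```python
def ascertainment_probability(double_pairs: int, single_pairs: int) -> float:
    """Estimate pi as 2D / (2D + S).

    double_pairs (D): pairs in which both twins were ascertained as probands.
    single_pairs (S): pairs in which only one twin was ascertained as a proband.
    """
    if double_pairs < 0 or single_pairs < 0:
        raise ValueError("pair counts must be non-negative")
    total = 2 * double_pairs + single_pairs
    if total == 0:
        raise ValueError("at least one ascertained pair is required")
    return 2 * double_pairs / total


if __name__ == "__main__":
    # Hypothetical counts: 10 doubly and 20 singly ascertained pairs.
    print(ascertainment_probability(10, 20))
```

The two boundary cases match the text: with no singly ascertained pairs the estimate is 1 (complete ascertainment), and with no doubly ascertained pairs it is 0 (the single ascertainment limit).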
Each type of ascertainment requires a different ascertainment correction. For random population ascertainment, no correction for ascertainment is required because pairs are thought to be representative of the population and the likelihood is as follows:
where ϕ is the bivariate normal probability density function; x1 and x2 are the liabilities to schizophrenia for twin 1 and twin 2, respectively; and ai and bi are a0 = −∞, a1 = t, b0 = t, and b1 = ∞ (where the subscript 0 denotes unaffected; and 1, affected).
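Written out under these definitions, the likelihood of a pair in which twin 1 has status i and twin 2 has status j takes the usual threshold-model form. The expression below is a reconstruction from the stated definitions, not a transcription of the printed equation, and the indices i and j (each 0 or 1) are introduced here for illustration.

```latex
L_{ij} = \int_{a_i}^{b_i} \int_{a_j}^{b_j} \phi(x_1, x_2) \, dx_2 \, dx_1 \qquad (1)
```

For example, a concordant affected pair (i = j = 1) integrates ϕ over the region where both liabilities exceed t, and a concordant unaffected pair (i = j = 0) integrates over the region where both fall below t.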
Under nonrandom ascertainment, the likelihood of pairs may be written as equation 1 multiplied by an ascertainment correction. Under complete ascertainment (π = 1), the ascertainment correction is as follows:
Under single ascertainment (π→0), the correction is as follows:
Under incomplete ascertainment (0<π<1), the correction for single-proband pairs is as follows:
For double-proband pairs, the equation is as follows:
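The way π interpolates between complete and single ascertainment can be sketched numerically. The code below is not the article's set of corrections. It is derived from one explicit assumption, namely that each affected twin becomes a proband independently with probability π, and it returns the probability of each observable pair type conditional on the pair being ascertained. The function name and the cell probabilities in the example are hypothetical.

```python
def ascertained_pair_probabilities(p10: float, p01: float, p11: float, pi: float) -> dict:
    """Probabilities of observable pair types, given that a pair is ascertained.

    p10, p01: population probabilities of the two discordant cells.
    p11: population probability of a concordant affected pair.
    pi: probability that an affected twin is ascertained (0 < pi <= 1).

    Assumption (made here): affected twins are ascertained independently,
    so a concordant pair has both twins as probands with probability pi**2
    and exactly one proband with probability 2 * pi * (1 - pi).
    """
    if not 0.0 < pi <= 1.0:
        raise ValueError("pi must lie in (0, 1]")
    p_discordant = pi * (p10 + p01)
    p_single = 2.0 * pi * (1.0 - pi) * p11
    p_double = pi ** 2 * p11
    p_ascertained = p_discordant + p_single + p_double
    return {
        "discordant": p_discordant / p_ascertained,
        "concordant_single_proband": p_single / p_ascertained,
        "concordant_double_proband": p_double / p_ascertained,
    }


if __name__ == "__main__":
    # Hypothetical cell probabilities for a trait with 1% prevalence.
    cells = dict(p10=0.006, p01=0.006, p11=0.004)
    for pi in (1.0, 0.5, 1e-6):
        print(pi, ascertained_pair_probabilities(pi=pi, **cells))
```

At π = 1 the denominator reduces to 1 minus the probability of a concordant unaffected pair, and as π approaches 0 it becomes proportional to the prevalence, which are the two limiting cases named in the text.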
A script that uses the appropriate correction for ascertainment and the population threshold estimate for simultaneous analysis of the 12 studies is available on the Mx Examples Web site (http://www.vcu.edu/mx/examples.html).36 To test for heterogeneity, the parameters of the model (a² and c²) were set equal across the studies, and the fit of this model was compared via the likelihood-ratio test to the fit of the model in which the a², c², and e² parameters were allowed to differ between studies. Because the variance of the latent distribution cannot be estimated separately from the thresholds, we used the common practice of standardizing to unit variance by imposing the following nonlinear constraint: a² + c² + e² = 1. Effectively, there are only 2 free parameters per sample (a² and c²), so the comparison tests for heterogeneity have 2 df for each sample beyond the first.
Figure 1 depicts the MZ and DZ twin correlations from 12 twin studies of schizophrenia and the meta-analytic summary estimates. Seven of the 12 studies and the meta-analytic summary estimate (rMZ = 0.92; 95% confidence interval [CI], 0.91-0.94; and rDZ = 0.52; 95% CI, 0.48-0.56) lie between lines depicting extreme cases in which a trait is entirely due to additive genetic effects (100% a²) or entirely due to common environmental effects (100% c²), suggesting the presence of both additive genetic and common environmental effects in the etiology of schizophrenia.
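As a rough check on these summary correlations, the model equations for rMZ and rDZ given in the "Methods" section can be solved directly for the variance components. This is a back-of-the-envelope sketch under the unit-variance constraint, not the maximum likelihood analysis with ascertainment correction that produced the reported estimates; the function name is illustrative.

```python
def ace_from_correlations(r_mz: float, r_dz: float) -> tuple:
    """Solve r_mz = a2 + c2 and r_dz = 0.5 * a2 + c2 with a2 + c2 + e2 = 1."""
    a2 = 2.0 * (r_mz - r_dz)   # additive genetic variance
    c2 = 2.0 * r_dz - r_mz     # common environmental variance
    e2 = 1.0 - r_mz            # individual-specific environmental variance
    return a2, c2, e2


if __name__ == "__main__":
    a2, c2, e2 = ace_from_correlations(0.92, 0.52)
    print(f"a2={a2:.2f} c2={c2:.2f} e2={e2:.2f}")
```

The summary correlations give values of about 0.80, 0.12, and 0.08, close to the model-based estimates of 81% for additive genetic effects and 11% for common environmental effects.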
Figure 2 presents the variance component estimates from Mx for 12 twin studies of schizophrenia and the meta-analytic summary estimates. Figure 2A depicts additive genetic effects, and Figure 2B depicts common environmental effects.
For additive genetic effects, the point estimates from all but 2 studies23,24 are in excess of 50%. The 95% CIs for the estimates are often large because of the relatively small sample sizes of the individual studies. The meta-analytic summary for additive genetic variance in liability to schizophrenia was estimated at 81% (95% CI, 73%-90%). The 95% CI for the joint estimate overlapped with the 95% CIs for 10 of the 12 studies.19-21,23,26-31
For common environmental effects, the point estimates for 7 of the 12 studies19-21,23-25,29 were nonzero, with 6 greater than 15%. The 95% CIs tended to be large owing to the limited statistical power to detect this effect in the individual studies. The meta-analytic summary for common environmental effects for the liability to schizophrenia was estimated at 11% and, notably, the 95% CI did not include 0 (95% CI, 3%-19%). The 95% CI for this estimate overlapped with the CIs of all but 2 of the 12 studies.23,24
An inspection of Figure 1 and Figure 2 suggests that there is heterogeneity across the 12 individual twin studies. A formal test for homogeneity was strongly rejected (χ²₂₀ = 113.9, P<.001). Critically, when we compared the methodologically superior studies27-29,31 with the methodologically inferior studies,19-21,23-26,30 we found similar point estimates for additive genetic effects (77% vs 78%) and common environmental effects (17% vs 14%).
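The size of this test statistic can be checked by hand. Reading the reported value as a χ² of 113.9 on 20 degrees of freedom, the sketch below evaluates the upper-tail probability with the closed-form series that holds for even degrees of freedom. The function name is illustrative, and the series is a standard identity rather than anything specific to Mx.

```python
import math


def chi2_upper_tail_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate for even df.

    Uses P(X > x) = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)**k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0    # k = 0 term
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


if __name__ == "__main__":
    p = chi2_upper_tail_even_df(113.9, 20)
    print(f"P = {p:.3g}")
```

The resulting probability is many orders of magnitude below .001, in line with the strong rejection of homogeneity reported above.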
In addition, 6 of the 12 twin studies23-26,30,31 used unique population prevalence estimates for schizophrenia. Because prevalence can influence the variance component estimates, we performed the analyses again, with all studies forced to have population prevalences for schizophrenia of 0.5%, 0.75%, and 1%. The pattern of results was similar to that described previously, with high estimates of additive genetic effects and estimates of common environmental effects whose 95% CIs did not include 0. Estimates of common environmental effects were larger with decreasing prevalence.
One principal result of this quantitative meta-analysis of 12 published twin studies of schizophrenia was expected, and the other was quite surprising. Consistent with prior summaries of the twin literature on schizophrenia, the meta-analytic summary estimate of its heritability is high (point estimate, 81%; 95% CI, 73%-90%), and this result may provide a useful summary of a diverse literature. However, we also determined that there are small but significant common environmental effects on liability to schizophrenia (point estimate, 11%; 95% CI, 3%-19%). This latter estimate is unexpected and of considerable interest.
Before discussing these findings further, it is imperative that we consider 2 key limitations. First, the methodological quality of the published twin studies of schizophrenia was not uniformly high compared with, for example, that for major depression37 or various smoking behaviors.38 Most studies in this work did not include several critical features (blinding and a standardized diagnostic approach) that are generally viewed as central to the interpretability of twin studies of medical disorders. In fairness, most of these twin studies of schizophrenia were conducted before the importance of these methodological features was widely recognized and viewed as essential. In fact, many of the earlier twin studies represented monumental and even heroic efforts by individual psychiatrists who devoted years of personal effort despite limited resources to accumulate the samples quantitatively summarized herein. For the reasons described in the "Primary Studies" subsection of the "Methods" section, we chose to include 12 of 14 studies that met post hoc modifications of our a priori criteria.
Second, the 12 studies included in this meta-analysis of schizophrenia were statistically heterogeneous. The presence of heterogeneity of the variance component estimates raises 2 questions: What is the source of heterogeneity? Does it limit the validity of the meta-analytic estimates? It is likely that differences in methods across studies are a source of heterogeneity given that the 4 studies27-29,31 that met our a priori inclusion criteria had greater evidence for the homogeneity of the components of variance. This is unsurprising, and is consistent with the increased reliability generally found with more rigorous methodological approaches to psychiatric diagnosis. In addition, heterogeneity could have resulted from other sample-specific characteristics, like the male-female or MZ/DZ ratios. However, it is also possible that there exists true variation in the etiology of schizophrenia during the decades spanned by these studies or across the different countries and ethnic ancestries of the individuals in these studies. There are reports of associations of schizophrenia with potential etiological factors that would be expected to vary across populations and over time, such as discrete periods of famine,39 season of birth,40 or prenatal exposure to influenza or other viral infections.41-43
The presence of heterogeneity across studies does not necessarily invalidate the meta-analytic approach we used to summarize these studies. Rather, there is a set of advantages and disadvantages. The critical advantage of including heterogeneous studies is the capacity to summarize the variance component estimates when there is true etiological variation. Schizophrenia is unlikely to have a single etiology; instead, it is probably an end result of heterogeneous processes that produce a similar clinical portrait. The inclusion of heterogeneous studies is consistent with this belief and offers the potential of a more accurate and less sample-specific summary of the fundamental nature of this complex illness. On the other hand, the critical disadvantage arises if heterogeneity is an index of a shared methodological flaw (eg, bias in recruitment or diagnosis). In this instance, heterogeneity reflects the presence of individual studies that are flawed and could result in imprecise and inaccurate estimates.
On balance, we believe that it is more advantageous than disadvantageous to include heterogeneous studies. This contention is supported by the similarity of parameter estimates from the methodologically superior vs inferior studies. In fact, the point estimate for common environmental effects was higher in the superior studies (17%) than in the inferior studies (14%).
The most notable finding from this meta-analysis was that variance in liability to schizophrenia was estimated to have a nonzero contribution of environmental influences shared by members of a twin pair. This finding is ironic because it is unusual to find a behavioral trait or disorder with significant common environmental influences44,45 and schizophrenia is often described as one of the more "genetic" psychiatric disorders. The magnitude of the finding suggests that these influences, while significant, have a modest impact on liability to schizophrenia.
When considering this surprising finding further, we discovered that significant common environmental effects for schizophrenia were reported previously by Rao et al (20%)9 and McGue et al (19%).10 When these 2 articles were published in the early 1980s, there was a sharp division within psychiatry as to whether schizophrenia resulted from biological/genetic factors or environmental factors, such as adverse maternal-child relationships. These perspectives were often framed as mutually exclusive (nature or nurture). The stronger genetic component to schizophrenia was clearly the more influential result from these articles. Our rediscovery of subtle, but nontrivial, common environmental effects for schizophrenia is likely to be interpreted differently than in the 1980s.
The traditional phrases, common environment and shared environment, are misnomers in that they generally evoke psychiatric risk factors like parental rearing behavior and traumatic life events. In the context of the assumptions and definitions of twin analyses, however, common environment refers to any process that makes members of a twin pair similar regardless of zygosity. These processes include the classic environmental factors previously noted, but also encompass profoundly biological processes such as exposure to infectious agents, macronutrient or micronutrient dietary characteristics, and exposure to environmental toxins, teratogens, and other intrauterine factors. In addition, it is possible that some portions of the significant common environmental effects are artifactual (eg, due to assortative mating or a substantially biased zygosity assignment).
Moreover, the environments of members of twin pairs tend to diverge over time. The environments of twins are most similar in utero and in the immediate postnatal period, with increasing divergence over infancy, childhood, adolescence, and adulthood. Therefore, the presence of significant common environmental effects on liability to schizophrenia suggests that these effects would be most likely to occur early in life. This prediction is consistent with a neurodevelopmental etiology of schizophrenia46-48 and with the literature on early risk factors for schizophrenia. For example, several reviews,49,50 including a recent meta-analysis51 of large, prospective, and population-based studies, found consistent evidence to support the status of pregnancy complications, abnormal fetal development, and delivery complications as risk factors for schizophrenia. An additional report52 strongly suggests the importance of maternal-fetal Rh blood group incompatibility as a specific risk factor for schizophrenia.
In conclusion, these meta-analytic results from 12 published twin studies of schizophrenia support a view of schizophrenia as a complex trait that results from both genetic and shared environmental etiological influences. These results are informative only in a broad sense: they provide no information about the specific number or identity of these etiological influences, but they do provide a component of a unifying empirical basis supporting the rationality of searches for underlying genetic and common environmental etiological factors.
Corresponding author: Patrick F. Sullivan, MD, FRANZCP, Department of Genetics, University of North Carolina at Chapel Hill, Campus Box 7264, Chapel Hill, NC 27599.
Submitted for publication December 2, 2002; final revision received April 2, 2003; accepted April 11, 2003.
This study was supported by grant MH-01458 from the National Institute of Mental Health, Bethesda, Md (Dr Neale).
We thank Irving I. Gottesman, PhD, for critical comments on earlier drafts of this article.