Figure 1. Magnetic resonance imaging segmentation, showing an acquired T2-weighted image (A), a proton density image (B), and the segmented image (C), where gray matter is depicted in white, white matter in light gray, and cerebrospinal fluid in black.
Gur RE, Turetsky BI, Bilker WB, Gur RC. Reduced Gray Matter Volume in Schizophrenia. Arch Gen Psychiatry. 1999;56(10):905–911. doi:10.1001/archpsyc.56.10.905
There is emerging evidence that gray matter (GM) is reduced in patients with schizophrenia. Information on the extent of global differences in the 3 principal supertentorial compartments is necessary for interpretation of regional effects. The relation of GM reduction to clinical status and neurocognition also requires examination.
Magnetic resonance imaging, neurocognitive measures, and clinical assessment of symptoms and functioning were obtained for 130 patients (51 neuroleptic naive, 79 previously treated) and 130 healthy controls (75 men, 55 women in each group).
Overall GM volume was reduced in patients compared with controls. The reduction was present in men (6%) and women (2%) and was already evident at the first presentation of neuroleptic-naive patients. It survived correction for age and total intracranial volume. Compartmental volumes did not correlate with the severity of positive (r, −0.08 to 0.23) or negative (r, −0.01 to −0.07) symptoms, but GM volume was associated with better premorbid functioning in women (r, 0.36-0.51). Small but significant correlations (r, 0.19-0.44) were observed between GM volume and performance in 6 neurocognitive domains. These correlations varied by diagnosis, with most being higher in patients, and were moderated by sex.
Gray matter volume reduction in schizophrenia is already evident in men and women at first presentation. While this reduction is not correlated with symptom severity, it is associated with cognitive performance. Since GM development accelerates in the later part of gestation, while white matter growth is primarily postnatal, the results may support the hypothesis that neurodevelopmental processes relate to GM deficit.
Neuroanatomical studies in schizophrenia indicate volume reduction and increased cerebrospinal fluid (CSF).1,2 Magnetic resonance imaging (MRI) enables segmentation of parenchyma into gray matter (GM) and white matter (WM) and permits evaluation of all major intracranial compartments related to cytoarchitecture and connectivity, including GM (the somatodendritic compartment of neurons [cortical and deep]), WM (the axonal compartment of myelinated connecting fibers), and CSF.3-6 Interpretation of findings for specific structures depends on neuroanatomical measures for the entire supertentorium; these can contribute to understanding aberrant developmental processes in schizophrenia.
Diffuse GM reduction was noted in a study of 22 chronically ill male patients compared with 20 healthy men.7 This effect was replicated in a sample of 22 men with chronic schizophrenia and 27 healthy men.8 Gray matter reduction, evident in 57 men with schizophrenia from a state hospital, did not correlate with age of disease onset, which was interpreted as suggesting that the abnormalities had been present before psychotic symptoms emerged.9 Reduction was also found in 19 chronically ill women from that facility.10 The suggestion that GM deficits exist at clinical presentation was confirmed by reports from studies that examined patients with first-episode schizophrenia. A study comparing 22 patients with first-episode schizophrenia with 51 healthy controls found that the patients had reduced cortical GM tissue.11 Similarly, reduction in GM volume was reported in 77 patients with first-episode nonaffective psychosis compared with 61 healthy controls.12
Gray matter volume has been examined in relation to symptom severity and neurocognitive deficits. Severity of symptoms on the Brief Psychiatric Rating Scale (BPRS) Withdrawal-Retardation factor was correlated with reduced GM,7 but this was not replicated in subsequent studies of patients with chronic8-10 and first-episode11 illness. Regarding neurocognitive measures, GM was correlated with better performance on global and specific measures of neuropsychological functioning.13,14 Gray matter was also correlated with higher IQ estimates in patients,12 but not in healthy controls. Thus, there are suggestions that GM deficit in schizophrenia may relate to neurocognitive deficits and not to clinical variables associated with disease duration and severity.
Few integrative studies have applied neuroanatomical, clinical, and neurocognitive measures across men and women with first-episode and chronic schizophrenia. We have reported data examining volumes of whole brain, CSF, and specific regions.2,15-20 However, we have not examined GM and WM compartments in schizophrenia. The goal of this study was to investigate the relations between the 3 cranial compartments and clinical and neurocognitive measures. We hypothesized that GM deficit is evident in schizophrenia for both men and women with first-episode and previously treated illness, is related to neurocognitive measures, and is associated with poorer premorbid functioning, more severe symptoms, and a lower quality of life.
Participants were 130 patients with schizophrenia and 130 healthy controls (75 men and 55 women in each group) from the Schizophrenia Center of the University of Pennsylvania, Philadelphia. Controls were selected from a larger sample to match patients sociodemographically. All were right-handed. Patients had a DSM-IV21 diagnosis of schizophrenia established by medical, neurological, and psychiatric evaluations22 using the Structured Clinical Interview for DSM-IV–Patient Version.23 Those with schizophreniform disorder at study entry met the criteria for schizophrenia at the 6-month follow-up. The healthy controls, recruited by newspaper advertisement, underwent medical, neurological, and psychiatric evaluations24 using the Structured Clinical Interview for DSM-IV–Nonpatient Version.25 Subjects had no history of a disorder or event that might affect brain function, including hypertension (blood pressure >140/90 mm Hg), cardiac disease, diabetes mellitus, endocrine disorders, renal disease, chronic obstructive pulmonary disease, cerebrovascular disease, head trauma with loss of consciousness, seizure disorder, migraines, or any other neurological condition. The groups did not differ (mean±SD) in age (patients, 29.2±7.5 years; controls, 27.7±6.0 years) or parental education (patients, 12.9±2.5 years; controls, 12.4±2.8 years), but as expected, patients had a lower educational level than controls (12.9±2.3 vs 14.9±2.0 years [t258=7.34, P<.001]). There were 51 neuroleptic-naive and 79 previously treated patients. The neuroleptic-naive patients were younger (men: 23.4±5.2 years; women: 27.2±7.2 years) than the previously treated patients (men: 31.5±6.6 years; women: 32.2±7.7 years [t73=5.64, P<.001 for men vs t53=2.40, P=.02 for women]), but did not differ in education or parental education. The previously treated patients received typical neuroleptics only (n=48), typical followed by atypical neuroleptics (n=22), or atypical neuroleptics only (n=9). 
Mean duration of treatment for the typical neuroleptics was 1094.7±1431.3 days (range, 2-5739 days), and the mean dosage was 546.4±501.8 chlorpromazine-equivalent units per day (range, 25-1916 chlorpromazine-equivalent units per day). Atypical neuroleptics included clozapine (n=15) and risperidone (n=18). Mean duration of treatment for the atypical neuroleptics was 714.0±712.7 days (range, 2-2765 days), and the mean dosage was 463.3±336.4 chlorpromazine-equivalent units per day (range, 4-1295.1 chlorpromazine-equivalent units per day). For patients treated only with atypical neuroleptics, the mean duration of treatment was 435.28±516.1 days (range, 2-1232 days), and the mean dosage was 223.6±117.6 chlorpromazine-equivalent units per day (range, 100-428.5 chlorpromazine-equivalent units per day). Age of onset of psychotic symptoms in the context of functional decline was 22.4±5.9 years and duration of illness was 6.8±6.2 years. Clinical assessments, neurocognitive testing, and MRI studies were conducted within a week after enrollment. All neuroleptic-naive patients were studied before therapeutic intervention. Of the previously treated patients, 51 presented to the center and were studied while off medication, and 28 were studied while on medication. After complete description of the study, written informed consent was obtained prior to participation.
Assessments of symptoms and level of function were performed by trained, reliable investigators (intraclass correlation coefficient >0.85) with established procedures.22 Symptom ratings included the Scale for Assessment of Negative Symptoms (SANS),26 Scale for Assessment of Positive Symptoms (SAPS),27 and BPRS.28 Assessment of premorbid adjustment29 included measures of academic and social functioning during childhood, early and late adolescence, and adulthood. The Quality of Life Scale30 assessed current functioning in social and occupational domains and activities of daily living.
The neuropsychological assessment battery examines the following neurocognitive domains: abstraction-flexibility, attention, verbal memory, spatial memory, verbal abilities, and spatial abilities. Specific tests and procedures were detailed previously.31,32
Magnetic resonance imaging scans were acquired on a GE Signa 1.5-Tesla scanner (General Electric, Milwaukee, Wis). Transaxial images were obtained in planes parallel to the orbitomeatal line, with in-plane resolution of 0.859×0.859 mm, 5-mm slice thickness, and no interslice gaps. A multiecho acquisition sequence was used (repetition time, 3000 milliseconds; echo time, 30 and 80 milliseconds) to provide both proton density and T2-weighted images that can be reliably segmented in a way that is robust against field inhomogeneities (shading artifacts). Images were resliced along the anterior-to-posterior commissural axis to standardize for head tilt and imported electronically into the segmentation software package.
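The acquisition geometry above fixes the size of a single voxel, which is what turns a count of labeled voxels into a compartment volume. The following is a minimal sketch of that conversion, assuming square in-plane pixels and contiguous slices as stated in the text; the function names and the example voxel count are hypothetical and are not taken from the study's software.

```python
# Acquisition geometry quoted in the text (millimetres).
IN_PLANE_MM = 0.859
SLICE_THICKNESS_MM = 5.0


def voxel_volume_mm3(in_plane_mm=IN_PLANE_MM, slice_mm=SLICE_THICKNESS_MM):
    """Volume of one voxel, assuming square pixels and no interslice gap."""
    return in_plane_mm * in_plane_mm * slice_mm


def compartment_volume_ml(voxel_count, voxel_mm3=None):
    """Convert the number of voxels carrying one tissue label to millilitres."""
    if voxel_mm3 is None:
        voxel_mm3 = voxel_volume_mm3()
    return voxel_count * voxel_mm3 / 1000.0  # 1 mL = 1000 mm^3


if __name__ == "__main__":
    # Hypothetical count of voxels labeled as one compartment.
    print(round(voxel_volume_mm3(), 3), "mm^3 per voxel")
    print(round(compartment_volume_ml(200_000), 1), "mL")
```

Each voxel is roughly 3.7 mm³, most of it contributed by the 5-mm slice thickness, which is why partial voluming in the axial direction matters for small structures.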
Magnetic resonance imaging scans were evaluated neuroradiologically for technical quality and gross abnormalities; none were found. Only supertentorial tissue was included in the analyses; thus, the cerebellum and brainstem nuclei were excluded. This was done using established guidelines.2
The brain volume was extracted by automatically stripping scalp, skull, and meninges using optimal thresholding and morphological operations on the image intensity and chamfer distance (an easily computed approximation of the distance from any given point to the head surface).33,34 Some nonbrain regions, such as bone marrow and the eyeballs, could not be reliably stripped by this algorithm and were removed manually in an interactive program. The stripped MRI image was segmented into GM, WM, and CSF using an adaptive Bayesian algorithm35,36 that models the image as a collection of tissue compartments with slowly varying mean intensity plus white Gaussian noise. The mean intensity within each compartment was estimated by least-squares fitting of a cubic B spline.37 This helps overcome "shading" effects and reduces partial voluming that can bias against small, isolated regions (eg, sulcal CSF). Spatial interactions among adjacent voxel labels were modeled as a Markov random field with a 3-dimensional second-order neighborhood system, in which different potentials are used for the in-plane and axial directions to account for anisotropic voxel dimensions. The algorithm performs an initial segmentation using K-means clustering on image intensity. The segmentation is then iteratively improved by alternating 2 steps: estimating the (spatially varying) mean intensity of each compartment by fitting a B spline over the entire image, and resegmenting the image into compartments by maximizing the a posteriori probability using the iterated conditional modes algorithm.38 The number of spline control points is gradually increased. Combining spline representation and adaptation makes the segmentation more accurate and robust (Figure 1).
It yields volumes that are 4.6% higher for estimated brain volume and 10.9% higher for CSF but correlates highly (r, 0.98 and 0.92 for brain and CSF volumes, respectively) with the method used in our previous segmentation studies on subsamples of the present group of participants.2,15-19
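The initialization step of the segmentation, K-means clustering on image intensity, can be sketched as follows. This is a simplified illustration under stated assumptions, not the adaptive Bayesian algorithm itself: it clusters a single scalar intensity per voxel (the study used dual-echo images), uses one constant mean per class rather than a spline-fitted, spatially varying mean, and omits the Markov random field and iterated conditional modes refinement entirely. The function name `kmeans_1d` and the toy intensities are hypothetical.

```python
def kmeans_1d(intensities, k=3, max_iter=100):
    """Cluster scalar intensities into k classes; returns (labels, means)."""
    values = sorted(intensities)
    # Initialise the class means at evenly spaced quantiles of the data.
    means = [values[int((i + 0.5) * len(values) / k)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each voxel takes the label of the nearest class mean.
        new_labels = [
            min(range(k), key=lambda c: abs(x - means[c])) for x in intensities
        ]
        if new_labels == labels:
            break  # labels stopped changing, so the means are stable too
        labels = new_labels
        # Update step: recompute each class mean from its current members.
        for c in range(k):
            members = [x for x, lab in zip(intensities, labels) if lab == c]
            if members:
                means[c] = sum(members) / len(members)
    return labels, means


if __name__ == "__main__":
    # Toy intensities forming 3 well-separated groups (hypothetical values).
    toy = [10, 11, 12, 50, 51, 52, 90, 91, 92]
    labels, means = kmeans_1d(toy)
    print(labels, means)
```

In the method described above, labels of this kind serve only as a starting point; the subsequent alternation between mean estimation and relabeling is what adapts the segmentation to shading and to neighboring voxel labels.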
Because age is an important factor in brain volume measures, we first examined the effects of age in each group. Despite the restricted age range of this sample (18-45 years), some correlations of volume with age, while small, reached significance. We therefore repeated all the analyses covarying for age, and this did not change the significance of our findings. Since our samples did not differ in age, we will report results for the raw data, uncorrected for age effects, to facilitate comparison with other published values.
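As an illustration of what adjusting a volume for age involves, the sketch below removes a fitted linear age trend from a set of volumes by ordinary least squares and re-centers the result at the mean age. It is a simplified stand-in, not the covariance analysis actually run in the study; the function name `age_adjusted` and the example numbers are hypothetical.

```python
def age_adjusted(volumes, ages):
    """Return volumes with the fitted linear age trend removed.

    Each value is shifted to what the fitted line predicts it would be
    at the sample's mean age, so the overall mean volume is unchanged.
    """
    n = len(ages)
    mean_age = sum(ages) / n
    mean_vol = sum(volumes) / n
    sxx = sum((a - mean_age) ** 2 for a in ages)
    sxy = sum((a - mean_age) * (v - mean_vol) for a, v in zip(ages, volumes))
    slope = sxy / sxx  # change in volume per year of age
    return [v - slope * (a - mean_age) for v, a in zip(volumes, ages)]


if __name__ == "__main__":
    # Hypothetical volumes (mL) that fall by 1 mL per year of age.
    print(age_adjusted([700.0, 690.0, 680.0], [20, 30, 40]))
```

When the groups being compared have similar age distributions, as reported here, an adjustment of this kind shifts individual values but leaves group differences largely unchanged, which is consistent with the raw data being reported.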
Our first hypothesis regarding differentially reduced GM in patients was tested using multivariate analysis of variance (MANOVA), with diagnosis and sex as grouping factors and brain compartment (GM, WM, and CSF) as a repeated-measures (within-group) factor. We tested our second hypothesis by computing the correlations between GM volume and global performance measure averages for the 6 neurocognitive domains. A significant correlation would justify examination of specific domains. Our final hypothesis was similarly tested by correlating GM with global measures of premorbid function (the average of premorbid adjustment subscales), severity of symptoms (BPRS, global SANS, and SAPS), and quality of life (Quality of Life Scale). Significance of global correlations prompted examination of subscales. We also compared first-episode, neuroleptic-naive patients with previously treated patients and deficit (n=44) with nondeficit (n=86) subtypes.39
The large sample size enabled application of a new method for testing whether the correlations among a set of measurements with a common variable are homogeneous, and whether this differs between groups. The difficulty is that the correlations all involve a common variable, leading to "correlated correlations." Within-sample homogeneity of correlations can be tested using a method developed by Olkin and Finn.40 In our case, this would apply to testing the homogeneity of the correlations of GM volume with a dependent measure, such as each of the cognitive scores. Although not statistically equivalent, one could think of this in the context of a 1-way ANOVA for correlated correlations (CORANOVA), with 1 between-group and 1 within-group factor. In our case, we have a between-group factor with 2 levels, either sex or diagnosis. We have developed a procedure for simultaneously testing the effects for 1 within-group factor, 1 between-group factor, and the interaction of these 2 factors that is analogous to a 2-way CORANOVA. The CORANOVA method41 uses a bootstrap procedure to estimate the covariance matrix for the correlated correlations of the 2 groups. A χ2 statistic is used to test an appropriate contrast for each of the 3 effects. To avoid normality assumptions, a permutation distribution under the null hypothesis is simulated to determine the P value for the significance test. The properties of this test procedure have been carefully checked through a series of intensive simulations, and these have demonstrated good power to test the homogeneity hypothesis. Based on our simulations, we estimate that the power for detecting main effects of 2 SDs will be more than 90%, and for a group×domain interaction approximately 79%.
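The permutation idea used to obtain P values without normality assumptions can be sketched in a much simpler setting: testing whether a single correlation differs between 2 groups. The code below is not the CORANOVA procedure; it omits the bootstrap estimate of the covariance matrix, the χ2 contrasts, and the within-group (domain) factor. Shuffling group membership also tests the broader null hypothesis that the 2 groups are exchangeable. All function names and example data are hypothetical.

```python
import random


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation P value for a between-group difference in r.

    group_a and group_b are lists of (volume, score) pairs.
    """
    rng = random.Random(seed)

    def r_diff(a, b):
        return pearson_r(*zip(*a)) - pearson_r(*zip(*b))

    observed = abs(r_diff(group_a, group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        # Reassign subjects to groups at random, keeping each pair intact.
        rng.shuffle(pooled)
        if abs(r_diff(pooled[:n_a], pooled[n_a:])) >= observed:
            hits += 1
    # Add-one correction so the estimated P value is never exactly 0.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical data: r = +1 in one group and r = -1 in the other.
    a = [(i, i) for i in range(10)]
    b = [(i, -i) for i in range(10)]
    print(permutation_p_value(a, b))
```

The reference distribution comes from the data themselves, so the P value does not rely on the sampling distribution of r being normal.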
Patients had, on average, a mild to moderate premorbid course (Table 1), with women doing better (t122.2=3.91, P<.001) (degrees of freedom for unequal variances because F74,54=2.97, P<.001). The BPRS, SAPS, and SANS ratings likewise reflect mild to moderate severity of illness. While men and women did not differ on the BPRS and SAPS, women had lower severity of negative symptoms on the SANS (t128=2.30, P=.02). Women were also less impaired on the quality of life ratings (t128=2.24, P=.03, all 2 tailed). There were no differences in severity of symptoms between first-episode and previously treated patients. First-episode patients, however, were less impaired than previously treated patients in quality of life measures of social functioning (t128=3.46, P<.001), engagement (t128=2.37, P<.02), and vocational functioning (t128=3.52, P<.001). These effects were similar in men and women.
Each neurocognitive domain consists of averaged standardized tests, using z scores based on the means and SDs of the normative sample (Table 2). A diagnosis×sex×functional domain MANOVA showed a main effect for diagnosis (F6,251=40.67, P<.001), with patients performing more poorly, and a diagnosis×function interaction (F5,252=15.74, P<.001), indicating differential deficit in specific functions. There was also a sex×function interaction (F5,252=4.24, P<.001), indicating that men and women had relatively different neurocognitive strengths and weaknesses, but there was no interaction involving both sex and diagnosis.
The means for patients and controls for all compartmental volumes (Table 3) were dependent measures in a diagnosis×sex×hemisphere MANOVA. There were main effects for diagnosis (F6,251=3.21, P=.005), indicating that patients had overall smaller intracranial volumes; sex (F6,251=26.8, P<.001), with men having larger intracranial volume; and compartment (F2,255=8558.15, P<.001), indicating that compartments differed in volumes across groups. A significant compartment×diagnosis interaction (F2,255=8.45, P<.001) indicated that patients differed from controls in specific compartments, and a sex×compartment interaction (F2,255=44.23, P<.001) indicated that men and women differed in compartmental composition. No other effects were significant. Follow-up MANOVAs for each compartment supported the hypothesis by showing a main effect of diagnosis for GM (F2,255=7.95, P<.001), with patients having lower volumes. This difference was not significant for WM or total CSF (F<1). The analyses were repeated with cranial volume entered as a covariate without altering the significance of effects.
While overall CSF volume did not differ between patients and controls, given the neurodevelopmental implications of increased ventricular or sulcal CSF,42 we performed diagnosis×sex MANOVAs separately on sulcal and ventricular CSF. There were different effects in these subcompartments. For ventricular CSF, there was a main effect of diagnosis (F2,255=7.38, P<.001), with patients having higher volumes. There was also an interaction of diagnosis×sex (F2,255=4.08, P=.03), reflecting a larger difference between patients and controls in women. For sulcal CSF, there was a main effect only for sex (F2,255=4.94, P=.01), with men having higher volumes. In view of earlier findings of increased sulcal CSF in a subsample2 and in other studies,9,43 we have examined the correlations between ventricular and sulcal CSF and found them moderate in healthy subjects (men: r73=0.44, P<.001; women: r53=0.45, P<.001) and higher in patients (men: r73=0.70, P<.001; women: r53=0.54, P<.001). Thus, the present results suggest that the primary increase is in ventricular CSF. Analyses within patient groups were performed grouping by neuroleptic status (neuroleptic-naive compared with previously treated) and by the deficit-nondeficit classification. No significant effects or interactions were found with these grouping factors.
The correlations between GM volume and overall performance were small but significant for patients and for controls (r128=0.31 and r128=0.28, respectively; P<.001), indicating that increased GM is associated with better performance on neurocognitive tasks. The breakdown of correlations by domain for each sex suggests variability in the magnitude of these correlations (Table 4).
The CORANOVA for testing the significance of the apparent inhomogeneity of correlations was carried out in 2 steps: first, sex differences across patients and controls and then diagnosis effects across the entire sample. The CORANOVA contrasting men and women showed a main effect of neurocognitive domain (P=.007), indicating that the correlations were not homogeneous across neurocognitive measures. There was no main effect of sex (P=.20) and no domain×sex interaction (P=.40). The CORANOVA contrasting patients with controls likewise showed the main effect of neurocognitive domain (P=.007). There was no main effect for diagnosis, indicating that the correlations were equally high for patients and controls, but there was a diagnosis×domain interaction (P=.04), suggesting that these correlations are attenuated or accentuated for some domains in patients relative to controls. Indeed, of 12 correlations computed between neurocognitive domains and GM volumes (6 for men, 6 for women), 9 correlations were higher in patients (Table 4). However, this was moderated by sex; 5 of 6 correlations were higher in male than in female patients, whereas only 3 were higher in male controls.
The premorbid adjustment scale indicated mild associations between higher parenchymal volume and better adjustment during late adolescence and early adulthood. Examination by sex showed that the correlations were significant only for women (r, 0.45, 0.51, and 0.36 for early and late adolescence and adult adjustment, respectively; P<.01). These correlations did not reach significance in men. Symptom severity measures did not correlate with compartmental volumes. Likewise, illness duration was not associated with GM (r128=−0.16, P=.07 [2 tailed]), and the quality of life measures yielded negligible correlations.
A global reduction of GM volume is evident for both men and women in schizophrenia, whether they are neuroleptic-naive or long-term patients. The results corroborate reports on subpopulations in which varying portions of brain were examined.7-12 Our automated MRI segmentation method has demonstrated the generalizability of the finding to the entire supertentorial cranium. The sample size, composition, and diagnostic rigor, with no comorbid conditions, permit the conclusion that schizophrenia selectively affects global GM volume. Furthermore, the reduction of GM is evident prior to therapeutic intervention; hence, the processes that result in this reduction must be active before the initiation of treatment. In the present sample, a weak effect supporting progression, namely the correlation between illness duration and lower GM, did not survive correction for age effects. While this pattern of results seems to favor the neurodevelopmental hypothesis, only longitudinal studies can more definitively examine whether there is a further decrease in GM volume beyond that expected from aging.42
Developmentally, accelerated cortical GM volume increase occurs during neuronal differentiation (at 30-40 weeks of gestation) after neuronal migration to the cerebral cortex. During this period, dendritic and axonal branching and gyral formation are evident.44-46 A recent MRI study of premature and mature newborns reported a 4-fold increase in cortical GM with evidence of gyral development during the last trimester.47 By birth, GM already occupied 50% of the intracranial volume, which is close to the ratio we observed in adults. On the other hand, the formation of myelinated WM accelerates later, and by birth it only reaches 5%, compared with about 40% in adults. The specificity of GM reduction in schizophrenia is consistent with the hypothesis of abnormal brain development during gestation that spares WM, the growth of which accelerates postnatally. It should be noted that some studies have reported WM abnormalities in schizophrenia,43,48 although a recent review concludes, as we do, that GM reduction is robust, whereas findings for WM are variable.49
The finding that ventricular CSF is increased in patients whereas sulcal CSF is not could be considered to support the hypothesis that diffuse brain tissue loss is limited to the prenatal or perinatal period. Indeed, our findings of reduced intracranial size and increased CSF limited to the ventricles fit precisely the predictions articulated by Woods42 as most consistent with injury during that epoch. The specificity of parenchymal reduction to GM can help further pinpoint the last trimester of pregnancy as the time when brain development is inhibited in schizophrenia. On the other hand, studies have also reported increased sulcal CSF,9,43 including subsamples of the present study.2 Given the correlations between ventricular and sulcal CSF volumes, which are higher in patients than in controls, the earlier findings of increased sulcal CSF may reflect inadequate representation in smaller samples of CSF variability in healthy subjects. Since the present sample is the largest, we believe it has sufficient power to resolve this issue.
Correlations between brain volume and performance measures have generally been small but consistent.50-52 We found that the magnitude of the association differs depending on the specific neurocognitive domain (ranging from small but significant [r=0.19] to moderate [r=0.44]) and accounts for up to one fifth of the variance. Diagnosis and sex interact with the heterogeneity of these correlations. The interpretation of such associations, and particularly of differences in their magnitude, requires caution. An association between GM volume and performance suggests that availability of GM is beneficial for performance. A higher correlation for one neurobehavioral domain relative to another could emanate from differences among tasks in the requirement for global availability of GM. For example, some functions may depend more on local GM in frontal regions, while others may require distributed GM availability. Dissimilarity in correlations between groups could reflect variability in cognitive strategies or neuroanatomical differences in the abundance of GM. Such questions can be addressed by analysis of regional correlations.
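The statement about variance accounted for follows from squaring the correlation coefficient. As a worked check, using only the smallest and largest significant correlations reported above:

```latex
\[
r = 0.19 \;\Rightarrow\; r^{2} = 0.19^{2} \approx 0.036,
\qquad
r = 0.44 \;\Rightarrow\; r^{2} = 0.44^{2} \approx 0.194
\]
```

That is, GM volume shares roughly 4% to 19% of its variance with performance, depending on the domain, which matches the upper bound of about one fifth.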
The lack of association between GM volume and clinical measures of symptoms corroborates previous reports.7-13 While this may relate to symptom variation over time, we did not observe differences in brain volume measures between deficit-type patients with enduring negative symptoms and nondeficit patients. The relation of premorbid adolescent and adult adjustment to GM was evident only in women, who often present later than men and have milder negative symptoms.53
This study was limited to evaluation of whole brain volume measures and examined only supertentorial compartments. It also used a specific segmentation algorithm that enables simultaneous segmentation of all brain compartments. This method has established reliability and validity in phantom studies and has demonstrated sensitivity to normal sex differences in intracranial compartmental composition,54 but thinner slices would be preferable for examining subregions where segmentation of GM and WM is of interest. Further examination of subregions may link symptoms to specific brain systems implicated in modulating pathological behavior.
Accepted for publication June 16, 1999.
This research was supported by grants MH-42191, MH-43880, MH-01336 (Dr R. E. Gur), and MO1RR0040 from the National Institutes of Health, Bethesda, Md.
We thank Veda Maany for assistance in image processing and Stephen Moelter, MS, for assistance in manuscript preparation.
Corresponding author: Raquel E. Gur, MD, PhD, Neuropsychiatry Program, University of Pennsylvania, 10th Floor Gates Building, Philadelphia, PA 19104.