A, Sagittal and coronal slices of the hippocampus and the amygdala (Montreal Neurological Institute [MNI] coordinates x = −17, y = −4) depict decreased gray matter volumes in individuals with bipolar depression (BD) compared with individuals with unipolar depression (UD) for the Münster sample and the Pittsburgh sample. B, Sagittal and coronal slices of the rostral anterior cingulate gyrus (MNI coordinates x = −7, y = 38) depict decreased gray matter volumes in individuals with UD compared with individuals with BD for both samples. For display reasons, restricted region-of-interest analyses of the amygdala, the hippocampus, and the anterior cingulate gyrus, as defined by the automated anatomical labeling (Wake Forest University PickAtlas) using an uncorrected statistical threshold of P < .01, were conducted.
eAppendix 1. Comorbidities and Exclusion Criteria
eAppendix 2. Data Collection
eAppendix 3. Methods Voxel Based Morphometry and Univariate Analyses
eAppendix 4. Support Vector Machines and Gaussian Process Classifier
eAppendix 5. White Matter Results
eAppendix 6. Sensitivity and Specificity
eFigure 1. Gray matter differences for the contrast HC > BD
eFigure 2. Gray matter differences for the contrast HC > UD
eFigure 3. Discriminative maps for SVM classifier for the trained pattern in both MU and PI
eFigure 4. Overlap between the multivariate vector weight discriminative maps and the univariate results for the combined sample
eTable 1. Lifetime Comorbidities
eTable 2. Univariate VBM results (gray matter) for ANOVA
eTable 3. Univariate VBM results (gray matter) for group comparisons of HC vs BD and HC vs UD using the combined sample
eTable 4. Univariate VBM results (gray matter) for the comparison for BD and UD separated by site
eTable 5. Results of the pattern classification analyses regarding BD and UD using combined gray matter and white matter data
eTable 6. Results of the 3-group pattern classification performances using gray matter
eTable 7. Results of separate pattern classification analyses regarding BD vs HC and UD vs HC using gray matter
eTable 8. Feature weights of the SVM-classification UD vs BD for each site separately
Redlich R, Almeida JR, Grotegerd D, Opel N, Kugel H, Heindel W, Arolt V, Phillips ML, Dannlowski U. Brain Morphometric Biomarkers Distinguishing Unipolar and Bipolar Depression: A Voxel-Based Morphometry–Pattern Classification Approach. JAMA Psychiatry. 2014;71(11):1222-1230. doi:10.1001/jamapsychiatry.2014.1100
Importance
The structural abnormalities in the brain that accurately differentiate unipolar depression (UD) and bipolar depression (BD) remain unidentified.
Objectives
First, to investigate and compare morphometric changes in UD and BD, and to replicate the findings at 2 independent neuroimaging sites; second, to differentiate UD and BD using multivariate pattern classification techniques.
Design, Setting, and Participants
In a 2-center cross-sectional study, structural gray matter data were obtained at 2 independent sites (Pittsburgh, Pennsylvania, and Münster, Germany) using 3-T magnetic resonance imaging. Voxel-based morphometry was used to compare local gray and white matter volumes, and a novel pattern classification approach was used to discriminate between UD and BD, while training the classifier at one imaging site and testing in an independent sample at the other site. The Pittsburgh sample of participants was recruited from the Western Psychiatric Institute and Clinic at the University of Pittsburgh from 2008 to 2012. The Münster sample was recruited from the Department of Psychiatry at the University of Münster from 2010 to 2012. Equally divided between the 2 sites were 58 currently depressed patients with bipolar I disorder, 58 age- and sex-matched unipolar depressed patients, and 58 matched healthy controls.
Main Outcomes and Measures
Magnetic resonance imaging was used to detect structural differences between groups. Morphometric analyses were applied using voxel-based morphometry. Pattern classification techniques were used for a multivariate approach.
Results
At both sites, individuals with BD showed reduced gray matter volumes in the hippocampal formation and the amygdala relative to individuals with UD (Montreal Neurological Institute coordinates x = −22, y = −1, z = 20; k = 1938 voxels; t = 4.75), whereas individuals with UD showed reduced gray matter volumes in the anterior cingulate gyrus compared with individuals with BD (Montreal Neurological Institute coordinates x = −8, y = 32, z = 3; k = 979 voxels; t = 6.37; all corrected P < .05). Reductions in white matter volume within the cerebellum and hippocampus were found in individuals with BD. Pattern classification differentiated the 2 depressed groups with up to 79.3% accuracy (P < .001) when the classifier was trained and tested at one site, and with up to 69.0% accuracy (P < .001) when the classifier was trained at one imaging site (Pittsburgh) and tested in the independent sample at the other site (Münster). Medication load did not alter the pattern of results.
Conclusions and Relevance
Individuals with UD and those with BD are differentiated by structural abnormalities in neural regions supporting emotion processing. Neuroimaging and multivariate pattern classification techniques are promising tools to differentiate UD from BD and show promise as future diagnostic aids.
Unipolar depression (UD) and bipolar disorders are leading causes of disability worldwide.1,2 Among patients with bipolar disorder, misdiagnosis rates of up to 75% are reported, primarily for UD,3,4 leading to insufficient treatment, poor outcome, and higher health care costs.5,6 There is a particular difficulty in distinguishing bipolar depression (BD) from UD, owing to their having the same diagnostic criteria for a depressive episode; the higher prevalence of depressive symptoms, rather than hypomanic symptoms, in bipolar disorder; and the presence of subthreshold manic symptoms for both disorders during a depressive episode.7,8 A major research goal is thus to identify neurobiological markers that can help differentiate these disorders, especially for individuals presenting during depressive episodes.
A growing number of neuroimaging studies reported functional and structural neural abnormalities in both disorders relative to healthy controls, pointing to functional and structural alterations in limbic and prefrontal cortical regions supporting emotion regulation in both disorders.9-15 Unfortunately, very few studies directly compared individuals with UD with those with BD. Preliminary results obtained from functional imaging studies suggest differences in neural activation patterns to emotional stimuli, particularly in the amygdala,8,16-19 and differences in blood flow at the anterior cingulate gyrus (ACG).20 Functional imaging studies are methodologically demanding, however, and can vary dramatically regarding paradigms and analytic methods. Furthermore, the majority of previous studies were performed at single sites, providing little information regarding the generalizability of findings across different sites.
In contrast, structural neuroimaging is more often performed routinely in clinical practice and is also more reliable across different scanning platforms. It is therefore potentially more suitable for providing measures that may yield future clinical applications for diagnostic purposes. Despite this, there are very few studies that have directly compared structural neuroimaging measures between individuals with UD and those with BD. The few available studies reported more white matter hyperintensities,21 decreased habenula volume,22 and decreased fractional anisotropy of the left longitudinal fasciculus23 and the corpus callosum24 in individuals with BD relative to those with UD, whereas other studies found no structural differences between groups.25-27 To our knowledge, no study has analyzed the differences in whole-brain gray matter between individuals with UD and individuals with BD.
The goal of the present study was thus to identify the extent to which whole-brain gray matter abnormalities differentiated individuals with BD from those with UD and to replicate the findings in an independent data set. We hypothesized between-group differences in gray matter volume in neural regions supporting emotion regulation, including prefrontal cortical regions, the ACG, the amygdala, and the hippocampus, based on current neural models of emotion dysregulation in UD and bipolar disorder.8,12,13,28,29
Furthermore, a novel multivariate pattern classification approach was used to differentiate individuals with UD from those with BD based on gray matter, while training the classifier at 1 imaging site and testing in an independent sample at the other site. We hypothesized that similar patterns of gray matter abnormalities differentiating individuals with UD from individuals with BD could be determined independently at each site, and that pattern classification would provide diagnostic accuracy rates significantly different from chance levels, even when classifiers were trained at 1 site and tested at the other.
The present study included data from 2 independent sites. The Pittsburgh sample of participants was recruited from the Western Psychiatric Institute and Clinic at the University of Pittsburgh from 2008 to 2012. The Münster sample was recruited from the Department of Psychiatry at the University of Münster from 2010 to 2012. The final sample comprised 174 participants. Each site contributed a BD, UD, and healthy control group of 29 participants each. Hence, the final sample comprised 58 currently depressed individuals with BD, 58 currently depressed individuals with UD, and 58 healthy controls. The 3 groups were matched at each site on age (both P > .96) and sex (both P > .77). Our study was approved by the local institutional review board at each site. All participants provided written informed consent before study participation and were financially compensated for their participation. In the combined sample, individuals with BD and those with UD were comparable regarding total years of education (P = .90), age at onset (P = .11), current depression (determined by use of the Hamilton Depression Rating Scale30; P = .17), mania (determined by use of the Young Mania Rating Scale31; P = .93), and trait anxiety (determined by use of the State-Trait Anxiety Inventory32; P = .47). However, the Münster and Pittsburgh cohorts differed from each other regarding several clinical parameters, including age at onset, illness duration, number of depressive episodes, and Young Mania Rating Scale scores (all P < .04) (Table 1). For comorbidities and exclusion criteria, see eTable 1 and eAppendix 1 in the Supplement.
The Münster and Pittsburgh data sets were acquired with a 3-T scanner using recently published protocols.33-36 For a detailed description of data collection, voxel-based morphometry, and univariate analyses, see eAppendixes 2 and 3 in the Supplement.
Pattern recognition approaches represent a set of machine-learning–based algorithms, allowing multivariate, hypothesis-free differentiation of 2 or more groups based on high-dimensional data. A major strength of this technique is the individual-level prediction of group membership. Among the most frequently used classifiers in neuroimaging is the support vector machine (SVM), which has been used successfully in differentiating depressed patients from healthy controls15,37,38 and in differentiating individuals with UD from those with BD, using functional neuroimaging data.16,19,20 In addition, a second well-established classifier, the gaussian process classifier (GPC),39 was used to validate results with an independent algorithm.
Pattern classification analyses were conducted using the MANIA toolbox40 (https://bitbucket.org/grotegerd/mania) and implementing the LIBSVM (a library for support vector machines)41 and the GPC42 (eAppendix 4 in the Supplement). The preprocessed and smoothed modulated gray matter (and white matter) images were used as input for the classifiers. For all pattern classification analyses, an anatomical mask based on the neuroanatomical model of emotion regulation12,28,29 was used as previously described.19 The mask was created using the Wake Forest University PickAtlas43 according to the automated anatomical labeling definitions.44 This mask comprised the entire prefrontal cortex, including the middle cingulate gyrus, the ACG, the amygdala, the thalamus, the striatum and, additionally, the hippocampal formation. This large region of interest included 29.6% of the entire brain mask and was selected with the goal of including those neural regions most likely to substantially contribute to discriminative patterns in order to increase the performance of the classifiers.19 A feature ranking method using t tests based on a probability threshold (P < .05) was embedded directly within the cross-validation process. The test margin, denoting the confidence with which individual participants were classified as belonging to 1 of the 2 diagnostic categories, was extracted from each participant to examine the influence of medication load on individual-level classification performance. The statistical significance was determined by probabilities of binomial distributions, as described in Fu et al.37
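The binomial significance test mentioned above can be illustrated with a short sketch. It assumes a one-sided upper-tail probability, P(X ≥ number correct), under a chance level of 0.5 for 2 equally sized groups; the function name `binomial_upper_tail` is hypothetical, and the exact tail convention used in the study is not stated in the text, so the printed values need not match the reported P values exactly.

```python
from math import comb


def binomial_upper_tail(n_correct: int, n_total: int, chance: float = 0.5) -> float:
    """Return P(X >= n_correct) for X ~ Binomial(n_total, chance).

    Assumption: a one-sided test against chance-level classification.
    """
    return sum(
        comb(n_total, k) * chance**k * (1.0 - chance) ** (n_total - k)
        for k in range(n_correct, n_total + 1)
    )


# Each site contributed 29 BD + 29 UD patients, so 58 individuals are classified.
# 44 of 58 correct is 75.9% accuracy; 38 of 58 correct is 65.5% accuracy.
for n_correct in (44, 38):
    p_value = binomial_upper_tail(n_correct, 58)
    print(f"{n_correct}/58 correct ({n_correct / 58:.1%}): one-sided p = {p_value:.4g}")
```

With 58 classified individuals, each additional correct classification changes the accuracy by about 1.7 percentage points, which is why the reported accuracies fall on a coarse grid.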
First, to examine our main objective, the abilities of the SVM and GPC to discriminate between individuals with UD and those with BD using gray matter data were evaluated at each site separately using a leave-one-subject-out cross-validation. Second, to determine whether the trained discriminating patterns from one site could be applied to the other site, the trained classifiers for the Münster sample were applied to the Pittsburgh sample (and vice versa). In addition, we performed the same classification procedures using all 3 groups (including healthy controls) and performed classifications between individuals with UD and healthy controls and between individuals with BD and healthy controls. Finally, we combined the data/features from gray matter and white matter to potentially increase the classification accuracy for the classifications of BD and UD. To compare the pattern classification feature weights (using discriminative maps) between both sites, the weight vectors of the SVM hyperplane were extracted. Furthermore, we compared the pattern classification feature weights with the results of the univariate analyses.
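The two validation schemes described above (leave-one-subject-out within a site, and training at one site with testing at the other) can be sketched as follows. This is a minimal illustration, not the MANIA toolbox pipeline: it swaps in a nearest-centroid classifier for the SVM and GPC, keeps a fixed number of top-ranked features instead of applying a P < .05 cutoff, and runs on simulated data. All function names, the simulated effect size, and `n_keep` are assumptions. The point it shows is that the t test feature ranking is recomputed on the training data of every fold, so the left-out participant never influences feature selection.

```python
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch t statistic comparing one feature between two groups."""
    se2 = variance(a) / len(a) + variance(b) / len(b)
    return (mean(a) - mean(b)) / se2**0.5 if se2 > 0 else 0.0


def fit(X, y, n_keep):
    """Rank features by |t| on the training data only and store class centroids."""
    g0 = [x for x, label in zip(X, y) if label == 0]
    g1 = [x for x, label in zip(X, y) if label == 1]
    n_feat = len(X[0])
    scores = [abs(welch_t([x[j] for x in g0], [x[j] for x in g1])) for j in range(n_feat)]
    keep = sorted(range(n_feat), key=lambda j: scores[j], reverse=True)[:n_keep]
    c0 = [mean(x[j] for x in g0) for j in keep]
    c1 = [mean(x[j] for x in g1) for j in keep]
    return keep, c0, c1


def predict(model, x):
    """Assign x to the class with the nearer centroid (stand-in for SVM/GPC)."""
    keep, c0, c1 = model
    d0 = sum((x[j] - c) ** 2 for j, c in zip(keep, c0))
    d1 = sum((x[j] - c) ** 2 for j, c in zip(keep, c1))
    return 0 if d0 <= d1 else 1


def loo_accuracy(X, y, n_keep):
    """Leave-one-subject-out accuracy with feature ranking inside each fold."""
    hits = 0
    for i in range(len(X)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_keep)
        hits += predict(model, X[i]) == y[i]
    return hits / len(X)


def cross_site_accuracy(X_train, y_train, X_test, y_test, n_keep):
    """Train on all data from one site, then test on the other site."""
    model = fit(X_train, y_train, n_keep)
    hits = sum(predict(model, x) == t for x, t in zip(X_test, y_test))
    return hits / len(X_test)


def simulate_site(rng, n_per_group=29, n_feat=40, n_informative=5, shift=1.0):
    """Simulated data (hypothetical): label 0 = UD, label 1 = BD."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_group):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_feat)]
            for j in range(n_informative):
                row[j] += shift * label
            X.append(row)
            y.append(label)
    return X, y


rng = random.Random(0)
site_a = simulate_site(rng)
site_b = simulate_site(rng)
print("Leave-one-out accuracy, site A:", loo_accuracy(*site_a, n_keep=5))
print("Train on site A, test on site B:", cross_site_accuracy(*site_a, *site_b, n_keep=5))
```

Ranking features once on the full sample before cross-validation would leak information from the test participant into the model and inflate the accuracy estimate, which is the reason for embedding the ranking in each fold.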
At both sites, the analysis of variance (F statistics) showed volume differences within the hippocampus, fusiform gyrus, amygdala, cerebellum, and prefrontal areas. Additional differences within the putamen, the caudate nucleus, and the lingual gyrus were found in the Münster sample. The analysis of variance of the combined sample from both sites confirmed these clusters (eTable 2 in the Supplement).
In both patient groups, reduced gray matter volumes were found in neural regions previously reported as showing abnormalities in individuals with bipolar disorder10,45 and UD.46,47 In individuals with UD, abnormalities included reduced gray matter volumes in the hippocampal formation, extending to the fusiform gyrus, the ventromedial and dorsomedial prefrontal cortex, the ACG, the caudate, and the precuneus. Compared with the healthy controls, the individuals with BD showed strong gray matter volume reductions in the bilateral hippocampus extending to the fusiform gyrus, lingual gyrus, amygdala, caudate, putamen, thalamus, insula, and dorsal prefrontal cortex (eTable 3 and eFigures 1 and 2 in the Supplement).
The whole-brain analysis of the Münster sample revealed significantly smaller gray matter volumes in individuals with BD relative to individuals with UD in 4 clusters comprising the bilateral hippocampal formation, the amygdala, the putamen, the insula, and the temporal pole (Figure, A). Conversely, only 1 cluster was observed with a highly significant volume reduction in the rostral ACG in individuals with UD relative to individuals with BD (Figure, B; eTable 4 in the Supplement).
In the Pittsburgh sample, no group differences were identified using the same rigorous correction for the entire brain. However, we restricted the analysis to the significant clusters obtained from the Münster sample as a mask and applied a correction for the smaller volume (voxel threshold of P < .01, and empirically determined cluster extent thresholds of k = 34 voxels for BD > UD, the direction where individuals with BD had more gray matter volume compared with individuals with UD, and k = 15 voxels for BD < UD, where individuals with BD had less gray matter volume compared with individuals with UD). The cluster extent thresholds were determined by the use of Monte Carlo simulations (5000 iterations), using the AlphaSim48 procedure as implemented in REST (Resting-State fMRI Data Analysis Toolkit [http://restfmri.net/forum/index.php]). This restricted analysis yielded significant overlapping results in the same direction as in the Münster sample, including a cluster in the left amygdala, the hippocampus, and the parahippocampal gyrus showing smaller gray matter volume in individuals with BD relative to those with UD, and a cluster in the rostral ACG showing greater gray matter volume in individuals with BD relative to those with UD (eTable 4 in the Supplement).
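The idea behind a Monte Carlo cluster extent threshold can be sketched in a few lines. This is a deliberately simplified illustration, not the AlphaSim or REST implementation: it assumes spatially independent voxels in a small cubic grid, whereas AlphaSim simulates smoothed noise within the actual mask, so the thresholds printed here are far smaller than the k = 34 and k = 15 reported above. The grid size, function names, and iteration count are assumptions.

```python
import random
from collections import deque

# Face-connected (6-neighbour) offsets in a 3D grid.
NEIGHBOURS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def max_cluster_size(mask):
    """Size of the largest face-connected cluster in a set of voxel coordinates."""
    seen = set()
    best = 0
    for start in mask:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            x, y, z = queue.popleft()
            size += 1
            for dx, dy, dz in NEIGHBOURS:
                nb = (x + dx, y + dy, z + dz)
                if nb in mask and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best


def cluster_extent_threshold(shape, voxel_p, n_iter, alpha=0.05, seed=0):
    """Smallest extent k reached by at most a fraction alpha of pure-noise volumes."""
    rng = random.Random(seed)
    nx, ny, nz = shape
    maxima = []
    for _ in range(n_iter):
        # Under the null, each voxel exceeds the voxel threshold with probability voxel_p.
        mask = {
            (x, y, z)
            for x in range(nx)
            for y in range(ny)
            for z in range(nz)
            if rng.random() < voxel_p
        }
        maxima.append(max_cluster_size(mask))
    for k in range(1, max(maxima) + 2):
        if sum(m >= k for m in maxima) / n_iter <= alpha:
            return k


# The study used 5000 iterations; fewer are used here to keep the sketch fast.
print("cluster extent threshold:", cluster_extent_threshold((12, 12, 12), 0.01, 500))
```

Spatial smoothing makes neighbouring voxels correlated, which produces larger chance clusters and therefore larger extent thresholds than this independent-noise sketch yields.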
Analysis of the combined sample confirmed smaller gray matter volume in individuals with BD than in individuals with UD within the entire bilateral hippocampal formation, amygdala, and thalamus, with minor involvement of the striatum and insula. Conversely, only 1 cluster in the rostral ACG survived, showing greater gray matter volumes in individuals with BD than in individuals with UD (Table 2). Regarding the effects of medication load and clinical course (determined by the number of depressed episodes and the duration of illness), adding these variables to the design did not alter the pattern of results for the combined sample (for individuals with UD > BD: Montreal Neurological Institute [MNI] coordinates x = −15, y = −31, z = −11; t108 = 4.48; k = 420 voxels, corrected P < .001; MNI coordinates x = 18, y = −22, z = −12; t108 = 4.53; k = 241 voxels; corrected P < .001; for individuals with BD > UD: MNI coordinates x = −9, y = 32, z = −6; t108 = 5.65; k = 1032 voxels; corrected P < .001). The regression analyses between gray matter volume and clinical variables showed a significant negative association between illness duration and ACG volume (MNI coordinates x = −9, y = 34, z = 22; t53 = 5.05; k = 425 voxels; corrected P < .001) in the combined UD sample, a tendency that was observed in both samples (Münster sample: MNI coordinates x = −8, y = 36, z = 21; t25 = 4.81; k = 173 voxels; Pittsburgh sample: MNI coordinates x = −14, y = 34, z = 21; t25 = 4.41; k = 41 voxels). There were no significant associations between gray matter volume and Young Mania Rating Scale or Hamilton Depression Rating Scale score, and there were no associations whatsoever in the BD sample. For white matter results, see eAppendix 5 in the Supplement.
The SVM was able to differentiate individuals with UD from those with BD with an accuracy of 75.9% (P < .001) within the Münster sample and with an accuracy of 65.5% (P = .006) within the Pittsburgh sample. When using the GPC, highly similar rates were obtained within the Münster sample (79.3%; P < .001) and within the Pittsburgh sample (65.5%; P = .006).
Importantly, using these algorithms for training on the data set from one site and applying the trained classifier for testing at the other site yielded highly significant results: 69.0% accuracy (P < .001) when training on Pittsburgh data and testing on Münster data and 63.8% accuracy (P = .01) when training on Münster data and testing on Pittsburgh data (Table 3). There were no significant relationships (P > .05) between medication load and pattern classification test margins for accurately classifying individuals as belonging to either group. Combining the data/features from gray matter and white matter did not significantly increase the accuracy rates for the classifications of BD and UD (eTable 5 in the Supplement). For supplementary 3-group analyses, including healthy controls, and separate analyses (individuals with UD vs healthy controls; individuals with BD vs healthy controls), see eTables 6 and 7 in the Supplement. For a detailed description of sensitivity and specificity, see eAppendix 6 in the Supplement.
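As a reminder of how accuracy relates to sensitivity and specificity in a 2-group classification, the following sketch computes all 3 from true and predicted labels. The labels below are toy values chosen for illustration, not study data; treating BD as the positive class and the function name `classification_summary` are assumptions.

```python
def classification_summary(true_labels, predicted_labels, positive="BD"):
    """Accuracy, sensitivity, and specificity for a 2-class problem.

    Assumes both classes occur at least once in true_labels.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # proportion of BD cases classified as BD
        "specificity": tn / (tn + fp),  # proportion of UD cases classified as UD
    }


# Toy labels for illustration only (not taken from the study).
true_labels = ["BD"] * 4 + ["UD"] * 4
predicted_labels = ["BD", "BD", "BD", "UD", "UD", "UD", "BD", "BD"]
print(classification_summary(true_labels, predicted_labels))
```

With equally sized groups, as in each site of this study, accuracy is the mean of sensitivity and specificity.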
In both samples, the discriminative maps revealed strong overlapping negative feature weights within the hippocampus, amygdala, and prefrontal areas contributing importantly to the classification as a patient with BD, whereas positive feature weights, especially within the ACG, contributed to the classification as a patient with UD (eTable 8 and eFigure 3 in the Supplement). This corresponded well with our univariate results (overlap presented in eFigure 4 in the Supplement), albeit there were further important feature weights within the orbitofrontal areas, subgenual ACG, and lateral prefrontal areas that were not detected by univariate statistics (eTable 8 in the Supplement).
To our knowledge, the present study is the first to characterize differences between individuals with UD and individuals with BD in gray matter across the entire brain. The principal findings indicate brain morphological differences between individuals with UD and individuals with BD in the amygdala, hippocampus, and ACG, replicated at an independent site. Furthermore, with multivariate pattern classification, we were able to differentiate both patient groups, even across different sites and scanners.
Our results suggest smaller gray matter volume in the ACG, especially within the rostral part, in individuals with UD compared with individuals with BD and healthy controls. Reductions in gray matter volume in the amygdala were only found in individuals with BD compared with individuals with UD and healthy controls, whereas both patient groups showed a reduced hippocampal volume compared with healthy controls.
These regions are known to be important for emotion processing and regulation.12 A number of structural studies49-51 have reported decreased gray matter volumes in the ACG in individuals with UD vs healthy controls and, recently, meta-analyses concluded that volume reduction of the ACG is one of the most consistent findings in structural studies investigating UD9,11 whereas the literature for BD seems more inconclusive.52-54 The ACG has dense connections to prefrontal areas55 and subcortical regions, such as the amygdala.56,57
First, regarding ACG function, ACG activation has been reported during the downregulation of emotions58 and, in particular, the rostral ACG is thought to be involved in automatic emotion regulation, as well as in the identification of emotionally salient stimuli and the mediation of autonomic responses associated with the generation of an emotional state.12 Second, the ACG is involved in self-referential processing59 and, in individuals with UD, in negative self-referential thoughts and rumination.60
Correspondingly, studies of individuals with UD reported that ruminations correlated negatively with ACG volume61 and some have found hyperactivity in the rostral ACG during self-referential processing of negative words.62,63 Because individuals with UD were reported to show more automatic negative self-statements and rumination than individuals with BD,64 abnormally reduced ACG gray matter volume in individuals with UD may underlie or mediate the inability to regulate automatic negative self-referential processing and accompanying feelings. In other words, an increased self-focus toward emotional responses might correspond to reductions in gray matter volume in individuals with UD compared with individuals with BD. In addition, discriminative maps of the pattern classification analyses showed strong feature weights within subgenual parts of the ACG contributing specifically to the classification of individuals with UD vs individuals with BD. A number of previous studies reported the importance of the subgenual ACG in the development and treatment of mood disorders65,66 and it has been shown that resting blood flow in the subgenual ACG significantly discriminates individuals with UD from those with BD.20
Abnormalities in the activity and volume of the amygdala and hippocampus have also been repeatedly reported in individuals with UD and individuals with BD compared with healthy controls.10,67-74 Recently, 3 studies16-18 reported that the responsiveness of the amygdala to emotional stimuli differentiated individuals with UD from individuals with BD. Furthermore, individuals with BD and individuals with UD were differentiated on the basis of connectivity patterns between the amygdala and the ventromedial prefrontal cortex.75 Together with the present results, these studies provide support for a key role of the amygdala in differentiating UD from BD, even if the underlying mechanisms remain to be clarified.13,75,76 In contrast, the hippocampal differences found between UD and BD should be interpreted with caution because volume decreases were found in both patient groups compared with healthy controls; hippocampal volume could thus potentially fail to discriminate between the 2 patient groups. These differences could also be confounded by illness severity or medication. Taken together, the amygdala and the subgenual ACG may be particularly useful potential neuroanatomical biomarkers for differentiating UD and BD.
Our pattern classification approach reached up to 79% accuracy when training and testing in one sample, a figure much in line with other recent studies using this approach on functional neuroimaging data.16,19,20,77,78 Even when classifiers trained at one site were tested on the independent data set from the other site, classification results were still significantly different from chance levels (69%; P < .001). Although a performance of less than 70% accuracy may not be regarded as substantial enough to justify a clinical application, it must be emphasized that, to our knowledge, this is the first application of pattern classification algorithms, with classifiers trained on a reference sample, to data from an independent sample measured on a different scanner platform. Besides case-by-case classification, it should be further emphasized that pattern classification analyses generally are able to detect more subtle, spatially distributed patterns than univariate analyses, contributing to a richer differentiation between groups.
Together, these findings suggest that the combination of multivariate pattern classification and structural neuroimaging has the potential to be used clinically, potentially addressing the difficulty in discriminating between UD and BD. Future studies may benefit from including more information in the classifiers (eg, diffusion imaging data, genotypes, and clinical variables).
There were some limitations to our study. First, the results from the Pittsburgh sample did not survive the AlphaSim correction for the entire brain, but the results within the clusters obtained from the Münster sample did. Additionally, pattern classification was less accurate in the Pittsburgh sample, although it was still highly significant compared with chance levels. Differences regarding sample characteristics may explain these discrepancies (eg, older age and more chronically ill patients in the Münster sample, which could have led to more pronounced morphometric differences between groups at the Münster site).
Second, as expected, medication load was higher for individuals with BD than for those with UD, although medication load did not alter the findings from univariate analyses, similar to previous studies.79 Furthermore, medication load had no significant influence on pattern classification test margins for accurately classifying individuals as belonging to either group.
Finally, the samples at the 2 sites differed regarding age and clinical variables, which may have impaired the performance of the classifier when applied at the other site. Matching on illness criteria (such as the duration of illness or the number of episodes, as well as the potential influence of treatment history) should be considered in future studies using this approach.
Although UD and BD often remain very difficult to distinguish in clinical practice, promising findings from studies using pattern classification with neuroimaging techniques can help to provide biomarkers to correctly identify individuals with either diagnosis at an earlier stage. Furthermore, our study showed, for the first time, to our knowledge, that it is feasible to transfer a trained classifier from one sample to another, independent sample.
Submitted for Publication: January 23, 2014; final revision received April 22, 2014; accepted May 18, 2014.
Corresponding Author: Udo Dannlowski, MD, PhD, Department of Psychiatry, University of Münster, Albert Schweitzer Campus 1, Gebäude 9A, 48149 Münster, Germany (email@example.com).
Published Online: September 3, 2014. doi:10.1001/jamapsychiatry.2014.1100.
Author Contributions: Dr Dannlowski had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. Mr Redlich and Dr Almeida contributed equally to the work and should therefore both be regarded as first authors.
Study concept and design: Redlich, Almeida, Kugel, Phillips, Dannlowski.
Acquisition, analysis, or interpretation of data: All authors.
Drafting of the manuscript: Redlich, Almeida, Phillips, Dannlowski.
Critical revision of the manuscript for important intellectual content: Almeida, Grotegerd, Opel, Kugel, Heindel, Arolt, Phillips, Dannlowski.
Statistical analysis: Redlich, Almeida, Dannlowski.
Obtained funding: Phillips.
Administrative, technical, or material support: Redlich, Grotegerd, Opel, Kugel, Heindel, Arolt, Phillips, Dannlowski.
Study supervision: Almeida, Kugel, Phillips, Dannlowski.
Conflict of Interest Disclosures: None reported.
Funding/Support: The study was supported by the German Research Foundation (Deutsche Forschungsgemeinschaft; grants FOR 2107 and WP1 to Dr Dannlowski).
Role of the Funder/Sponsor: The German Research Foundation had no role in the design and conduct of the study; collection, management, analysis, or interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Additional Contributions: We thank Carlos Zevallos, an undergraduate student at the School of Medicine, University of Pittsburgh, for his helpful comments on earlier versions of the manuscript. His contribution for this study was not funded.
Correction: This article was corrected on November 10, 2014, for misspelled name in the byline.