Figure 1. Schematic representation of nonlinear support vector machine (SVM) classification (A) and large-margin classification (B). A, At left, 2 groups of individuals (red and green shapes) cannot be separated in the input space by a linear classifier because the relationship between the data instances and their class labels is nonlinear (black circle). At right, with the use of radial basis functions, the data can be mapped into a high-dimensional space where the groups can be separated by means of linear classification. The shaded shapes represent the support vectors that define the optimal separating hyperplane (OSH) (yellow). B, At left, infinitely many separating boundaries (dotted lines) may exist between 2 classes (red and green circles). At right, the SVM algorithm determines the OSH by maximizing the margin between the nearest data instances of opposite classes.
Figure 2. Discriminative patterns of the healthy control group 1 (HC1) vs at-risk mental state, early (ARMS-E) vs at-risk mental state, late (ARMS-L) classification analysis. See the “Methods” section for an explanation of the visualization technique. Warm and cool colors represent volumetric reductions and increments, respectively, in the second vs the first group of the binary classifier. The units are gray matter volume residuals (after removing the effects of age and sex by means of partial correlation analysis and after scaling to a range of [−1, 1]). The gray matter volume reduction scales differed between HC1 vs ARMS-E (A), HC1 vs ARMS-L (B), and ARMS-E vs ARMS-L (C), with the largest effects being observed in the HC1 vs ARMS-L classifier and the most subtle differences being present in the discriminative pattern of ARMS-E vs ARMS-L.
Figure 3. Discriminative patterns of the healthy control group 2 (HC2) vs at-risk mental state (ARMS) with disease transition (ARMS-T) vs ARMS without disease transition (ARMS-NT) classification analysis. A, HC2 vs ARMS-T. B, HC2 vs ARMS-NT. C, ARMS-T vs ARMS-NT. See the “Methods” section for an explanation of the visualization technique. Warm and cool colors represent volumetric reductions and increments, respectively, in the second vs the first group of the binary classifier.
Koutsouleris N, Meisenzahl EM, Davatzikos C, Bottlender R, Frodl T, Scheuerecker J, Schmitt G, Zetzsche T, Decker P, Reiser M, Möller H, Gaser C. Use of Neuroanatomical Pattern Classification to Identify Subjects in At-Risk Mental States of Psychosis and Predict Disease Transition. Arch Gen Psychiatry. 2009;66(7):700–712. doi:10.1001/archgenpsychiatry.2009.62
Identification of individuals at high risk of developing psychosis has relied on prodromal symptomatology. Recently, machine learning algorithms have been successfully used for magnetic resonance imaging–based diagnostic classification of neuropsychiatric patient populations.
To determine whether multivariate neuroanatomical pattern classification facilitates identification of individuals in different at-risk mental states (ARMS) of psychosis and enables the prediction of disease transition at the individual level.
Multivariate neuroanatomical pattern classification was performed on the structural magnetic resonance imaging data of individuals in early or late ARMS vs healthy controls (HCs). The predictive power of the method was then evaluated by categorizing the baseline imaging data of individuals with transition to psychosis vs those without transition vs HCs after 4 years of clinical follow-up. Classification generalizability was estimated by cross-validation and by categorizing an independent cohort of 45 new HCs.
Departments of Psychiatry and Psychotherapy, Ludwig-Maximilians-University, Munich, Germany.
The first classification analysis included 20 early and 25 late at-risk individuals and 25 matched HCs. The second analysis consisted of 15 individuals with transition, 18 without transition, and 17 matched HCs.
Main Outcome Measures
Specificity, sensitivity, and accuracy of classification.
The 3-group, cross-validated classification accuracies of the first analysis were 86% (HCs vs the rest), 91% (early at-risk individuals vs the rest), and 86% (late at-risk individuals vs the rest). The accuracies in the second analysis were 90% (HCs vs the rest), 88% (individuals with transition vs the rest), and 86% (individuals without transition vs the rest). Independent HCs were correctly classified in 96% (first analysis) and 93% (second analysis) of cases.
Different ARMSs and their clinical outcomes may be reliably identified on an individual basis by assessing patterns of whole-brain neuroanatomical abnormalities. These patterns may serve as valuable biomarkers for the clinician to guide early detection in the prodromal phase of psychosis.
The first manifestation of psychosis constitutes the most active disease phase, affecting the individual in both the environmental and neurobiological dimensions.1 Neurotoxic processes may underlie this disease phase and may drive clinical deterioration, leading ultimately to the disabling, chronic state of the disorder.2 Therefore, the duration of untreated psychosis may have a critical effect on the long-term clinical outcome in terms of the responsiveness to medical treatment, frequency of hospitalizations, and social and cognitive functioning.3,4 Thus, the clinical focus has increasingly shifted to the early recognition and treatment of individuals in an at-risk mental state (ARMS) of psychosis to postpone or even prevent the onset of the disease.5-7
Early recognition relies on valid diagnostic markers that facilitate the detection of disease-related signals in heterogeneous, subclinical populations. In this regard, clinical studies of individuals with ARMS have identified patterns of subtle experiential and behavioral abnormalities consisting of affective and basic symptoms as well as attenuated psychotic symptoms, which are frequently paralleled by deteriorating social functioning.8-11 Currently, the detection of individuals with ARMS and the determination of the risk of disease transition depend on this subclinical symptomatology.
However, the overlap between prodromal symptoms and psychopathological phenomena found in the general population12,13 challenges the reliable delineation of the ARMS. Thus, the low predictive validity of single prodromal symptoms limits their use as diagnostic markers for the purpose of early recognition at the individual level.14 Moreover, the accurate detection of subtle clinical abnormalities demands skilled personnel in highly specialized mental health services. Therefore, suitable biological markers may enhance the early recognition of emerging psychosis. In this context, recent neuroimaging studies showed structural alterations in a number of brain regions, suggesting that the prodromal state is associated with patterns of subtle gray matter (GM) abnormalities within the temporal and frontal cortices, the limbic system, and the cerebellum.15-21
The diagnostic utility of these alterations in the clinical treatment of single individuals with ARMS is limited because (1) the expression of structural abnormalities may strongly depend on the individual neurobiological vulnerability and (2) neuroanatomical parameters derived from group-level neuroimaging studies show a considerable between-group overlap.22 These limitations may be surmounted by a methodological shift to multivariate machine learning techniques. In this context, support vector machines (SVMs)23 emerged as a powerful tool in a wide range of biomedical applications because of their ability to learn the categorization of complex, high-dimensional training data and to generalize the learned classification rules to unseen data.24 Recent studies demonstrated the utility of SVMs in the neuroanatomical classification of Alzheimer disease and schizophrenia.25-29
Because SVMs have not been applied to the magnetic resonance (MR) imaging–based diagnostic evaluation of individuals with ARMS, we investigated their ability to detect different ARMSs by performing a classification of healthy controls (HCs) vs individuals with ARMS grouped into “early” or “late” high-risk samples (ARMS-E or ARMS-L). This 2-stage conceptualization of the ARMS30,31 has been supported by recent neurocognitive, neurophysiological, and structural brain findings.32-35 Furthermore, the SVMs' performance in predicting disease transition was evaluated in an ARMS subgroup having clinical follow-up information. This sample was divided into individuals with and without disease transition (ARMS-T and ARMS-NT), who were categorized relative to each other and to HCs. The classifiers' performance was evaluated by means of 5-fold cross-validation and by classifying an independent sample of HCs. We expected the individuals with ARMS-L and ARMS-T to be classified with higher accuracy than those with ARMS-E and ARMS-NT.
Forty-five individuals with ARMS (28 men and 17 women with a mean [SD] age of 25.1 [5.8] years) were recruited at our Early Detection and Intervention Centre for Mental Crises, Ludwig-Maximilians-University. Potential individuals with ARMS were referred to the center by primary health care services and were examined according to a standardized inclusion criteria checklist with operationalized definitions of different types of prodromal symptoms: basic symptoms taken from the Bonn Scale for Assessment of Prodromal Symptoms10 and attenuated psychotic (APSs) and brief limited intermittent psychotic symptoms (BLIPSs) as defined by the Personal Assessment and Crisis Evaluation (PACE) criteria.9 The recruitment protocol has been detailed previously.15 In summary, potential individuals with ARMS meeting defined sets of state and/or trait markers were included in the study. Inclusion based on global functioning and trait factors required a 30-point or greater reduction in the DSM-IV Global Assessment of Functioning Scale and (1) a familial history of psychotic disorders in the first-degree relatives or (2) a personal history of prenatal or perinatal complications. Inclusion based on psychopathological state markers required at least 1 positive item in the basic symptom, APS, or BLIPS categories of the inclusion criteria checklist following specific time and duration criteria (Box).15,32,37
ARMS-E: Individuals With ARMS Without APSs and/or BLIPSs
1. Individuals had ≥1 of the following basic symptoms appearing within 12 months before study inclusion and several times per week during the past 3 months:
• Thought interferences
• Thought perseveration
• Thought pressure
• Thought blockages
• Disturbances of receptive language, either heard or read
• Decreased ability to discriminate between ideas and perception, fantasy and true memories
• Unstable ideas of reference (subject-centrism)
• Visual perception disturbances
• Acoustic perception disturbances
2. Individuals showed a reduction in Global Assessment of Functioning Scale score (DSM-IV) of ≥30 points (within the past year) combined with ≥1 of the following trait markers:
• First-degree relative with a lifetime diagnosis of schizophrenia or a schizophrenia spectrum disorder
• Prenatal or perinatal complications
ARMS-L: Individuals With ARMS With APSs and/or BLIPSs
1. Individuals had ≥1 of the following APSs within the past 3 months, appearing several times per week for a period of ≥1 week:
• Ideas of reference
• Odd beliefs or magical thinking
• Unusual perceptual experiences
• Odd thinking and speech
• Suspiciousness or paranoid ideation
2. Individuals had ≥1 BLIPS, defined as the appearance of 1 of the following psychotic symptoms for <1 week (interval between episodes ≥1 week), resolving spontaneously:
• Formal thought disorder
• Gross disorganized or catatonic behavior
Abbreviations: APSs, attenuated psychotic symptoms; ARMS, at-risk mental state; ARMS-E, early ARMS; ARMS-L, late ARMS; BLIPSs, brief limited intermittent psychotic symptoms.
Adapted from Häfner et al.36
The ARMS cohort was dichotomized according to a 2-stage conceptualization of the ARMS distinguishing between nonpsychotic ARMS-E, with an increased risk of psychosis, and psychotic ARMS-L, with an imminent risk of full-blown psychosis.30,31 The ARMS-E sample consisted of individuals without APSs and BLIPSs who had had at least 1 basic symptom (Box) several times within the past 3 months, appearing first at least 12 months before study inclusion, and/or who met a global functioning and trait criterion. Following the PACE criteria,9,38 the ARMS-L sample comprised individuals with at least 1 APS within the past 3 months, appearing several times per week, and/or with at least 1 BLIPS, spontaneously resolving within 1 week. Basic symptoms and/or global functioning and trait markers were not exclusion criteria for this sample. In addition, prodromal symptomatology was rated by means of the Positive and Negative Syndrome Scale and Montgomery-Åsberg Depression Rating Scale.39,40
Individuals with ARMS were regularly followed up for 4 years to detect shifts toward a different ARMS or a transition to psychosis.41 In individuals meeting the transition criteria, schizophrenia spectrum disorders were diagnosed according to the International Statistical Classification of Diseases, 10th Revision diagnostic research criteria at the time of transition and after 1 year. Exclusion criteria were assessed by obtaining the personal and familial history by means of a semistructured clinical interview and involved (1) disease transition; (2) a past or present diagnosis of schizophrenia spectrum and bipolar disorders, as well as delirium, dementia, amnestic or other cognitive disorders, mental retardation, and psychiatric disorders due to a somatic factor, following the DSM-IV criteria; (3) alcohol or drug abuse within 3 months before examination; (4) past or present inflammatory, traumatic, or epileptic diseases of the central nervous system; and (5) any previous treatment with antipsychotics.
For the first analysis (HC vs ARMS-E vs ARMS-L), we randomly selected a sample of 25 HCs (HC1) from a previously described group of 75 HCs15 to create a balanced design regarding group sizes. The HC1 group was matched groupwise for age, handedness, and years of education to the ARMS-E and ARMS-L samples. A second matched HC sample (HC2) was used for the second analysis (HC vs ARMS-T vs ARMS-NT) by randomly removing 8 subjects from HC1. Finally, 45 new HCs (HCnew group) were recruited for the external validation of these classification analyses. Any HCs with a past or present personal or familial history of neuropsychiatric conditions were excluded from the study. All participants provided their written informed consent before study inclusion. The study was approved by the local research ethics committee.
The MR images were obtained on a 1.5-T system (Magnetom Vision; Siemens, Erlangen, Germany). Imaging was performed with a T1-weighted 3-dimensional magnetization prepared rapid-acquisition gradient echo sequence (repetition time, 11.6 milliseconds; echo time, 4.9 milliseconds; field of view, 230 mm; matrix, 512 × 512; 126 contiguous axial sections of 1.5-mm thickness; and voxel size, 0.45 × 0.45 × 1.5 mm).
After inspection for artifacts and gross abnormalities, the images were segmented into GM, white matter, and cerebrospinal fluid tissue maps in native space by means of the VBM5 toolbox (http://dbm.neuro.uni-jena.de), an extension of the SPM5 software package (Wellcome Department of Cognitive Neurology, London, England). Details of this segmentation protocol have been described previously.15 The estimated tissue maps of each individual were combined into a single volume with the values of GM, white matter, and cerebrospinal fluid set to 150, 250, and 10, respectively. Then, an established high-dimensional normalization algorithm42,43 registered these volumes to the single-subject brain template of Montreal Neurological Institute. This elastic warping algorithm compensates for the interindividual anatomical variation by establishing point correspondences between cortical, subcortical, and ventricular structures, thus achieving a better alignment of corresponding anatomical regions than the standard SPM5 normalization. The anatomical information is encoded in the volumetric changes occurring during normalization and is applied to the registered tissue maps, allowing for a regional analysis of volumes in normalized space (RAVENS). Similar to the “modulation” step used in voxel-based morphometry,44 RAVENS maps allow for local comparisons in standard space that are equivalent to volumetric comparisons of the original tissue maps in native space.45 The individual GM-RAVENS maps were proportionally scaled to the global GM volume computed from the native tissue maps. The effects of age and sex were removed from the data by calculating the partial correlations between these variables and the images.
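The final two preprocessing steps (proportional scaling to global GM volume and removal of age and sex effects) can be illustrated with a minimal sketch. It assumes ordinary least-squares residualization as a stand-in for the partial correlation procedure, which is not specified in detail here; the function names, array shapes, and synthetic data are hypothetical and do not reproduce the study's implementation.

```python
import numpy as np

def proportional_scale(maps, global_gm):
    """Divide each subject's voxel values by that subject's global GM volume."""
    return maps / global_gm[:, None]

def residualize(maps, covariates):
    """Remove linear covariate effects from every voxel via least squares.

    Assumption: OLS residuals stand in for the partial correlation step.
    """
    # Design matrix: intercept plus one column per covariate (here age, sex).
    design = np.column_stack([np.ones(len(covariates)), covariates])
    beta, *_ = np.linalg.lstsq(design, maps, rcond=None)
    return maps - design @ beta

# Synthetic data: 30 hypothetical subjects, 50 voxels each.
rng = np.random.default_rng(0)
age = rng.uniform(18, 35, size=30)
sex = rng.integers(0, 2, size=30).astype(float)
maps = rng.normal(size=(30, 50)) + 0.1 * age[:, None] + 0.5 * sex[:, None]
global_gm = rng.uniform(600, 800, size=30)

scaled = proportional_scale(maps, global_gm)
residuals = residualize(scaled, np.column_stack([age, sex]))
```

By construction, the residuals are linearly uncorrelated with age and sex, so any remaining between-group differences cannot be explained by linear effects of these covariates.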
We applied principal component analysis (PCA) to the proportionally scaled, age- and sex-corrected GM-RAVENS maps, projecting the large number of correlated voxels onto a much smaller number of uncorrelated principal components (PCs).46,47 The PCA reduces (1) the computational complexity of classification caused by the high dimensionality of MR imaging data and (2) the generalization error of classification by optimizing the number of PCs for data projection, thus maximizing the degree of anatomical information while minimizing the impact of noise.47 This filtering effect of PCA increases both sensitivity and specificity of multivariate pattern recognition techniques compared with gaussian smoothing, which improves sensitivity at the cost of anatomical specificity.
The optimal PC number for data projection was determined by the peak overall classification accuracy across the whole range of possible PC numbers as defined by the respective population size.47 Peak cross-validated classification accuracies were observed at nPC = 21/nPC = 17 in the first/second SVM analysis. In addition, the effect of PCA on classification performance was evaluated in the second SVM analysis by skipping the dimensionality reduction step before classification (eTable 1 and eTable 2, http://www.archgenpsychiatry.com). Before the class membership of the test subjects was predicted, the mapping parameters computed for the PCA projection of the training data were applied to the test data. The PCA was performed by means of the Dimensionality Reduction Toolbox.48
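The point that the PCA mapping is estimated on the training data and then applied unchanged to the test data can be sketched as follows. This is an illustrative NumPy version, not the Dimensionality Reduction Toolbox used in the study; the helper names `fit_pca` and `apply_pca`, the array sizes, and the random data are assumptions.

```python
import numpy as np

def fit_pca(train, n_components):
    """Estimate the PCA mapping (mean and component axes) from training data only."""
    mean = train.mean(axis=0)
    # Rows of vt are orthonormal principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, vt[:n_components]

def apply_pca(data, mean, components):
    """Project data with a previously estimated mapping (no refitting)."""
    return (data - mean) @ components.T

rng = np.random.default_rng(1)
train = rng.normal(size=(40, 200))  # 40 hypothetical training subjects, 200 voxels
test = rng.normal(size=(10, 200))   # 10 held-out subjects

mean, components = fit_pca(train, n_components=17)
train_scores = apply_pca(train, mean, components)
test_scores = apply_pca(test, mean, components)
```

Reusing the training mean and axes for the held-out subjects keeps information about the test cases out of the feature construction, which would otherwise bias the cross-validated accuracy upward.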
The SVMs are multivariate artificial learning algorithms applied recently to the MR imaging–based classification of neuropsychiatric patient populations.26-29,46,49 They represent supervised machine learning procedures in that they (1) learn about group differences in a training data set categorized by some a priori knowledge and (2) apply the learned model to the classification of new data.23,50,51
From the perspective of statistical learning theory, MR images can be regarded as points in a high-dimensional space. In our case, the dimensionality of this space was determined by the optimal number of uncorrelated PCs obtained by PCA. The SVM analysis started with a nonlinear transformation from the low-dimensional space of the individuals' PC loadings to a high-dimensional feature space. Nonlinear kernel transformations may have important advantages over linear mappings because they can handle classification problems with nonlinear relations between class labels and data instances (Figure 1A). We used the radial basis functions kernel because it facilitates the adaptive modeling of the interface between the classes and thus significantly improves classification performance.52,53 Intuitively, the kernel matrix can be regarded as a similarity measure, meaning that data instances sharing similar features form clusters within the feature space.
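The interpretation of the kernel matrix as a similarity measure can be made concrete with a short sketch of the radial basis function kernel in its common form, exp(−γ‖x − y‖²). The value of γ and the example PC loadings below are arbitrary illustrations, not the parameters used in the analysis.

```python
import math

def rbf_kernel(x, y, gamma):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def kernel_matrix(rows, gamma):
    """Pairwise kernel values for a list of feature vectors."""
    return [[rbf_kernel(p, q, gamma) for q in rows] for p in rows]

# Hypothetical PC loadings for three subjects; gamma is an arbitrary choice.
loadings = [[0.0, 0.0], [0.1, 0.0], [3.0, 4.0]]
K = kernel_matrix(loadings, gamma=0.5)
```

In this example the first two subjects have nearly identical loadings and a kernel value close to 1, whereas the third subject is distant from both and has a kernel value close to 0.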
The SVMs implement the principle of structural risk minimization23,50,51 to learn a classification rule that guarantees generalizability to unknown data instances, avoiding both model overfitting and underfitting. Structural risk minimization is achieved through large margin classification, which determines the optimal separating hyperplane (OSH) between the training classes by maximizing the distance between the nearest data instances of opposite classes (Figure 1B). These instances are the support vectors because they show the smallest distance to the OSH. Instances further away from the OSH do not contribute to the discrimination. Thus, the algorithm focuses on subtle between-group differences and not on gross, easily detectable anatomical features.
The OSH can be used to predict the class membership of new data instances. For each new instance, the classifier produces an output consisting of the predicted class membership and the decision value measuring the distance of the new instance to the OSH. These decision values were used for constructing multiclass classifiers in which the class label with the maximum absolute decision value across the 3 binary SVMs decided the class membership (1-vs-1-by-maximum-wins method). We used the LIBSVM software for our SVM analysis.54
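The maximum-wins rule described above can be sketched in a few lines. The sign convention (a positive decision value favors the first class of a pair) and the example decision values are assumptions made for illustration; they are not taken from LIBSVM output.

```python
def max_wins(decisions):
    """Return the label predicted by the binary classifier with the largest |decision value|.

    `decisions` maps a (first_class, second_class) pair to a signed decision value.
    Assumed convention: a positive value favors first_class, a negative one second_class.
    """
    pair, value = max(decisions.items(), key=lambda item: abs(item[1]))
    return pair[0] if value > 0 else pair[1]

# Hypothetical decision values of the 3 binary SVMs for one test subject.
example = {
    ("HC", "ARMS-E"): 0.4,
    ("HC", "ARMS-L"): -1.3,
    ("ARMS-E", "ARMS-L"): -0.2,
}
predicted = max_wins(example)  # the HC vs ARMS-L classifier is most confident
```

Here the HC vs ARMS-L classifier places the subject farthest from its hyperplane, on the ARMS-L side, so that label is assigned.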
Although the SVM classifiers used in our analysis are effective in detecting spatially complex and subtle patterns of neuroanatomical between-group differences, they are difficult to visualize because of the nonlinearity of the classification method.53 Discriminative neuroanatomical patterns were approximated by the following visualization technique: for all binary SVMs, the “nearest-neighbor” support vector pairs were determined by selecting support vectors from opposite classes that were separated by the smallest distance across the OSH. Then, the difference vector for each of these nearest-neighbor support vector pairs was calculated and the arithmetic mean of all difference vectors was formed. Finally, the PCs computed during the PCA procedure were weighted by the mean difference vector and summed to reconstruct the discriminative volume in the original space of the GM-RAVENS maps. For visualization purposes, the complexity of the discriminative patterns was reduced by smoothing the volumes with an isotropic gaussian kernel of 8 mm full width at half maximum. The patterns of the 3 binary classifiers were overlaid on the Montreal Neurological Institute single-subject anatomical template by means of the software package MRIcron (http://www.sph.sc.edu/comd/rorden/mricron/).
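The reconstruction steps of this visualization technique (before smoothing) can be sketched as follows. Two simplifications are assumed: distances between support vectors are measured as Euclidean distances in PC space, and each support vector of the first class is paired with its nearest support vector of the second class. The function name and the synthetic inputs are hypothetical.

```python
import numpy as np

def discriminative_map(sv_first, sv_second, components):
    """Approximate a voxel-space discriminative pattern from support vectors.

    sv_first, sv_second: support vectors of the two classes in PC space.
    components: principal axes, one row per PC, one column per voxel.
    """
    # Pairwise Euclidean distances between support vectors of opposite classes.
    dists = np.linalg.norm(sv_first[:, None, :] - sv_second[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)         # nearest opposite-class partner
    diffs = sv_first - sv_second[nearest]  # one difference vector per pair
    mean_diff = diffs.mean(axis=0)         # arithmetic mean over all pairs
    # Weight each principal component by the mean difference and sum.
    return mean_diff @ components

rng = np.random.default_rng(2)
# 5 orthonormal axes over 100 hypothetical voxels.
components = np.linalg.qr(rng.normal(size=(100, 5)))[0].T
sv_first = rng.normal(loc=1.0, size=(6, 5))
sv_second = rng.normal(loc=-1.0, size=(4, 5))
pattern = discriminative_map(sv_first, sv_second, components)
```

Because the difference is taken as first class minus second class, positive values in the resulting pattern correspond to lower values in the second group, matching the convention that warm colors indicate reductions in the second vs the first group.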
First, 5-fold cross-validation was performed to estimate the generalizability of the classification models.55 To this end, the study population was split into 5 nonoverlapping samples, and each of these was iteratively held back as test data while the classifier was trained on the 4 remaining samples. In each iteration the class membership of the test data instances, which were unseen by the algorithm, was predicted by using the classifier constructed from the training data. Five-fold cross-validation provides a more conservative estimate of generalizability than leave-1-out cross-validation,56 which iteratively predicts the class membership of only 1 test case against the rest of the population.28,56,57 See also eTable 3 and eTable 4. Sensitivity, specificity, accuracy, false-positive rate, and positive (PPV) and negative predictive value (NPV) of cross-validation were computed for all binary and multiclass classifiers. Then, permutation testing was used to estimate the likelihood of obtaining the observed classification performance by chance, meaning that the discriminative pattern between the data happened to correlate with the membership labels as an artifact of small sample sizes.58 To this end, the null distribution of the classification error was constructed for each classifier by performing 5000 random permutations of the membership labels and applying 5-fold cross-validation to each of these permutations. The null hypothesis that the classifier did not predict the test cases' class was rejected at α = .05. Finally, the external validity of both multiclass classifiers was evaluated by predicting the class membership of HCnew.
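The combination of 5-fold cross-validation and permutation testing can be illustrated with a self-contained sketch. To keep it short, a nearest-centroid rule on 1-dimensional synthetic data replaces the SVM, only 200 permutations are drawn instead of 5000, and an add-one correction is applied to the P value; all of these are assumptions of the sketch, not features of the study.

```python
import random

def five_fold_indices(n, rng, k=5):
    """Shuffle indices 0..n-1 and split them into k nonoverlapping folds."""
    order = list(range(n))
    rng.shuffle(order)
    return [order[i::k] for i in range(k)]

def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier (not an SVM): assign x to the class with the closest mean."""
    means = {}
    for label in set(train_y):
        values = [v for v, y in zip(train_x, train_y) if y == label]
        means[label] = sum(values) / len(values)
    return min(means, key=lambda label: abs(x - means[label]))

def cv_error(xs, ys, folds):
    """Cross-validated error: each fold is held out once as test data."""
    errors = 0
    for fold in folds:
        held_out = set(fold)
        train_x = [xs[i] for i in range(len(xs)) if i not in held_out]
        train_y = [ys[i] for i in range(len(ys)) if i not in held_out]
        errors += sum(
            nearest_centroid_predict(train_x, train_y, xs[i]) != ys[i] for i in fold
        )
    return errors / len(xs)

def permutation_p_value(xs, ys, n_permutations, rng):
    """Share of label permutations whose CV error is at most the observed CV error."""
    folds = five_fold_indices(len(xs), rng)
    observed = cv_error(xs, ys, folds)
    hits = 0
    for _ in range(n_permutations):
        shuffled = ys[:]
        rng.shuffle(shuffled)
        if cv_error(xs, shuffled, folds) <= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_permutations + 1)

rng = random.Random(0)
# Two synthetic, well-separated groups of 25 cases each.
xs = [rng.gauss(0, 1) for _ in range(25)] + [rng.gauss(3, 1) for _ in range(25)]
ys = [0] * 25 + [1] * 25
observed_error, p_value = permutation_p_value(xs, ys, n_permutations=200, rng=rng)
```

With shuffled labels the cross-validated error hovers around chance level, so an observed error far below that range yields a small P value and the null hypothesis of no predictive relationship is rejected.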
Table 1 and Table 2 summarize the sociodemographic, clinical, and global anatomical characteristics of the study populations. No significant differences with respect to sex, handedness, years of education, and global brain volumes were found between the HC and ARMS samples. Age also did not differ significantly, except between ARMS-T and ARMS-NT (Table 1). The prevalence of reduced global functioning did not differ between ARMS-E and ARMS-L, but all 15 individuals in the ARMS-T group showed a reduction of 30 points or more on the Global Assessment of Functioning Scale at study inclusion compared with 56% in the ARMS-NT sample (Table 2). The ARMS samples were not significantly different with respect to the prevalence of psychosis in first-degree relatives or prenatal and perinatal complications (Table 2).
No significant differences were detected between ARMS-E and ARMS-L regarding Positive and Negative Syndrome Scale and Montgomery-Åsberg Depression Rating Scale scores (Table 2). The ARMS-T group scored significantly higher on the Positive and Negative Syndrome Scale positive symptoms score and showed a trend toward a lower total Montgomery-Åsberg Depression Rating Scale score. The overall prevalence of cognitive basic symptoms (thought interference, thought perseveration, thought pressure, and thought blockages) was higher in the ARMS-L group than the ARMS-E group and in the ARMS-T group relative to the ARMS-NT group. The ARMS-T group showed a significantly higher prevalence of APSs and BLIPSs than the ARMS-NT group at baseline (Table 2). In the ARMS-T sample, the mean time to transition was 188 days (range, 35-777 days). Thirteen individuals developed psychosis during the first year of follow-up, 1 person in the second year, and 1 person in the third year.
The permutation analysis showed that the classification models produced by all binary and multiclass SVM classifiers of our study were significant at P < .001. See also eTables 1 through 5.
The overall accuracy of the 3-group classifier was 81% (Table 3). Of 25 HC1 individuals, 3 individuals were mislabeled as having ARMS-E and 2 as having ARMS-L (sensitivity, specificity, and accuracy for HC1 vs the rest: 80%, 89%, and 86%, respectively). Two individuals with ARMS-E were mislabeled as having ARMS-L (ARMS-E vs the rest: 90%, 92%, and 91%, respectively). Of 25 individuals with ARMS-L, 5 were misclassified as being in the HC1 group and 1 as having ARMS-E (ARMS-L vs the rest: 76%, 91%, and 86%, respectively). Among the binary classifiers, the highest accuracy of 87% (sensitivity, specificity, PPV, and NPV: 95%, 80%, 79%, and 95%, respectively) was observed for the classification of HC1 vs ARMS-E, followed by 82% (84%, 80%, 84%, and 80%, respectively) for the ARMS-E vs ARMS-L classification and 78% (76%, 80%, 79%, and 77%, respectively) for the HC1 vs ARMS-L classification (Table 4).
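The one-vs-rest figures reported for the 3-group classifier follow directly from the misclassification counts given in the text. The sketch below arranges those counts as a confusion matrix and recomputes sensitivity, specificity, and accuracy; the function name and matrix layout are illustrative choices.

```python
def one_vs_rest(confusion, labels, target):
    """Sensitivity, specificity, and accuracy for `target` vs all other classes.

    confusion[i][j] counts individuals of true class i assigned to class j.
    """
    t = labels.index(target)
    total = sum(sum(row) for row in confusion)
    tp = confusion[t][t]
    fn = sum(confusion[t]) - tp
    fp = sum(row[t] for row in confusion) - tp
    tn = total - tp - fn - fp
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / total

labels = ["HC1", "ARMS-E", "ARMS-L"]
# Rows: true class; columns: predicted class (counts taken from the text).
confusion = [
    [20, 3, 2],  # HC1: 3 mislabeled as ARMS-E, 2 as ARMS-L
    [0, 18, 2],  # ARMS-E: 2 mislabeled as ARMS-L
    [5, 1, 19],  # ARMS-L: 5 mislabeled as HC1, 1 as ARMS-E
]
n_total = sum(sum(row) for row in confusion)
overall = sum(confusion[i][i] for i in range(3)) / n_total  # 57/70, about 81%

for name in labels:
    sens, spec, acc = one_vs_rest(confusion, labels, name)
    print(f"{name} vs the rest: {sens:.0%}, {spec:.0%}, {acc:.0%}")
# HC1 vs the rest: 80%, 89%, 86%
# ARMS-E vs the rest: 90%, 92%, 91%
# ARMS-L vs the rest: 76%, 91%, 86%
```

The same function reproduces the values reported for the second analysis when the corresponding counts for HC2, ARMS-T, and ARMS-NT are entered.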
The discriminative pattern underlying the HC1 vs ARMS-E classification consisted of GM volume reductions within the cerebellum, thalamus, and prefrontal cortex bilaterally (Figure 2A). Further bilateral reductions were found in the medial occipital areas and precuneus, as well as in the lateral temporal lobe, including the middle and superior temporal gyrus and extending into the right supramarginal gyrus. Gray matter volume increments were detected within the inferior temporal lobes and lateral parietal cortices bilaterally. The HC1 vs ARMS-L classifier relied on more extended and pronounced bilateral GM volume reductions in the cerebellum, precuneus, and supplementary motor areas, including the dorsomedial prefrontal cortex and anterior cingulate gyrus (Figure 2B). Further bilateral reductions were identified within the basal ganglia, orbitofrontal cortex, medial temporal lobes, and insula, as well as in the anterior portions of the superior temporal gyrus. The ARMS-E vs ARMS-L classification involved bilateral GM volume reductions occupying the anterior and posterior portions of the cingulate gyrus, the posterior part of the superior and middle temporal gyri with extensions into the inferior parietal lobule, the orbitofrontal and ventrolateral prefrontal cortex, and the cerebellum (Figure 2C).
The overall accuracy of the 3-group classifier was 82% (Table 3). Two of the 17 individuals in the HC2 group were misclassified as having ARMS-T and 1 as having ARMS-NT (sensitivity, specificity, and accuracy for HC2 vs the rest: 82%, 94%, and 90%, respectively). Two individuals with ARMS-T were wrongly assigned to the ARMS-NT group (ARMS-T vs the rest: 87%, 89%, and 88%, respectively). Of the 18 individuals with ARMS-NT, 2 individuals were mislabeled as being in the HC2 group and 2 as having ARMS-T (ARMS-NT vs the rest: 78%, 91%, and 86%, respectively). The binary classification of HC2 vs ARMS-T attained the highest performance (accuracy, sensitivity, specificity, PPV, and NPV: 94%, 100%, 88%, 88%, and 100%, respectively), followed by the classification of HC2 vs ARMS-NT (86%, 78%, 94%, 93%, and 80%, respectively) and ARMS-T vs ARMS-NT (82%, 83%, 80%, 83%, and 80%, respectively) (Table 4). In our additional analysis of the effect of dimensionality reduction on classification performance, skipping the PCA step before classification yielded significantly lower classification accuracies (HC2 vs ARMS-T, 75%; HC2 vs ARMS-NT, 51%; ARMS-T vs ARMS-NT, 70%; and 3-group classification, 60%) (eTables 1 and 2).
The HC2 vs ARMS-NT and HC2 vs ARMS-T classifications relied on similar GM volume reduction patterns occupying the anterior and posterior cingulate cortex; the orbitofrontal, lateral prefrontal, and inferior temporal cortex; and the medial temporal lobe and caudate nuclei bilaterally (Figure 3A and B). The discriminative pattern of HC2 vs ARMS-T was more extended compared with HC2 vs ARMS-NT. Finally, the ARMS-NT vs ARMS-T classifier detected a pattern of GM volume reductions involving the medial, lateral, and inferior temporal cortices, as well as the lateral prefrontal areas, the thalamus, and the cerebellum (Figure 3C).
To our knowledge, this is the first study to evaluate the feasibility of early recognition and disease prediction in individuals with ARMS by using multivariate neuroanatomical pattern classification. We were able to distinguish individuals with ARMS from HCs and to detect their ARMS with high diagnostic accuracy by relying solely on structural between-group differences. Furthermore, our study provided evidence that SVMs could be developed to predict transition to psychosis.
Our method's performance is comparable to that of previous neuroimaging studies that used SVMs for the diagnostic classification of Alzheimer disease, frontotemporal degeneration, and mild cognitive impairment.26,28,49 Furthermore, MR imaging–based SVMs have been successfully applied to the categorization of schizophrenic patients29 and their healthy relatives.27 These studies demonstrate that MR imaging–based SVMs reliably separate different nosological populations at the individual level, suggesting good performance also in subclinical conditions.
In our first SVM analysis (HC1 vs ARMS-E vs ARMS-L), we observed a high cross-validated classification performance with the use of 2- and 3-group classifiers. The individuals with ARMS-L were recruited according to established ultrahigh-risk (UHR) criteria, which were sensitive to an imminent risk of disease transition.7,8,38,41,59 In this context, the cross-sectional and longitudinal clinical data of the ARMS-L sample (Table 1) were comparable to those of other UHR populations.7,8,38,41,59 The high diagnostic accuracy of the ARMS-L classifier suggests that structural patterns involving prefrontal, orbitofrontal, perisylvian, limbic, and cerebellar abnormalities may be associated with an imminent risk of full-blown psychosis in keeping with the results of previous investigations.15-19,21,60 These studies showed that (1) UHR individuals have subtle structural brain abnormalities in brain regions similar to those affected in patients with manifest schizophrenia and (2) subsequent conversion to psychosis may be associated with spatially more extended alterations at baseline and further progressive abnormalities during disease transition. Within this framework, our results suggest that neuroanatomical biomarkers of the UHR state could be integrated in future high-risk studies, as recently proposed.8
Contrary to our initial expectations, the individuals in the ARMS-E group were correctly assigned to their clinical group with an even higher accuracy, which may be due to the greater clinical heterogeneity of the ARMS-L group caused by its significantly higher rate of disease transitions. Common psychopathological criteria are generally not sensitive to the early prodromal state because it largely overlaps with depressive syndromes61 and nonspecific psychopathological phenomena found in the general population.12 However, Klosterkötter et al10 found a conversion rate of nearly 50% in 160 individuals with ARMS selected for an early prodromal state by the presence of perceptual-cognitive “basic symptoms.”62 In contrast, our ARMS-E conversion rate was 5.6%, although 10 highly predictive basic symptoms had been used as intake criteria (Box). This inconsistency may be due to our relatively small ARMS-E sample and the significantly shorter follow-up period. Recent findings suggest that different types of prodromes may exist, with 33% of the converters having prodromal phases lasting more than 6 years.63
Because of the low conversion rate of our ARMS-E sample, we do not know whether MR imaging–based SVMs would facilitate the prediction of a later disease transition in a putatively early prodromal stage of psychosis. Therefore, ARMS-E may rather be conceptualized as a precursor of psychosis64 that is marked by increased disease vulnerability but that does not necessarily lead to a subsequent disease manifestation. Previous MR imaging studies of genetically defined at-risk individuals reported brain abnormalities in limbic and paralimbic structures17,18,21,65 and in temporal and cerebellar regions.17 Nonpsychotic subgroups within these populations may have similar abnormalities, albeit not to the extent found in individuals with ARMS-L.17,66 The discriminative features used by the HC1 vs ARMS-E classifier were consistent with these findings, but alterations of prefrontal, occipital, and thalamic structures may have additionally contributed to the good separability of ARMS-E. Our results suggest that subtle neuroanatomical abnormalities may underlie elevated disease vulnerability, potentially providing a valuable biomarker for the early precursors of psychosis.
Our second analysis showed that SVMs are capable of distinguishing between ARMS-T and ARMS-NT individuals based on patterns of structural abnormalities present before the onset of psychosis. Lateral and medial temporal abnormalities were found to separate converters from nonconverters, in keeping with previous voxel-based morphometric studies, which repeatedly showed that disease transition was associated with abnormalities of the perisylvian brain regions, the limbic and paralimbic areas, and the anterior cingulate cortex at baseline19 and over time.16,17,19,60
To date, only 1 study has evaluated the feasibility of MR imaging–based psychosis prediction in a genetically defined ARMS population.67 The authors found that longitudinal GM density reductions in the inferior temporal gyrus predicted subsequent disease manifestation with a PPV of 60% and an NPV of 92%. These findings are not comparable to those of our study because the authors investigated longitudinal alterations in 3 regions of interest by using a univariate statistical procedure. Consistent with previous studies,26,28,29 the high discriminative power of our classification method indicates that multivariate whole-brain techniques have significant advantages over region-of-interest methods because they capture the correlations among morphological features across the entire brain. Furthermore, our tool may be more suitable for clinical applications because follow-up images are not needed, thus allowing for a rapid assessment of at-risk individuals. This is crucial for early recognition because the current UHR criteria may be most sensitive to an imminent risk of disease transition.8
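For readers less familiar with the predictive values quoted above, the following minimal Python sketch shows how sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) are derived from the counts of a 2 × 2 confusion matrix. The function name and the counts are hypothetical. The counts were chosen only so that the arithmetic yields a PPV of 60% and an NPV of 92%, and they are not the data of the cited study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV, and NPV from confusion-matrix counts.

    tp/fn: converters predicted as converters / as nonconverters
    tn/fp: nonconverters predicted as nonconverters / as converters
    Assumes every denominator is nonzero.
    """
    return {
        "sensitivity": tp / (tp + fn),  # proportion of true converters detected
        "specificity": tn / (tn + fp),  # proportion of true nonconverters detected
        "ppv": tp / (tp + fp),          # probability that a positive prediction is correct
        "npv": tn / (tn + fn),          # probability that a negative prediction is correct
    }


# Hypothetical counts for illustration only (not taken from any study).
metrics = diagnostic_metrics(tp=6, fp=4, tn=46, fn=4)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Unlike sensitivity and specificity, the PPV and NPV depend on the proportion of converters in the sample, so they cannot be compared directly across populations with different transition rates.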
Our results suggest that neuroanatomical pattern recognition techniques may add further diagnostic reliability to multivariate algorithms8,9 using clinical data for early recognition and disease prediction. In this regard, the high sensitivity observed in our classification analyses may compensate for the lower levels of sensitivity observed in the recently proposed clinical Cox regression models.8,9 Furthermore, MR imaging–based diagnostic techniques may be more widely applicable because they (1) do not depend on highly specialized mental health services, (2) do not rely on unstable clinical measurements, and (3) potentially are not confined to ARMS populations that can be segregated by means of multivariate patterns of clinical data, meaning that these individuals are already considerably ill at clinical examination. Nevertheless, the operationalized psychopathological criteria underlying our ARMS-L definition facilitated the recruitment of a sample with 70% subsequent disease transitions.
The high accuracy of our classification technique was obtained by 5-fold cross-validation, which provides a conservative measure of generalizability because it iteratively tests an independent one-fifth of the population against the rest of the data. This is further supported by the reliable classification of HCnew subjects. Both validation methods suggest that the early detection of individuals in different ARMSs of psychosis and the segregation of at-risk individuals who can expect a highly probable future disease manifestation could be achieved by SVM-based neurodiagnostic tools. However, this promising perspective does not imply that an individual admitted to the clinic with an ARMS-like pattern of psychopathological symptoms could be diagnosed with an equally high level of accuracy. This is because we used classifiers trained solely to categorize HCs and different ARMSs of psychosis as well as to predict a subsequent disease manifestation. It is unknown how strongly neuroanatomical “signatures” of other psychiatric disorders such as depression, bipolar disorder, or obsessive-compulsive disorder overlap with the discriminative patterns observed in different ARMSs of psychosis. Therefore, cross-nosological multigroup classifiers based on an SVM “library”28 of different neuropsychiatric disorders and their prodromal stages may help to reliably differentiate the ARMS from other conditions.
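The logic of 5-fold cross-validation described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names are hypothetical, the folds are not stratified by group, the data are a toy 1-dimensional example, and a nearest-class-mean rule stands in for the SVM so that the example needs only the standard library.

```python
import random


def k_fold_splits(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is held out exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds of near-equal size
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def cross_validated_accuracy(X, y, fit, predict, k=5, seed=0):
    """Train on k-1 folds, test on the held-out fold, and pool the results."""
    correct = 0
    for train, test in k_fold_splits(len(X), k, seed):
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, X[i]) == y[i] for i in test)
    return correct / len(X)


# Stand-in classifier (an assumption for illustration, not an SVM):
# assign each case to the class whose training mean is closest.
def fit_nearest_mean(X, y):
    means = {}
    for label in set(y):
        values = [x for x, lab in zip(X, y) if lab == label]
        means[label] = sum(values) / len(values)
    return means


def predict_nearest_mean(means, x):
    return min(means, key=lambda label: abs(x - means[label]))


# Toy, well-separated data: 5 cases per group, one feature each.
X = [0.1, 0.2, 0.3, 0.4, 0.5, 1.1, 1.2, 1.3, 1.4, 1.5]
y = [0] * 5 + [1] * 5
accuracy = cross_validated_accuracy(X, y, fit_nearest_mean, predict_nearest_mean)
print(f"cross-validated accuracy: {accuracy:.2f}")
```

Because the classifier is refitted in every iteration without access to the held-out fold, the pooled accuracy estimates performance on unseen cases rather than on the training data.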
Finally, further limitations have to be considered. First, an independent replication of our results is needed in larger ARMS populations. Second, our method may not be applicable as a general population screening procedure because we examined help-seeking individuals who overwhelmingly showed subclinical symptoms to a certain degree. Thus, it is unknown whether the method could distinguish completely asymptomatic, genetically defined high-risk individuals from HCs. Furthermore, it is unclear whether the method can be generalized across different MR imaging equipment. A recent study showed that SVMs reliably classified patients with Alzheimer disease even if training and test data were acquired from different imagers.28 However, imager-induced noise will probably have a greater effect on the generalizability of the subtle neuroanatomical patterns underlying the ARMSs of psychosis.
In summary, our results suggest that SVM-based neuroanatomical pattern recognition techniques may substantially improve early-detection approaches that currently depend entirely on clinical information. Future projects may examine whether multimodal diagnostic tools integrating clinical, neuropsychological, neuroanatomical, and genetic markers could detect the ARMSs of psychosis and predict disease transition to a level of accuracy allowing for the preventive treatment of the disorder.
Correspondence: Eva M. Meisenzahl, MD, Clinic of Psychiatry and Psychotherapy, Ludwig-Maximilians-University, Nussbaumstrasse 7, 80336 Munich, Germany (Eva.Meisenzahl@med.uni-muenchen.de).
Submitted for Publication: July 22, 2008; final revision received November 12, 2008; accepted December 29, 2008.
Author Contributions: Drs Koutsouleris and Meisenzahl had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of data analysis. Drs Koutsouleris and Meisenzahl contributed equally.
Financial Disclosure: None reported.
Funding/Support: No funding was provided for the acquisition of MR imaging data. The Faculty of Medicine of the Ludwig-Maximilians-University (grant FöFoLe-444) provided financial support for the recruitment and clinical evaluation of the prodromal individuals. Furthermore, the development of methodological procedures used in this study was supported by the Federal Ministry for Education and Research research grants 01 EV0709 and 01 GW0740 (Dr Gaser).
Role of the Sponsors: The funding sources had no involvement in the study design, the collection and analysis of the data, or the writing of the manuscript.
Additional Contributions: Reinhold Bader, PhD (Linux Cluster Systems for the Munich and Bavarian Universities), integrated the VBM5 and HAMMER/RAVENS algorithms into the batch system of the Linux cluster; Fan Yong, MSc, and Xiaoying Wu, MSc (Section of Biomedical Imaging Analysis, University of Pennsylvania), provided excellent methodological support regarding the implementation of the HAMMER/RAVENS algorithms; Chih-Jen Lin, PhD (National Taiwan University, Taiwan), helped adjust the LIBSVM software to the needs of neuroimaging analysis; and L. J. P. van der Maaten, MSc (Maastricht ICT Competence Centre, Universiteit Maastricht, Maastricht, the Netherlands), provided support regarding the use of the dimensionality reduction toolbox.