A, The mean factor loadings from each subgroup within the discovery sample corresponding to functioning and quality of life (factor 1), suicide history (factor 2), depression symptoms and deficits (factor 3), environmental risk and male sex (factor 4), and psychosis symptoms and deficits (factor 5). B, The factor loadings for each subgroup, showing the discovery sample loadings (dark shade) and the loadings obtained when the discovery sample factor solution was applied without modification to the replication sample (light shade). For the replication sample, the subgroup assignments were independent of the calculation of the sparse nonnegative matrix factorization components in the discovery sample.
Linear mixed-model quadratic trend analyses corrected for multiple comparisons (false-discovery rate correction). A, Psychosis symptoms as measured by the Positive and Negative Symptom Scale (PANSS) demonstrated significant differences in quadratic trends of the severe psychosis and depressive psychosis subgroups. B, Depressive symptoms as measured by the Inventory of Depressive Symptomatology (IDS-C30) demonstrated differences in the quadratic trend of the depressive psychosis subgroup compared with all other subgroups. C, General functioning as measured with the clinician-reported Global Assessment of Functioning (GAF) demonstrated differences between the severe psychosis subgroup and all other subgroups except the depressive psychosis subgroup. D, Quality of life as measured with the self-reported World Health Organization Quality of Life Questionnaire, brief version (WHOQoL-BREF), indicated differences between the depressive psychosis subgroup and both the affective psychosis and the higher-functioning psychosis subgroups. Data points indicate mean scores; error bars indicate SEs; lines connecting data points indicate fitted quadratic trends; and brackets indicate significant differences in quadratic trends. See eTables 9 and 10 in the Supplement for detailed statistics.
Differences based on genome-wide association studies for schizophrenia, bipolar disorder, major depressive disorder, and educational attainment calculated using analysis of covariance. For each polygenic score, 10 values are presented that reflect commonly used P value cutoff values56 for single-nucleotide polymorphisms included in making the score. P values are displayed for the highest significant effect sizes across cutoff values for each polygenic score.56 A, When comparing the subgroups, the highest effect sizes were found for the education polygenic score, which was significant (uncorrected P < .05) across all P value cutoff values except for P < .0001 and P < .01. No other polygenic scores were significant across any threshold. B, When comparing diagnostic subgroups, significant differences were found for the schizophrenia (except for the P < .001 cutoff) and bipolar disorder (except for the P < .05 cutoff) polygenic scores but not for major depressive disorder or educational attainment. For further details, see eFigures 5 and 6 in the Supplement.
eMethods 1. Recruitment and sample characteristics.
eMethods 2. Baseline measures.
eMethods 3. Analysis overview.
eMethods 4. Participant and feature filtering.
eMethods 5. Clustering analyses.
eMethods 6. Genotyping and calculation of schizophrenia polygenic risk scores.
eMethods 7. Longitudinal analyses using mixed models.
eMethods 8. Identification of critical variables and replication analyses.
eResults 1. Subgroup determination.
eResults 2. Factor solutions.
eResults 3. Site and experimental rater effects.
eResults 4. Validation of subgroups: further explanation.
eResults 5. Supplementary analysis: clustering stability for different imputation settings.
eResults 6. Supplementary analysis: analysis of separability based on cognitive variables.
eResults 7. Supplementary analysis: analysis of differences in lifetime illness course.
eResults 8. Supplementary analysis: analysis of missing participants in longitudinal data and mixed models.
eFigure 1. Analysis flowchart and overview.
eFigure 2. Sparse nonnegative matrix factorization (sNMF) consensus clustering results.
eFigure 3. Proportion of diagnoses across the subgroups.
eFigure 4. Illness course comparisons of discovery and validation cohorts.
eFigure 5. Effect sizes of polygenic scores differentiating subgroups.
eFigure 6. Effect sizes of polygenic scores differentiating diagnostic groups.
eFigure 7. Violin plots of education polygenic scores.
eFigure 8. Genetic ancestry overlap with the European reference population.
eFigure 9. Günzburg site exclusion factor and consistency matrix results.
eFigure 10. Repetition of validation analyses after excluding infectious diseases variable.
eFigure 11. Factor matrices of clustering solution across different K-nearest neighbors.
eFigure 12. Feature importance for the supplementary analysis of cognition.
eFigure 13. Assessment of lifetime illness course using the Operational Criteria Checklist for Psychotic Illness and Affective Illness (OPCRIT).
eBox. Abbreviations used throughout supplementary materials.
eTable 1. Unfiltered features originally selected for analyses.
eTable 2. Variables excluded from clustering analyses.
eTable 3. Comparisons of discovery sample with excluded participants and the validation sample.
eTable 4. Top 10 features and mean factor weights.
eTable 5. Differences between the sparse nonnegative matrix factorization–derived subgroups across six clinical domains.
eTable 6. Differences between sparse nonnegative matrix factorization–derived subgroups (schizophrenia only).
eTable 7. Proportions (No. [%]) of individuals in each group across PsyCourse sites.
eTable 8. Site, rater, and site × rater analysis of variance analyses.
eTable 9. Mixed-model analysis of illness course.
eTable 10. Post hoc analysis of mixed-model quadratic trends.
eTable 11. Multigroup classification performance in the discovery set.
eTable 12. Differences between subgroups in the validation sample across 6 clinical domains.
eTable 13. Classification of the discovery and replication samples for each subgroup.
eTable 14. Somatic variables requiring exclusion for the Günzburg replacement analyses.
eTable 15. Günzburg exclusion and replacement clinical comparison table.
eTable 16. Günzburg exclusion site comparison table.
eTable 17. Günzburg site replacement site comparison table.
eTable 18. Supervised learning classification of subgroups using cognitive variables.
eTable 19. Total number of participants across time points for each subgroup.
eTable 20. Mixed-model analyses controlling for missing data.
eTable 21. Post hoc analysis of mixed models controlling for missing data.
Dwyer DB, Kalman JL, Budde M, et al. An Investigation of Psychosis Subgroups With Prognostic Validation and Exploration of Genetic Underpinnings: The PsyCourse Study. JAMA Psychiatry. 2020;77(5):523–533. doi:10.1001/jamapsychiatry.2019.4910
Will data-driven clustering using high-dimensional clinical data reveal psychosis subgroups with relevance to prognoses and polygenic risk?
In this cohort study including 1223 individuals, in the discovery sample of 765 individuals with predominantly bipolar and schizophrenia diagnoses, 5 subgroups were detected with different clinical signatures, illness trajectories, and genetic scores for educational attainment. Results were validated in a sample of 458 individuals.
New data-driven clustering paired with rigorous validation may offer a means to extend symptom-based psychosis taxonomies toward functional outcomes, genetic markers, and trajectory-based stratifications.
Identifying psychosis subgroups could improve clinical and research precision. Research has focused on symptom subgroups, but there is a need to consider a broader clinical spectrum, disentangle illness trajectories, and investigate genetic associations.
To detect psychosis subgroups using data-driven methods and examine their illness courses over 1.5 years and polygenic scores for schizophrenia, bipolar disorder, major depressive disorder, and educational achievement.
Design, Setting, and Participants
This ongoing multisite, naturalistic, longitudinal (6-month intervals) cohort study began in January 2012 across 18 sites. Data from a referred sample of 1223 individuals (765 in the discovery sample and 458 in the validation sample) with DSM-IV diagnoses of schizophrenia, bipolar affective disorder (I/II), schizoaffective disorder, schizophreniform disorder, and brief psychotic disorder were collected from secondary and tertiary care sites. Discovery data were extracted in September 2016 and analyzed from November 2016 to January 2018, and prospective validation data were extracted in October 2018 and analyzed from January to May 2019.
Main Outcomes and Measures
A clinical battery of 188 variables measuring demographic characteristics, clinical history, symptoms, functioning, and cognition was decomposed using nonnegative matrix factorization clustering. Subtype-specific illness courses were compared with mixed models and polygenic scores with analysis of covariance. Supervised learning was used to replicate results in validation data with the most reliably discriminative 45 variables.
Of the 765 individuals in the discovery sample, 341 (44.6%) were women, and the mean (SD) age was 42.7 (12.9) years. Five subgroups were found and labeled as affective psychosis (n = 252), suicidal psychosis (n = 44), depressive psychosis (n = 131), high-functioning psychosis (n = 252), and severe psychosis (n = 86). Illness courses with significant quadratic interaction terms were found for psychosis symptoms (R2 = 0.41; 95% CI, 0.38-0.44), depression symptoms (R2 = 0.28; 95% CI, 0.25-0.32), global functioning (R2 = 0.16; 95% CI, 0.14-0.20), and quality of life (R2 = 0.20; 95% CI, 0.17-0.23). The depressive and severe psychosis subgroups exhibited the lowest functioning and quadratic illness courses with partial recovery followed by reoccurrence of severe illness. Differences were found for educational attainment polygenic scores (mean [SD] partial η2 = 0.014 [0.003]) but not for diagnostic polygenic risk. Results were largely replicated in the validation cohort.
Conclusions and Relevance
Psychosis subgroups were detected with distinctive clinical signatures and illness courses and specificity for a nondiagnostic genetic marker. New data-driven clinical approaches are important for future psychosis taxonomies. The findings suggest a need to consider short-term to medium-term service provision to restore functioning in patients stratified into the depressive and severe psychosis subgroups.
Schizophrenia and bipolar diagnoses group patients on the basis of shared patterns of psychiatric history, symptoms, and illness courses.1,2 The categories drive research and clinical practice,3 despite evidence of symptomatic ambiguity,4,5 heterogeneous illness courses,6 and overlapping genetic risk profiles.7-9 The existing taxonomy has been questioned as a result,10,11 and new psychosis configurations are under consideration.3,12-15
Novel symptom subgroups15-19 or dimensions13-15,20-22 have been proposed using unbiased statistical approaches, but assessing symptoms alone does not account for the complexity of disease phenotypes involving the patients’ history, illness course, cognition, or daily functioning.1,12,20 These factors were important to historical taxonomic formulations,1,23 critical to subgroup hypotheses (eg, deficit schizophrenia12,21,22), and have biological associations suggestive of a genetic23,24 or brain25 diathesis. While some unbiased studies have included such additional information,26 the breadth and depth of measures do not match the detail of naturalistic clinical assessments that originally defined psychosis categories.1,26 If diagnoses are to be reconsidered, then it is important to assess a similarly broad range of variables.
New clustering methods27-30 provide an opportunity to define subgroups based on high-dimensional data31 rather than biasing analyses by focusing on single domains. Fundamentally, the methods use computational approaches to define stable, separable, and interpretable subgroup solutions. Thus, the first aim of this study was to adapt clustering methods used in oncology27-30 to identify psychosis subgroups with a highly representative phenomenological battery of variables in a sample of individuals with established diagnoses of psychotic disorders (mainly schizophrenia and bipolar disorders). Rather than prespecifying variables, a data-driven approach was used to refine a large battery that would ideally be collected during comprehensive clinical assessments for clinical management and treatment planning.
Longitudinal studies of illness course were also critical to traditional psychosis nosologies1,32 and are an essential component of a clinically valid and useful diagnosis.11 However, with some notable exceptions,33 studies have been mostly confined to trajectories based on diagnostic groups,34-36 symptoms,33,34 or general functioning36,37 with long follow-up periods that obscure illness dynamics relevant in a clinical setting where decisions need to be made over the course of weeks or months rather than years. As such, the second aim of the study was to validate the subgroups using longitudinal data collected in 6-month intervals over 1.5 years based on symptom and functioning measures.
Evidence of specific genetic signatures has critically informed psychosis categorizations3,9,35,37,38 and dimensions.9,39-41 For example, polygenic risk for schizophrenia is positively related to psychotic symptoms,9,39-41 treatment resistance,24 and hospitalizations.42 Recently, genetic risk scores for reduced educational attainment have also suggested a cognitive disorder subtype of schizophrenia.40 However, to our knowledge, such genetic risk scores have not been compared in diagnostically mixed subgroups derived from unbiased statistical clustering. Thus, the third aim of the study was to investigate how the subgroups differed based on polygenic scores for schizophrenia, bipolar disorder, major depressive disorder, and educational attainment.
Based on the established psychiatric nosologies as well as prior research,12,22 we expected to find a subgroup of individuals with nonaffective diagnoses and an early illness onset, sustained poor functioning, persistent symptoms, high schizophrenia polygenic scores, and low educational attainment scores. On the other end of the spectrum, we expected to see a subgroup of individuals with affective psychosis characterized by less severe symptoms, high bipolar polygenic risk, and retained functioning.43
Participants were included from an ongoing multisite, naturalistic, longitudinal cohort study being conducted in 18 clinical sites in Germany and Austria beginning in January 2012 (Pathomechanisms and Signature in the Longitudinal Course of Psychosis [PsyCourse]44; http://www.psycourse.de; SCHU 1603/4-1, 5-1, 7-1). Adult participants 18 years and older were identified based on referrals from the clinical staff or by querying patient registries (eMethods 1 in the Supplement). The following diagnoses were included using the Structured Clinical Interview for DSM-IV, Axis I Disorders (SCID-I45): recurrent depressive disorder, bipolar II disorder, bipolar I disorder, schizoaffective disorder, brief psychotic disorder, schizophreniform disorder, and schizophrenia. Individuals were excluded if they did not meet DSM-IV diagnostic criteria, had insufficient language ability, or were intellectually impaired. The study protocol was ethically approved by independent committees at each site, all participants gave written informed consent, and raters were trained.
The present study divided the sample into a discovery set consisting of 765 individuals, after excluding 126 individuals with 25% or more missing baseline data (eMethods 4 in the Supplement), from the PsyCourse version 1.0 data release (September 2016) and an unmatched validation data set consisting of an additional 458 individuals from the PsyCourse version 3.0 data release (October 2018). Data releases were randomly based on the convenience sample acquisition rate and data processing times. Individuals excluded because of missing data from the discovery cohort had higher psychotic and depressive symptoms, and the validation cohort included a higher percentage of individuals with recurrent depression (eTable 3 in the Supplement). For details, see eMethods 1 and 4 in the Supplement and Budde et al.44
The analysis overview is explained in eMethods 3 and eFigure 1 in the Supplement. A data-driven approach was used to select baseline variables. From an initial inclusive list of 230 variables that included single test items (eMethods 2 and eTable 1 in the Supplement), variables were excluded if more than 95% of values were equal (n = 10) or more than 25% of values were missing (n = 32) (eTable 2 in the Supplement), before being scaled (0, 1) and imputed using a nearest-neighbor method46 (3.5% imputed). This resulted in a total of 188 remaining baseline variables (eTable 1 in the Supplement) assessing the domains of medical history (eg, family history, hospitalizations), symptoms (eg, psychosis, suicidality), cognition (eg, attention, speed, working memory, verbal IQ), and functioning (eg, self-reported and clinician-reported). Specific variables were selected a priori to examine longitudinal courses (eMethods 7 in the Supplement), including the Positive and Negative Symptom Scale (PANSS),47 the Inventory of Depressive Symptomatology (IDS-C30),48 the Young Mania Rating Scale (YMRS),49 the World Health Organization Quality of Life Questionnaire, brief version (WHOQoL-BREF),50 and the Global Assessment of Functioning (GAF).51
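The variable filtering and scaling described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's MATLAB code: the function names are hypothetical, the "more than 95% of values equal" rule is assumed to be computed over observed (non-missing) values, and the nearest-neighbor imputation step is deliberately left out.

```python
from typing import Dict, List, Optional

Column = List[Optional[float]]  # None marks a missing value


def keep_variable(values: Column, max_equal: float = 0.95,
                  max_missing: float = 0.25) -> bool:
    """Return True if a variable passes both exclusion filters."""
    n = len(values)
    observed = [v for v in values if v is not None]
    if not observed or (n - len(observed)) / n > max_missing:
        return False  # more than 25% of values missing
    # Assumption: near-constancy is judged on observed values only.
    most_common = max(observed.count(v) for v in set(observed))
    return most_common / len(observed) <= max_equal


def scale_unit(values: Column) -> Column:
    """Min-max scale observed values to [0, 1]; keep missing values as None."""
    observed = [v for v in values if v is not None]
    lo, hi = min(observed), max(observed)
    span = hi - lo
    return [None if v is None else ((v - lo) / span if span else 0.0)
            for v in values]


def preprocess(table: Dict[str, Column]) -> Dict[str, Column]:
    """Drop filtered variables, then scale the remaining ones."""
    return {name: scale_unit(col)
            for name, col in table.items() if keep_variable(col)}
```

Applied to the 230 initial variables, filters of this kind would remove near-constant and sparsely observed items before clustering; imputation would follow on the scaled values.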
We aimed to find stable, interpretable, and clinically separable subgroups by adapting and extending a novel clustering approach (nonnegative matrix factorization [NMF] consensus clustering) that has been successfully used in oncology27-30 to the 188 variables included at the baseline time point (eMethods 5 in the Supplement). The technique reduces data into parsimonious factors, selects clusters based on their stability, and can identify nonlinear and non-Gaussian boundaries.31 MATLAB release R2015b (MathWorks) was used for analyses, and code is available on request.
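The general idea of NMF consensus clustering can be illustrated with the following sketch. It rests on several assumptions: it uses plain NMF with multiplicative updates rather than the sparse variant used in the study, it assigns each participant to the factor with the largest coefficient, and all function names and default settings are hypothetical. Entries of the consensus matrix close to 0 or 1 indicate that two participants are assigned consistently across random restarts, which is one way to express cluster stability.

```python
import random
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def nmf(v: Matrix, k: int, iters: int, rng: random.Random) -> Matrix:
    """Approximate v (features x participants) as w @ h; return h (k x participants)."""
    n_feat, n_part = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n_feat)]
    h = [[rng.random() + 0.1 for _ in range(n_part)] for _ in range(k)]
    eps = 1e-9  # avoids division by zero
    for _ in range(iters):
        # Multiplicative updates for the squared-error objective.
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_part)]
             for i in range(k)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n_feat)]
    return h


def consensus_matrix(v: Matrix, k: int, runs: int = 20, iters: int = 200,
                     seed: int = 0) -> Matrix:
    """Fraction of random restarts in which each pair of participants co-clusters."""
    rng = random.Random(seed)
    n_part = len(v[0])
    counts = [[0.0] * n_part for _ in range(n_part)]
    for _ in range(runs):
        h = nmf(v, k, iters, rng)
        # Assign each participant to the factor with the largest coefficient.
        labels = [max(range(k), key=lambda f: h[f][p]) for p in range(n_part)]
        for a in range(n_part):
            for b in range(n_part):
                if labels[a] == labels[b]:
                    counts[a][b] += 1.0
    return [[c / runs for c in row] for row in counts]
```

In practice the number of factors k would be chosen by comparing the stability of consensus matrices across candidate values; that selection step is not shown here.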
Samples were genotyped using the Infinium CoreExome-24+Human PsyChip Consortium, versions 1.0 and 1.1 (Illumina). Standard procedures were used to calculate polygenic risk scores (PRSs) based on the latest summary statistics of genome-wide association studies of schizophrenia,52 bipolar disorder,53 major depressive disorder,54 and educational attainment.55 Genetic risk burden was calculated for each individual at 10 commonly used56 PRS thresholds between P < .0005 and P > .99 with established methods57 (eMethods 6 and eFigure 8 in the Supplement).
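As a rough illustration of thresholded polygenic scoring, the sketch below sums effect-size-weighted allele dosages over the variants whose GWAS P value falls below a chosen cutoff. It is a simplified assumption of how such scores are commonly built and omits steps such as quality control and linkage disequilibrium pruning; the `Snp` type, the function name, and the variant identifiers in the usage note are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Snp(NamedTuple):
    snp_id: str
    beta: float     # effect size from GWAS summary statistics
    p_value: float  # association P value from the GWAS


def polygenic_score(dosages: Dict[str, float], snps: List[Snp],
                    p_cutoff: float) -> float:
    """Weighted sum of allele dosages over SNPs passing the P value cutoff."""
    return sum(s.beta * dosages.get(s.snp_id, 0.0)
               for s in snps if s.p_value < p_cutoff)
```

Calling `polygenic_score` once per cutoff for the same individual yields the kind of 10-value profile that is compared across subgroups, with more lenient cutoffs including more variants.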
Subgroups were clinically characterized by investigating NMF scores and comparing subgroups across baseline variables (using analysis of variance and χ2 tests; MATLAB release R2015b; significance set at a false-discovery rate–corrected 2-tailed P value less than .05) based on previous research.12,22,58 Illness course was investigated over the 3 longitudinal time points at 6-month intervals using mixed models (R version 3.6.0 [The R Foundation]; significance set at a false-discovery rate–corrected 2-tailed P value less than .05) (eMethods 7 in the Supplement). Polygenic risk scores were analyzed with analysis of covariance, using the first 4 ancestry principal components (from principal component analysis) as covariates and testing PRS subgroup differences across the 10 selected thresholds (SPSS version 25 [IBM]; significance set at a 2-tailed P value less than .05).9,39
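The false-discovery rate correction mentioned above can be sketched as follows. The source does not state which procedure was used, so this example assumes the widely used Benjamini-Hochberg step-up method; the function name is hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A comparison would then be called significant when its adjusted value is below .05, matching the threshold stated in the text.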
Using 188 baseline features to determine subgroups enhances the similarity to clinical reality but limits replicability and clinical utility. To simultaneously address these limitations and validate the subgroups, we conducted a separate supervised machine learning analysis with NeuroMiner (https://github.com/neurominer-git; MATLAB release 2017a) (eMethods 8 in the Supplement) to (1) reduce dimensionality by building a subgroup classifier using the top 10 most highly weighted features from each NMF factor in the discovery sample and (2) apply the models to the 458 individuals from the validation cohort to assign subgroup labels. Using the assigned subgroup labels, we then determined the NMF factor loadings for each subgroup, clinically compared the subgroups, analyzed PRS differences, and applied the mixed-model trajectory models (eMethods 8 in the Supplement).
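The feature-reduction step, selecting the most highly weighted features from each factor, can be sketched as below. This is an illustrative assumption about the selection logic, not NeuroMiner code; the function name and the data layout (feature name mapped to one weight per factor) are hypothetical.

```python
from typing import Dict, List


def top_features(weights: Dict[str, List[float]], n_top: int = 10) -> List[str]:
    """Union of the n_top highest-weighted features per factor, in selection order."""
    n_factors = len(next(iter(weights.values())))
    selected: List[str] = []
    for f in range(n_factors):
        ranked = sorted(weights, key=lambda name: weights[name][f], reverse=True)
        for name in ranked[:n_top]:
            if name not in selected:  # a feature shared by two factors counts once
                selected.append(name)
    return selected
```

The resulting reduced feature list would then serve as the input to a supervised classifier trained on the discovery-sample subgroup labels.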
Among the 765 individuals in the discovery sample, 341 (44.6%) were women, and the mean (SD) age was 42.7 (12.9) years (Table 1). These individuals received DSM-IV diagnoses of recurrent major depression (n = 2 [0.3%]), bipolar II disorder (n = 60 [7.8%]), bipolar I disorder (n = 256 [33.4%]), schizoaffective disorder (n = 73 [9.5%]), brief psychotic disorder (n = 6 [0.8%]), schizophreniform disorder (n = 10 [1.3%]), and schizophrenia (n = 358 [46.8%]). All patients were included in longitudinal mixed-model analyses and had an average (SD) of 2.5 (0.1) follow-up assessments. A total of 458 patients were included in the validation sample at baseline (eTable 3 in the Supplement), and 453 were included in longitudinal analyses (5 with completely missing data were excluded), with an average (SD) of 1.8 (0.1) follow-up assessments.
Five subgroups were identified (Figure 1; eResults 1 and eFigure 2 in the Supplement), as mediated by a factor solution composed of coherent mixtures of demographic, diagnostic, treatment, symptom, and functioning questionnaire items (Table 2; eResults 2 and eTable 4 in the Supplement). Broadly, the 5 factors could be summarized as representing quality of life, suicide history, depression symptoms and deficits, environmental risk and male sex, and psychosis symptoms and deficits. The clustering approach27-30 links the factorization with the subgroup determination (eMethods 5 in the Supplement); thus, the 5 subgroups preferentially loaded on each factor.
Subgroups were ordered based on the percentage of individuals with a diagnosis of schizophrenia (eFigure 3 in the Supplement) and interpreted in reference to their association with the factor scores (Figure 1) and a battery of commonly used variables (Table 1; eTable 5 in the Supplement). The first subgroup (n = 252) was labeled as the affective psychosis subgroup and was associated with a mean (SD) age at first inpatient treatment of 35.6 (13.0) years, female sex, mild symptom severity, and high levels of functioning and education. In contrast, subgroup 5 (n = 86) was labeled as the severe psychosis subgroup, as it contained patients with schizophrenia diagnoses and substantially lower educational achievement (with 41 [48%] having less than 12 years of schooling), low verbal intelligence, male sex, high symptoms of psychosis (but not of depression or mania), and low GAF scores. Supplementary analyses also showed that this was the only subgroup that could be identified when using cognitive variables (eResults 6, eFigure 12, and eTable 18 in the Supplement).
The remaining subgroups could be distinguished between the extremes of high-functioning affective psychosis and severe psychosis (Table 1; eTable 5 in the Supplement). Subgroup 2 (n = 44) was labeled as suicidal psychosis since it was most distinguishable by a high loading on the suicide factor (eFigure 2 in the Supplement), but a high percentage of women (70% [31 of 44]) and moderate symptoms/functioning was also notable. Subgroup 3 (n = 131) was labeled as the depressive psychosis subgroup because it loaded highly on the depressive factor, and their depressive symptoms were double that of the next highest subgroup (Table 1). Subgroup 4 (n = 252) was labeled as the high-functioning psychosis subgroup, which consisted of predominantly men (76.2% [192 of 252]) with relatively low symptom levels and relatively high global functioning.
Supplementary analyses were also conducted. To control for diagnosis, we investigated subgroup differences only in 358 individuals with schizophrenia and found similar results (eTable 6 in the Supplement). Site differences between subgroups were found (eTable 7 in the Supplement), but further analyses reduced the possibility of systematic rater and site biases (eResults 3, eFigures 9 and 10, and eTables 8, 14, 15, 16, and 17 in the Supplement). Factor solutions were stable when preprocessing parameters were changed (eResults 5 and eFigure 11 in the Supplement).
Mixed models containing subgroup, linear, quadratic, and interaction terms were found for PANSS (R2 = 0.41; 95% CI, 0.38-0.44), IDS-C30 (R2 = 0.28; 95% CI, 0.25-0.32), GAF (R2 = 0.16; 95% CI, 0.14-0.20), and WHOQoL-BREF (R2 = 0.20; 95% CI, 0.17-0.23) (Figure 2; eTable 9 in the Supplement). The interaction of subgroup with quadratic trends significantly improved the models and was analyzed in post hoc analyses (eTable 10 in the Supplement). Pairwise tests of quadratic trends revealed increases in the severe psychosis subgroup (PANSS and GAF) and the depressive psychosis subgroup (PANSS, IDS-C30, GAF, and WHOQoL-BREF) (Figure 2). Supplementary analyses demonstrated that the quadratic illness course of the severe psychosis subgroup occurred against a background of long-term ongoing illness (eResults 7 and eFigure 13 in the Supplement). Attrition was noted, but controlling for this variable did not affect longitudinal estimates (eResults 8 and eTables 20 and 21 in the Supplement).
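One way to write a model with subgroup, linear, quadratic, and interaction terms is given below. This is a sketch under stated assumptions rather than the exact specification from the Supplement: a participant-level random intercept is assumed, one subgroup serves as the reference, and all symbols are introduced here for illustration. The outcome for participant i at time t is y_{it}, s_i is the participant's subgroup, u_i is the random intercept, and the delta_g coefficients carry the subgroup-by-quadratic interaction that the post hoc pairwise tests compare.

```latex
\[
y_{it} = \beta_0 + \beta_1 t + \beta_2 t^2
       + \sum_{g=2}^{5} \mathbb{1}[s_i = g]\,\bigl(\alpha_g + \gamma_g t + \delta_g t^2\bigr)
       + u_i + \varepsilon_{it},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),\quad
\varepsilon_{it} \sim \mathcal{N}(0, \sigma^2)
\]
```

Under this form, a subgroup whose scores improve and then worsen again differs from the reference mainly in the sign and size of its quadratic coefficient.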
The highest effect sizes separating the subgroups were found for educational attainment polygenic score (mean [SD] partial η2 = 0.014 [0.003]; Figure 3). Significance was found across 8 of 10 thresholds (partial η2 > 0.015; uncorrected P < .05), and post hoc analyses indicated a reduction of the educational attainment polygenic score in the severe psychosis subgroup (eFigure 7 in the Supplement). For comparison purposes, PRS differences between diagnostic subgroups (DSM-IV) were analyzed (Figure 3; eFigure 6 in the Supplement). Results demonstrated expected large effects for the schizophrenia and bipolar polygenic scores but not for the major depression or educational attainment polygenic scores.
Subgroups from the discovery sample could be robustly separated using the features in Table 2 as expected (eTable 11 in the Supplement). Application of the models to the validation cohort replicated the detected phenotypes for the factor solutions (Figure 1; eResults 4 and eTable 13 in the Supplement), distinguishing baseline clinical features (eTable 12 in the Supplement), longitudinal courses (eFigure 4 in the Supplement), and educational attainment polygenic scores (eFigures 5, 6, and 7 in the Supplement). However, the suicidal psychosis subgroup was not well replicated in longitudinal analyses because of missing data (eFigure 4 and eTable 19 in the Supplement). Also, in contrast to the discovery sample, a comparative increase in the effect size of subgroup differences using the schizophrenia PRS was found (eFigure 5 in the Supplement).
Five psychosis subgroups were detected demonstrating distinctive clinical signatures, 18-month illness courses, and polygenic scores for educational achievement. The identified subgroup solution has not been reported before within a single study, to our knowledge. In partial agreement with current diagnostic systems, the results broadly supported the hypotheses of affective and nonaffective subgroup extrema. However, our results refine these groups and critically introduce intermediate subgroups with mixed diagnoses and divergent functional outcomes.
The identification of a severe psychosis subgroup with a poor educational history, high psychosis symptoms, and low functioning broadly agrees with longstanding hypotheses of a deficit form of schizophrenia12,21,22 with a developmental and/or genetic origin. However, the symptom profile of individuals in this group was not limited to negative symptoms, and their illness course was suggestive of remitting symptomatic (eg, PANSS) and remitting-relapsing functional (eg, GAF) patterns rather than stable impairment. These results imply that such individuals benefit from treatment and also that functional interventions should at least cover a period of up to 18 months to prevent functional relapse. Lifetime illness course in this subgroup was predominantly estimated to be chronic, which highlights the importance of studying such shorter illness dynamics that fluctuate and could be targeted with optimized treatment (eResults 7 in the Supplement).
The clear distinction between the severe psychosis subgroup and the high-functioning psychosis subgroup with mixed bipolar and schizophrenia diagnoses (ie, men with more education, less symptom severity, and stable course) potentially suggests a different illness phenotype.3,59,60 These phenotypes may be divided based on the relative contribution of developmental (severe psychosis) and environmental (high-functioning psychosis) risks. This hypothesis was supported by the finding of a reduced educational achievement polygenic score in the severe psychosis subgroup, potentially suggesting neurodevelopmental contributions.4
A second major finding was of a depression subgroup with mixed diagnoses, low functioning, moderate depression, and a remitting-relapsing functional course (Figure 2). These results extend the notion of schizodepression61 by suggesting that the subgroup is diagnostically mixed (eg, 53 of 131 [40.5%] were diagnosed with schizophrenia and 61 of 131 [46.6%] with bipolar I/II disorder) and experiences a functional course similar to that of the severe psychosis subgroup. In this context, it is important to note that while this course is most likely representative of a treatment response, neither subgroup fulfilled criteria for symptom remission (ie, 50%61,62) prior to their relapse between 12 and 18 months. Clinically, this further implies that addressing treatment resistance early and following up, especially between 12 and 18 months, is important for both subgroups.
A related finding was of a diagnostically mixed subgroup characterized by suicide history, which may either support research indicating that suicidality is a separable trait that is not connected to a specific diagnosis62-64 or that a personality disorder subgroup was detected (eg, borderline personality disorder65). A limitation of the study was that personality disorders were not assessed, but despite this, the results may suggest a need for further transdiagnostic research62-64 and a role for suicide-specific treatments.
Polygenic scores reflecting educational attainment exhibited relatively high effect sizes across discovery and validation analyses. Similar to genetic subtyping research,23 the finding of decreased educational attainment scores in the severe psychosis subgroup highlights the specificity of the association with subgroups and not diagnostic categories.66 The lack of a consistent association of diagnostic polygenic scores with schizophrenia, bipolar disorder, or depression indicated their lack of specificity for functional or symptom severity in this study.9,43,44 The results suggest a need for new risk scores reflecting other transdiagnostic factors, such as developmental risk (eg, birth history), functioning (eg, social impairment), and illness course (eg, remitting-relapsing). Such transdiagnostic scores may reveal previously unknown gene candidates and illness mechanisms.
The study highlights the power of using recently developed bioinformatics methods that can accommodate high-dimensional clinical data. The results partially agree with a separation of affective and nonaffective psychosis subgroups, but they also reinforce the need to look beyond conventional diagnostic categories, symptom domains, and polygenic scores to better understand psychosis heterogeneity. Doing so has the potential to facilitate a research transition from traditional case-control comparisons3,13 to modular taxonomies based on intersections of comorbid symptoms (eg, depression), premorbid and current functioning, and illness courses.11 Clinically, by separating individuals into subgroups linked to distinct baseline and longitudinal patterns, more effective resource allocation could be achieved because of better functional matching. Monitoring short-term to medium-term illness trajectories in chronic subgroups (eg, the severe and depressed subgroups) could also enhance treatment engagement and adherence. The precision of clinical and biological research could also be improved by using the subgroups. To determine subgroup labels using new data, a prototype web interface has been developed and can be accessed at http://www.proniapredictors.eu.
The study has a number of limitations. Our aim was to use an unbiased data-driven approach, but all studies create subgrouping biases because of their selection of measures when the study is designed.2 Data from participants who refused to take part were not recorded, and thus participation biases could not be assessed. We also attempted to study the illness severity spectrum but were restricted to naturalistic hospital settings where very mild cases would not be available and thus higher-functioning subgroups could be expected. Interventions were uncontrolled, and we found a treatment status bias (eg, more inpatients in the severe psychosis subgroup). Conclusions also cannot be drawn regarding schizophreniform disorder or brief psychotic disorder because of their low representation in this study. Attrition over the 18-month follow-up period emphasized a need for external replication, and future research could consider longer follow-up periods. It should be noted, however, that investigating this timeframe is critical for translational purposes because it defines an important operational window for clinical management. Additionally, the subgroups were validated in a separate group from the same study, and further validation is required.
The results of this study inform the intensifying efforts to redefine psychosis taxonomies but do so using criteria that include the assessment of clinical history, functioning, illness course, and genetic risk factors. Further research is needed to investigate commonalities and discrepancies between different subtyping results to develop a consensus among competing taxonomic solutions. Etiological and treatment implications of the present subgroups need to be studied to move toward targeted mechanistic research and clinical care.
Accepted for Publication: November 25, 2019.
Corresponding Author: Dominic B. Dwyer, PhD (firstname.lastname@example.org), and Nikolaos Koutsouleris, MD (email@example.com), Department of Psychiatry and Psychotherapy, University Hospital, Ludwig Maximilian University of Munich, Nussbaumstr 7, D-80336 Munich, Germany.
Published Online: February 12, 2020. doi:10.1001/jamapsychiatry.2019.4910
Author Contributions: Dr Dwyer had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. Drs Schulze and Koutsouleris contributed equally to this work.
Study concept and design: Dwyer, Kambeitz, Hasan, Anderson-Schmidt, Reich-Erkelenz, Adorjan, Gryaznova, Lang, Rietschel, Schulze, Koutsouleris.
Acquisition, analysis, or interpretation of data: Dwyer, Kalman, Budde, Kambeitz, Ruef, Antonucci, Kambeitz-Ilancovic, Hasan, Kondofersky, Anderson-Schmidt, Gade, Adorjan, Senner, Schaupp, Andlauer, Comes, Schulte, Klöhn-Saghatolislam, Hake, Bartholdi, Flatau-Nagel, Reitt, Quast, Stegmaier, Meyers, Emons, Haußleiter, Juckel, Nieratschker, Dannlowski, Yoshida, Schmauß, Zimmermann, Reimer, Wiltfang, Reininghaus, Anghelescu, Arolt, Baune, Konrad, Thiel, Falgatter, Figge, von Hagen, Koller, Wigand, Becker, Jäger, Dietrich, Scherk, Spitzer, Folkerts, Witt, Degenhardt, Forstner, Nöthen, Mueller, Papiol, Heilbronner, Falkai, Schulze.
Drafting of the manuscript: Dwyer, Kambeitz, Hasan, Gryaznova, Folkerts, Schulze.
Critical revision of the manuscript for important intellectual content: Dwyer, Kalman, Budde, Kambeitz, Ruef, Antonucci, Kambeitz-Ilancovic, Hasan, Kondofersky, Anderson-Schmidt, Gade, Reich-Erkelenz, Adorjan, Senner, Schaupp, Andlauer, Comes, Schulte, Klöhn-Saghatolislam, Hake, Bartholdi, Flatau-Nagel, Reitt, Quast, Stegmaier, Meyers, Emons, Haußleiter, Juckel, Nieratschker, Dannlowski, Yoshida, Schmauß, Zimmermann, Reimer, Wiltfang, Reininghaus, Anghelescu, Arolt, Baune, Konrad, Thiel, Falgatter, Figge, von Hagen, Koller, Lang, Wigand, Becker, Jäger, Dietrich, Scherk, Spitzer, Witt, Degenhardt, Forstner, Rietschel, Nöthen, Mueller, Papiol, Heilbronner, Falkai, Schulze, Koutsouleris.
Statistical analysis: Dwyer, Kambeitz, Ruef, Antonucci, Andlauer, Yoshida, Papiol, Koutsouleris.
Obtained funding: Dannlowski, Folkerts, Rietschel, Nöthen, Schulze, Koutsouleris.
Administrative, technical, or material support: Dwyer, Kalman, Budde, Ruef, Gade, Reich-Erkelenz, Adorjan, Senner, Comes, Schulte, Klöhn-Saghatolislam, Gryaznova, Hake, Bartholdi, Reitt, Quast, Stegmaier, Meyers, Haußleiter, Nieratschker, Dannlowski, Reimer, Wiltfang, Reininghaus, Anghelescu, Arolt, Thiel, Figge, von Hagen, Wigand, Scherk, Spitzer, Folkerts, Degenhardt, Forstner, Rietschel, Nöthen, Heilbronner, Schulze, Koutsouleris.
Study supervision: Hasan, Adorjan, Juckel, Reininghaus, Anghelescu, Baune, Konrad, Falgatter, Figge, von Hagen, Lang, Jäger, Nöthen, Mueller, Falkai, Schulze, Koutsouleris.
Conflict of Interest Disclosures: Dr Hasan has received personal fees for speaking from and serves on the advisory boards of Lundbeck, Janssen Pharmaceuticals, and Otsuka Pharmaceutical and is the editor of the World Federation of Societies of Biological Psychiatry guidelines on schizophrenia and coeditor of the Association of the Scientific Medical Societies in Germany (AWMF) S3 guideline on schizophrenia. Drs Gade, Klöhn-Saghatolislam, Dannlowski, Wigand, Witt, and Schulze and Ms Reich-Erkelenz have received grants from the German Research Foundation (DFG). Dr Schmauß has received grants from the Institute of Psychiatric Phenomics and Genomics and personal fees for being a member of the speaker’s bureau and advisory board from Aristo Pharmaceuticals, Janssen-Cilag, Lundbeck, Neuraxpharm, and Recordati. Dr Reimer has received personal fees from Lundbeck and Otsuka Pharmaceutical. Dr Wiltfang has received grants from the German Research Foundation and personal fees from Abbott Laboratories, Boehringer Ingelheim, Immungenetics, Eli Lilly and Company, MSD Sharp & Dohme, Roche, Actelion, Amgen, Janssen-Cilag, Pfizer, and Med Update and has patents PCT/EP 2011001724 and PCT/EP 2015052945 issued. Dr Arolt has received personal fees from Janssen Pharmaceuticals, AstraZeneca, Lundbeck, Sanofi, and Servier Laboratories. Dr Konrad has received personal fees from Aristo Pharmaceuticals, Janssen-Cilag, Eli Lilly and Company, MagVenture, Trommsdorff GmbH, Lundbeck, Neuraxpharm, and Servier Laboratories. Dr Nöthen has received grants from the German Federal Ministry of Education and Research (BMBF). Dr Falkai has received research support and honorarium for lectures or advisory activity from Abbott Laboratories, Boehringer Ingelheim, Janssen Pharmaceuticals, Essex Pharma, Lundbeck, Otsuka Pharmaceutical, Recordati, Richter Pharma, Servier Laboratories, and Takeda. 
Dr Koutsouleris has received personal fees from Lundbeck and Otsuka Pharmaceutical and has a patent to US20160192889A1 issued. No other disclosures were reported.
Funding/Support: This work was supported by the German Research Foundation (DFG) within the framework of the projects Genotype-Phenotype Relationships and Neurobiology of the Longitudinal Course of Psychosis and Pathomechanisms and Signatures in the Longitudinal Course of Psychosis (SCHU 1603/4-1, 5-1, 7-1; FA241/16-1). The genotyping was funded in part by the German Federal Ministry of Education and Research (BMBF) through the Integrated Understanding of Causes and Mechanisms in Mental Disorders, under the auspices of the e:Med Program with grants awarded to Drs Rietschel (grant 01ZX1614G), Nöthen (grant 01ZX1614A), and Schulze (grant 01ZX1614K). Dr Dannlowski was funded by the German Research Foundation (DFG) (grant FOR2107 DA1151/5-1, 5-2; SFB-TRR58, Project C09 and Z02) and the Interdisciplinary Center for Clinical Research (IZKF) of the medical faculty of Münster (grant Dan3/012/17). Dr Wiltfang is supported by an Ilídio Pinho professorship and the Institute of BioMedicine (UID/BIM/04501/2013) at the University of Aveiro. Dr Witt was supported by grants FOR2107 WI 3439/3-1 and WI 3439/3-2 from the German Research Foundation (DFG). Dr Degenhardt received support from the BONFOR Programme of the University of Bonn. Dr Rietschel was supported by grants RI 908/11-1 and RI 908/11-2 from the German Research Foundation (DFG) and received additional support from grant 01EW1810 from the German Federal Ministry of Education and Research within the framework of ERA-NET NEURON. Dr Schulze received additional support from the German Federal Ministry of Education and Research within the framework of the BipoLife network and the Dr Oehler Foundation. The work was further supported by funding awarded through the European Union 7th Framework Programme for the PRONIA project to Dr Koutsouleris.
Role of the Funder/Sponsor: The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Additional Contributions: We would like to express our profound gratitude to all study participants, interviewers, and laboratory and administrative personnel, without whom this work would not have been possible.