Original Investigation
December 7, 2022

Deep Learning for Cross-Diagnostic Prediction of Mental Disorder Diagnosis and Prognosis Using Danish Nationwide Register and Genetic Data

Author Affiliations
  • 1 Copenhagen Research Centre for Mental Health, Mental Health Centre Copenhagen, Copenhagen University Hospital, Copenhagen, Denmark
  • 2 Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
  • 3 Division of Biostatistics and Department of Radiology, Population Neuroscience and Genetics Lab, University of California, San Diego, La Jolla
  • 4 iPSYCH, The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Department of Immunology and Microbiology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
  • 5 Center for Neonatal Screening, Department for Congenital Disorders, Statens Serum Institut, Copenhagen, Denmark
  • 6 Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
  • 7 Institute of Biological Psychiatry, Mental Health Centre Sct Hans, Mental Health Services Copenhagen, Roskilde, Denmark
  • 8 Department of Immunology and Microbiology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
JAMA Psychiatry. 2023;80(2):146-155. doi:10.1001/jamapsychiatry.2022.4076
Key Points

Question  Can national Danish health registry and genetic data be used to predict, prior to diagnosis, whether an individual will be diagnosed with a severe mental disorder and what the subsequent severity trajectory will be?

Findings  In this diagnostic study including 63 535 individuals, the specific diagnostic category within the mental disorder group could be predicted in a multidiagnostic model including a randomly sampled population control group. The most predictable group was the most severe group.

Meaning  Results suggest that the multidiagnostic model, which resembles a clinical setting prior to examination, can predict the mental disorder diagnosis with high accuracy based only on registry data and genetic information; prediction of the subsequent severity trajectory, based only on information available up to the diagnosis, was less accurate.


Importance  Diagnoses and treatment of mental disorders are hampered by the current lack of objective markers needed to provide a more precise diagnosis and treatment strategy.

Objective  To develop deep learning models to predict mental disorder diagnosis and severity spanning multiple diagnoses using nationwide register data, family and patient-specific diagnostic history, birth-related measurements, and genetics.

Design, Setting, and Participants  This study was conducted from May 1, 1981, to December 31, 2016. For the analysis, which used a Danish population-based case-cohort sample of individuals born between 1981 and 2005, genotype data and matched longitudinal health register data were taken from the longitudinal Danish population-based Integrative Psychiatric Research Consortium 2012 case-cohort study. Included were individuals with mental disorders (attention-deficit/hyperactivity disorder [ADHD], autism spectrum disorder [ASD], major depressive disorder [MDD], bipolar disorder [BD], and schizophrenia spectrum disorders [SCZ]) and population controls. Data were analyzed from February 1, 2021, to January 24, 2022.

Exposure  At least 1 hospital contact with diagnosis of ADHD, ASD, MDD, BD, or SCZ.

Main Outcomes and Measures  The predictability of (1) mental disorder diagnosis and (2) severity trajectories (measured by future outpatient hospital contacts, admissions, and suicide attempts) was investigated using both a cross-diagnostic and a single-disorder setup. Predictive power was measured by AUC, accuracy, and Matthews correlation coefficient (MCC), including an estimate of feature importance.
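
For readers less familiar with the three performance measures named above, the following is a minimal sketch of how accuracy, MCC, and AUC can be computed for a single binary case-vs-control comparison. It is not the study's code: the function names, the toy labels and scores, and the 0.5 decision threshold are all illustrative assumptions, and the multidiagnostic setting would need a multiclass extension of these measures.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(y_true, scores):
    """Area under the ROC curve as the probability that a random case
    scores higher than a random control (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one control")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the study): 1 = case, 0 = control.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed threshold

print(f"accuracy={accuracy(toy_labels, toy_preds):.2f}")
print(f"MCC={mcc(toy_labels, toy_preds):.2f}")
print(f"AUC={auc(toy_labels, toy_scores):.2f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is one reason it is often reported alongside AUC when class sizes are unbalanced.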

Results  A total of 63 535 individuals (mean [SD] age, 23 [7] years; 34 944 male [55%]; 28 591 female [45%]) were included in the model. Based on data prior to diagnosis, the specific diagnosis was predicted in a multidiagnostic prediction model including the background population with an overall area under the curve (AUC) of 0.81 and MCC of 0.28, whereas the single-disorder models gave AUCs/MCCs of 0.84/0.54 for SCZ, 0.79/0.41 for BD, 0.77/0.39 for ASD, 0.74/0.38 for ADHD, and 0.74/0.38 for MDD. The most important data sets for multidiagnostic prediction were previous mental disorders and age (11%-23% reduction in prediction accuracy when removed) followed by family diagnoses, birth-related measurements, and genetic data (3%-5% reduction in prediction accuracy when removed). Furthermore, when predicting subsequent disease trajectories of the disorder, the most severe cases were the most easily predictable, with an AUC of 0.72.
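
The data set importance figures above are stated as reductions in prediction accuracy when a data set is removed. A minimal sketch of that kind of removal-based ranking is shown below. It assumes the reduction is expressed relative to the full model's accuracy, which the abstract does not specify; the function names and all accuracy values are hypothetical placeholders, not results from the study.

```python
def accuracy_reduction_pct(full_accuracy, ablated_accuracy):
    """Percentage drop in accuracy relative to the full model
    (assumed definition; the abstract does not state it)."""
    if full_accuracy <= 0:
        raise ValueError("full_accuracy must be positive")
    return 100.0 * (full_accuracy - ablated_accuracy) / full_accuracy


def rank_data_sets(full_accuracy, ablated_accuracies):
    """Sort data sets by how much accuracy falls when each is removed."""
    reductions = {
        name: accuracy_reduction_pct(full_accuracy, acc)
        for name, acc in ablated_accuracies.items()
    }
    return sorted(reductions.items(), key=lambda item: item[1], reverse=True)


# Hypothetical accuracies for illustration only (not from the study).
hypothetical_full = 0.50
hypothetical_ablated = {
    "previous_diagnoses": 0.40,  # model retrained without this data set
    "family_diagnoses": 0.48,
    "genetics": 0.485,
}

ranking = rank_data_sets(hypothetical_full, hypothetical_ablated)
for name, pct in ranking:
    print(f"{name}: {pct:.1f}% reduction")
```

In practice each ablated accuracy would come from refitting or re-evaluating the model without the data set in question; the sketch only covers the final comparison step.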

Conclusions and Relevance  Results of this diagnostic study suggest the possibility of combining genetics and registry data to predict both mental disorder diagnosis and disorder progression in a clinically relevant, cross-diagnostic setting prior to clinical assessment.
