Validation of an Electronic Health Record–Based Suicide Risk Prediction Modeling Approach Across Multiple Health Care Systems | Electronic Health Records | JAMA Network Open | JAMA Network
    Original Investigation
    Health Informatics
    March 25, 2020

    Validation of an Electronic Health Record–Based Suicide Risk Prediction Modeling Approach Across Multiple Health Care Systems

    Author Affiliations
    • 1Computational Health Informatics Program, Boston Children’s Hospital, Boston, Massachusetts
    • 2Partners Research Information Science and Computing, Boston, Massachusetts
    • 3Department of Psychology, Harvard University, Cambridge, Massachusetts
    • 4Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts
    • 5Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, Massachusetts
    • 6Department of Pediatrics, Boston Medical Center, Boston University School of Medicine, Boston, Massachusetts
    • 7School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston
    • 8McGovern Medical School, Division of General Internal Medicine, The University of Texas Health Science Center at Houston, Houston
    • 9Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts
    • 10Clinical and Translational Science Institute, Wake Forest School of Medicine, Winston-Salem, North Carolina
    • 11Department of Internal Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina
    JAMA Netw Open. 2020;3(3):e201262. doi:10.1001/jamanetworkopen.2020.1262
    Key Points

    Question  Can a process for training machine-learning algorithms based on electronic health records identify individuals at increased risk of suicide attempts across independent health care systems?

    Findings  In this prognostic study, using a supervised learning approach applied to structured electronic health record data from more than 3.7 million patients across 5 diverse US health care systems, models detected a mean of 38% of cases of suicide attempt with 90% specificity a mean of 2.1 years in advance.

    Meaning  These findings suggest that a computationally efficient machine-learning approach leveraging the full spectrum of structured electronic health record data may be able to detect the risk of suicidal behavior in unselected patients and may facilitate the development of clinical decision support tools that inform risk reduction interventions.


    Importance  Suicide is a leading cause of mortality, with suicide-related deaths increasing in recent years. Automated methods for individualized risk prediction have great potential to address this growing public health threat. To facilitate their adoption, they must first be validated across diverse health care settings.

    Objective  To evaluate the generalizability and cross-site performance of a risk prediction method using readily available structured data from electronic health records in predicting incident suicide attempts across multiple, independent, US health care systems.

    Design, Setting, and Participants  For this prognostic study, longitudinal electronic health record data comprising International Classification of Diseases, Ninth Revision diagnoses, laboratory test results, procedure codes, and medications were extracted for more than 3.7 million patients from 5 independent health care systems participating in the Accessible Research Commons for Health network. Across sites, 6 to 17 years’ worth of data were available, up to 2018. Outcomes were defined by International Classification of Diseases, Ninth Revision codes reflecting incident suicide attempts (with positive predictive value >0.70 according to expert clinician medical record review). Models were trained using naive Bayes classifiers in each of the 5 systems. Models were cross-validated in independent data sets at each site, and performance metrics were calculated. Data analysis was performed from November 2017 to August 2019.
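    As a rough illustration of the kind of model described above, the following is a minimal sketch of a Bernoulli naive Bayes classifier over binary "code present or absent" features. It is a generic textbook formulation, not the authors' implementation; the function names (`train_nbc`, `risk_score`), the Laplace smoothing constant `alpha`, and the toy feature labels such as `dx_A` are all hypothetical and chosen only for the example.

```python
import math
from collections import defaultdict

def train_nbc(records, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model.

    records: list of sets of feature codes seen in each patient's history
    labels:  list of 0/1 outcomes (1 = case), both classes must be present
    alpha:   Laplace smoothing constant (an assumption of this sketch)
    """
    n = {0: 0, 1: 0}
    counts = {0: defaultdict(int), 1: defaultdict(int)}
    vocab = set()
    for codes, y in zip(records, labels):
        n[y] += 1
        for code in codes:
            counts[y][code] += 1
            vocab.add(code)
    # Smoothed estimate of P(code present | class)
    prob = {
        y: {c: (counts[y][c] + alpha) / (n[y] + 2 * alpha) for c in vocab}
        for y in (0, 1)
    }
    prior = {y: n[y] / len(labels) for y in (0, 1)}
    return prior, prob, vocab

def risk_score(model, codes):
    """Return the log posterior odds of being a case given a set of codes."""
    prior, prob, vocab = model
    score = math.log(prior[1] / prior[0])
    for c in vocab:
        p1, p0 = prob[1][c], prob[0][c]
        if c in codes:
            score += math.log(p1 / p0)              # evidence from presence
        else:
            score += math.log((1 - p1) / (1 - p0))  # evidence from absence
    return score

# Toy data with hypothetical feature labels (not real diagnosis codes)
records = [{"dx_A", "rx_B"}, {"dx_A"}, {"lab_C"},
           {"lab_C", "rx_D"}, {"rx_D"}, {"lab_C"}]
labels = [1, 1, 0, 0, 0, 0]
model = train_nbc(records, labels)
print(risk_score(model, {"dx_A"}), risk_score(model, {"lab_C"}))
```

    Because each feature contributes an independent additive term to the log odds, training reduces to counting, which is one reason this family of models is inexpensive to fit on millions of records.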

    Main Outcomes and Measures  The primary outcome was suicide attempt as defined by a previously validated case definition using International Classification of Diseases, Ninth Revision codes. The accuracy and timeliness of the prediction were measured at each site.

    Results  Across the 5 health care systems, of the 3 714 105 patients (2 130 454 female [57.2%]) included in the analysis, 39 162 cases (1.1%) were identified. Predictive features varied by site but, as expected, the most common predictors reflected mental health conditions (eg, borderline personality disorder, with odds ratios of 8.1-12.9, and bipolar disorder, with odds ratios of 0.9-9.1) and substance use disorders (eg, drug withdrawal syndrome, with odds ratios of 7.0-12.9). Despite variation in geographical location, demographic characteristics, and population health characteristics, model performance was similar across sites, with areas under the curve ranging from 0.71 (95% CI, 0.70-0.72) to 0.76 (95% CI, 0.75-0.77). Across sites, at a specificity of 90%, the models detected a mean of 38% of cases a mean of 2.1 years in advance.
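    The two headline metrics in the Results, sensitivity at a fixed specificity and area under the curve, can be computed from risk scores as in the sketch below. This is a generic calculation under stated assumptions, not the study's code: the function names and toy scores are hypothetical, a patient is flagged when the score is strictly above the threshold, and the AUC uses the pairwise (Mann-Whitney) definition with ties counted as one half.

```python
import math

def sensitivity_at_specificity(scores, labels, target_spec=0.90):
    """Pick the lowest threshold that keeps specificity >= target_spec,
    then report the sensitivity achieved at that threshold."""
    neg = sorted(s for s, y in zip(scores, labels) if y == 0)
    pos = [s for s, y in zip(scores, labels) if y == 1]
    # Number of non-cases that must fall at or below the threshold
    k = math.ceil(target_spec * len(neg))
    threshold = neg[k - 1] if k > 0 else float("-inf")
    sens = sum(s > threshold for s in pos) / len(pos)
    spec = sum(s <= threshold for s in neg) / len(neg)
    return threshold, sens, spec

def auc(scores, labels):
    """Probability that a random case outscores a random non-case."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy scores: 10 non-cases and 4 cases (illustrative values only)
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 3, 10, 12, 15]
labels = [0] * 10 + [1] * 4
threshold, sens, spec = sensitivity_at_specificity(scores, labels)
print(threshold, sens, spec, auc(scores, labels))
```

    Fixing specificity first and then reading off sensitivity mirrors how the abstract reports performance: the false-positive rate is capped at 10%, and the question becomes what share of eventual cases is still detected.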

    Conclusions and Relevance  Across 5 diverse health care systems, a computationally efficient approach leveraging the full spectrum of structured electronic health record data was able to detect the risk of suicidal behavior in unselected patients. This approach could facilitate the development of clinical decision support tools that inform risk reduction interventions.