Original Investigation
January 6, 2021

Identification of Suicide Attempt Risk Factors in a National US Survey Using Machine Learning

Author Affiliations
  • 1Department of Biostatistics, Columbia University, New York, New York
  • 2Division of Epidemiology, Services and Prevention Research, National Institute on Drug Abuse, Bethesda, Maryland
  • 3Department of Psychiatry, New York State Psychiatric Institute, Columbia University Medical Center, New York
JAMA Psychiatry. 2021;78(4):398-406. doi:10.1001/jamapsychiatry.2020.4165
Key Points

Question  Can survey data identify risk factors of nonfatal suicide attempt in the general population?

Findings  This study used a large, nationally representative longitudinal survey of US adults to create a suicide attempt model addressing risk factors of suicide. The most important factors included previous suicidal ideation or behavior, feeling downhearted, doing activities less carefully or accomplishing less because of emotional problems, younger age, lower educational achievement, and recent financial crisis.

Meaning  By using an algorithmic approach to analyze survey data and identify new risk factors, this study offers new avenues to guide future clinical assessment and development of suicide risk scales in the general population.


Importance  Because more than one-third of people making nonfatal suicide attempts do not receive mental health treatment, it is essential to extend the identification of suicide attempt risk factors beyond high-risk clinical populations to the general adult population.

Objective  To identify future suicide attempt risk factors in the general population using a data-driven machine learning approach including more than 2500 questions from a large, nationally representative survey of US adults.

Design, Setting, and Participants  Data came from wave 1 (2001 to 2002) and wave 2 (2004 to 2005) of the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC). NESARC is a face-to-face longitudinal survey conducted with a nationally representative sample of the noninstitutionalized civilian population 18 years and older in the US. The cumulative response rate across both waves was 70.2%, resulting in 34 653 wave 2 interviews. A balanced random forest was trained using cross-validation to develop a suicide attempt risk model. Out-of-fold model predictions were used to assess model performance, including the area under the receiver operating characteristic curve, sensitivity, and specificity. Survey design and nonresponse weights allowed estimates to be representative of the US civilian population based on the 2000 census. Analyses were performed between May 15, 2019, and June 10, 2020.
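The resampling idea behind a balanced random forest with cross-validation can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes the common formulation in which each tree is grown on a bootstrap sample that draws equally from the minority and majority classes, and it stops before any tree fitting. The function names, the toy 1% event rate, and the choice of 5 folds are hypothetical, and survey weights are ignored.

```python
import random

def stratified_folds(labels, k, seed=0):
    """Assign each index to one of k folds, spreading each class evenly."""
    rng = random.Random(seed)
    folds = [0] * len(labels)
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[i] = pos % k
    return folds

def balanced_bootstrap(labels, rng):
    """Draw one balanced bootstrap sample (with replacement):
    as many majority-class cases as there are minority-class cases."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(pos), len(neg))
    return rng.choices(pos, k=n) + rng.choices(neg, k=n)

# Toy data (hypothetical): 1000 respondents, 10 with the outcome.
labels = [1] * 10 + [0] * 990
folds = stratified_folds(labels, k=5)

# Hold out fold 0; each tree would be fit on a balanced sample of the rest,
# and the held-out fold would receive the out-of-fold predictions.
train_idx = [i for i, f in enumerate(folds) if f != 0]
train_labels = [labels[i] for i in train_idx]
sample = balanced_bootstrap(train_labels, random.Random(1))
```

Because every held-out respondent is scored only by trees that never saw that respondent, out-of-fold predictions give a less optimistic performance estimate than scoring the training data directly.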

Main Outcomes and Measures  Attempted suicide in the 3 years between wave 1 and wave 2 interviews.

Results  Of 34 653 participants, 20 089 were female (weighted proportion, 52.1%). The weighted mean (SD) age was 45.1 (17.3) years at wave 1 and 48.2 (17.3) years at wave 2. Attempted suicide during the 3 years between wave 1 and wave 2 interviews was self-reported by 222 of 34 653 participants (0.6%). Using survey questions measured at wave 1, the suicide attempt risk model yielded a cross-validated area under the receiver operating characteristic curve of 0.857 with a sensitivity of 85.3% (95% CI, 79.8-89.7) and a specificity of 73.3% (95% CI, 72.8-73.8) at an optimized threshold. The model identified 1.8% of the US population to be at a 10% or greater risk of suicide attempt. The most important risk factors were 3 questions about previous suicidal ideation or behavior; 3 items from the 12-Item Short Form Health Survey, namely feeling downhearted, doing activities less carefully, and accomplishing less because of emotional problems; younger age; lower educational achievement; and recent financial crisis.
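For readers unfamiliar with the reported metrics, the following Python sketch shows how an area under the receiver operating characteristic curve, a sensitivity, and a specificity are computed from predicted risk scores. It is an unweighted illustration on made-up scores, not a reproduction of the study's survey-weighted estimates. The function names and the 0.5 threshold are hypothetical, and the abstract does not state how the study's threshold was optimized.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when scores >= threshold are flagged."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Made-up out-of-fold risk scores for 3 cases and 5 non-cases.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 1, 0, 0, 0, 0, 0]

auc = roc_auc(scores, labels)                    # 14 of 15 pairs ranked correctly
sensitivity, specificity = sens_spec(scores, labels, threshold=0.5)
```

Lowering the threshold flags more respondents, which raises sensitivity at the cost of specificity. The area under the curve summarizes this trade-off across all thresholds.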

Conclusions and Relevance  In this study, after searching through more than 2500 survey questions, several well-known risk factors of suicide attempt were confirmed, such as previous suicidal behaviors and ideation, and new risks were identified, including functional impairment resulting from mental disorders and socioeconomic disadvantage. These results may help guide future clinical assessment and the development of new suicide risk scales.
