Figure 1. Overall Architecture of Machine Learning Pipeline Framework to Estimate Weekly Suicide Fatalities

In the first (intermediate) phase, we identified the best model corresponding to each data source by training and validating a number of state-of-the-art machine learning models for each source. The predicted estimates of weekly suicide fatalities given by each single data source were then combined, in an automated, harmonized manner, via a neural network model in the ensemble (second) phase. Health services include the National Suicide Prevention Lifeline (Lifeline), US poison control center calls (Poison), and emergency department (ED) visits.

Figure 2. Estimation Results

Weekly number of suicide fatalities estimated by the best performing historical suicide fatalities baseline model (A), which consistently underestimates weekly suicides, and the final ensemble model that combines all data sources (B). The gray areas represent 95% CIs.

Table 1. Performance of Individual Models Built Using Each Data Source at the Intermediate Stage (First Phase)
Table 2. Performance of the 6 Ensemble Models Built Using Different Combinations of the Data Sources
Original Investigation
Health Informatics
December 23, 2020

Development of a Machine Learning Model Using Multiple, Heterogeneous Data Sources to Estimate Weekly US Suicide Fatalities

Author Affiliations
  • 1Department of Computer Science and Engineering, Incheon National University, Incheon, South Korea
  • 2Office of Strategy and Innovation, National Center for Injury Prevention and Control, Centers for Disease Control and Prevention, Atlanta, Georgia
  • 3Division of Violence Prevention, National Center for Injury Prevention and Control, Centers for Disease Control and Prevention, Atlanta, Georgia
  • 4National Suicide Prevention Lifeline, New York, New York
  • 5National Center for Environmental Health, Centers for Disease Control and Prevention, Atlanta, Georgia
  • 6School of Interactive Computing, Georgia Institute of Technology, Atlanta
JAMA Netw Open. 2020;3(12):e2030932. doi:10.1001/jamanetworkopen.2020.30932
Key Points

Question  Can real-time streams of secondary information related to suicide be used to accurately estimate suicide fatalities in the US in real time?

Findings  In this national cross-sectional study, combining information from 8 data streams encompassing various health services and online data sources enabled accurate, real-time estimation of US suicide fatalities, with meaningful correlation to week-to-week epidemiological trends and an error of less than 1% compared with actual counts.

Meaning  These findings advance the first efforts to create a population-level system for enabling real-time epidemiological trend monitoring of suicide fatalities.

Abstract

Importance  Suicide is a leading cause of death in the US. However, official national statistics on suicide rates are delayed by 1 to 2 years, hampering evidence-based public health planning and decision-making.

Objective  To estimate weekly suicide fatalities in the US in near real time.

Design, Setting, and Participants  This cross-sectional national study used a machine learning pipeline to combine signals from several streams of real-time information to estimate weekly suicide fatalities in the US in near real time. This 2-phase approach first fits optimal machine learning models to each individual data stream and subsequently combines predictions made from each data stream via an artificial neural network. National-level US administrative data on suicide deaths, health services, and economic, meteorological, and online data were variously obtained from 2014 to 2017. Data were analyzed from January 1, 2014, to December 31, 2017.

Exposures  Longitudinal data on suicide-related exposures were obtained from multiple, heterogeneous streams: emergency department visits for suicide ideation and attempts collected via the National Syndromic Surveillance Program (2015-2017); calls to the National Suicide Prevention Lifeline (2014-2017); calls to US poison control centers for intentional self-harm (2014-2017); consumer price index and seasonality-adjusted unemployment rate, hourly earnings, home price index, and 3-month and 10-year yield curves from the Federal Reserve Economic Data (2014-2017); weekly daylight hours (2014-2017); Google and YouTube search trends related to suicide (2014-2017); and public posts on suicide on Reddit (2 314 533 posts; 2014-2017), Twitter (9 327 472 tweets; 2015-2017), and Tumblr (1 670 378 posts; 2014-2017).

Main Outcomes and Measures  Weekly estimates of suicide fatalities in the US were obtained through a machine learning pipeline that integrated the above data sources. Estimates were compared statistically with actual fatalities recorded by the National Vital Statistics System.

Results  Combining information from multiple data streams, the machine learning method yielded estimates of weekly suicide deaths with high correlation to actual counts and trends (Pearson correlation, 0.811; P < .001), while estimating annual suicide rates with low error (0.55%).

Conclusions and Relevance  The proposed ensemble machine learning framework reduces the error for annual suicide rate estimation to less than one-tenth of that of current forecasting approaches that use only historical information on suicide deaths. These findings establish a novel approach for tracking suicide fatalities in near real time and provide the potential for an effective public health response such as supporting budgetary decisions or deploying interventions.

Introduction

Suicide is a significant contributor to global mortality, resulting in nearly 800 000 deaths worldwide each year.1 Suicide rates in the US are currently at their highest levels in more than 50 years, having increased by 35% from 1999 to 2018.2,3 Despite the urgency of this public health problem, real-time information on suicide fatality trends to guide prevention efforts is lacking, because national statistics on suicide rates are delayed by 1 to 2 years.4,5 Official statistics on suicide fatalities are produced by the Centers for Disease Control and Prevention using data from the National Vital Statistics System, which collects information from death certificates submitted by states to the Centers for Disease Control and Prevention from more than 2000 medical examiner and coroner offices in the US.6 Reasons for the delay in suicide fatality data include the decentralized nature of mortality statistics, critical shortages of workers such as forensic pathologists, variation in local information technology infrastructure, and the time needed to investigate deaths due to suicide.4,7,8

Lagged mortality data present several challenges to public health. First, they prevent federal agencies from making budgetary decisions that are accurately matched to the current magnitude of the problem. In addition, without the ability to detect rapid changes in mortality trends, a timely and coordinated national-level response to support and implement prevention programs and policies, such as those that strengthen economic supports, increase access to care and interventions, enhance social connectedness, and improve identification of those at risk, among other strategies described in the Centers for Disease Control and Prevention's technical package for suicide prevention,9 is not possible. Thus, there is a significant need to develop and test the feasibility, accuracy, and applicability of novel methods that can provide a more real-time understanding of suicide trends to inform public health activities and budgetary decision-making.

To date, limited work has evaluated and validated approaches to produce more real-time estimates of mental illness or suicide. Owing to the social and environmental underpinnings of suicide,10 some of the earliest efforts to examine real-time trends related to suicide focused on using signals from large-scale social media data, such as Twitter.11,12 In recent years, social media data have been backed by a rich literature demonstrating their potential in assessing and inferring a wide variety of mental health attributes and outcomes.13-24 Notwithstanding this evidence, the potential of social media data in supporting public health efforts for population-level suicide assessment has yet to be fully harnessed. Moreover, methods that use social media data alone can be limited by issues of proxy and sampling bias,25 a lack of demographic and geographic representativeness with respect to the general population,26 structural idiosyncrasies owing to different platforms' distinct affordances,27 and epistemological issues around what social media–derived signals really mean when taken out of context.28

Other online data sources, such as Google search trends, have also been evaluated for use in tracking suicide trends, with mixed results depending on the method used; studies have generally used only cross-sectional approaches and have not tested such data for prospective prediction tasks.29-31 Furthermore, public health use of Google trends as a stand-alone data source has come under criticism in recent years, following unanticipated challenges in estimating the prevalence of influenzalike illness.32

Consequently, researchers have suggested that greater value can be obtained by combining online data with other near–real-time health data to facilitate better public health monitoring.32,33 To this end, researchers have attempted to strengthen and supplement the signal produced from online search or social media data by examining variables that provide additional population-level environmental context for suicide risk, such as macroeconomic indicators and meteorological patterns.11,34 To our knowledge, no study has attempted to combine information from disparate real-time data sources to evaluate the ability of multiple streams to tackle the pressing public health need of enabling estimation of suicide fatality rates for the US in near real time.

In this study, we aim to evaluate the individual and combined ability of several disparate categories of real-time health services and economic, meteorological, and online data sources to produce weekly estimates of the number of suicide fatalities in the US. Such an ensemble approach will likely allow countering the underlying biases (eg, sampling bias or demographic/geographic misrepresentation idiosyncrasies) of each of the individual data sources, because suicidal outcomes are multifaceted and subpopulations are likely to be represented and stratified differentially in each data source.

Methods
Data Sources

For this cross-sectional study, we started with a comprehensive set of data sources spanning the categories above, which were selected in a theoretical and domain-inspired way and from prior literature.35-38 Because these were public and/or deidentified data, the study did not constitute human subjects research, and therefore the ethical review board of Georgia Institute of Technology considered the research exempt. This cross-sectional study followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guideline.

First, our work harnessed an underexplored class of real-time data that can provide insight into suicide epidemiology: administrative data generated from the provision of clinical or behavioral support services.39 The 3 health services data sources we used were (1) weekly proportion of emergency department visits for suicide ideation or attempt, as documented in electronic health record data from the National Syndromic Surveillance Program (2015-2017)40; (2) weekly volume of calls made to the National Suicide Prevention Lifeline, a national telephone hotline for mental health crises (2014-2017)41; and (3) weekly proportion of all poison control center exposure calls attributed to intentional self-harm, as captured in the National Poison Data System (2014-2017).42 Although these data sources have been used to support traditional epidemiological studies, such data have thus far not been evaluated for suicide prediction tasks.

Second, for economic data, we considered several indicators available at monthly frequency from 2014 to 2017 through Federal Reserve Economic Data, including consumer price index and seasonality-adjusted unemployment rate, overall hourly earnings, hourly earnings in manufacturing, home price index, and 3-month and 10-year yield curves.43 Third, we assessed a key meteorological indicator, the duration of daylight hours (mean number for each week; 2014-2017), per prior work.11,44 These data sources are not discussed in detail herein because they either lack the temporal immediacy to be actionable in weekly estimation of fatalities or do not capture the variability in the magnitude of fatalities well; accordingly, they were found not to contribute meaningfully to improved model performance. Details are given in eMethods 1 and 3 in the Supplement.

Fourth, we used online data, which included Google trends data (weekly normalized popularity scores of searches spanning 42 suicide-related terms on the Google search engine; 2014-2017), YouTube trends data (weekly normalized popularity scores of searches spanning 42 suicide-related terms on YouTube; 2014-2017), public Twitter data (weekly number of Twitter posts containing any of 38 suicide-related keywords, phrases, and hashtags [9 327 472 tweets]; 2015-2017), public Reddit data (weekly normalized fraction of posts shared on 55 suicide- and mental health–related Reddit subcommunities, or subreddits [2 314 533 posts]; 2014-2017), and public Tumblr data (weekly normalized fraction of posts related to 42 suicide-related keywords, phrases, and hashtags [1 670 378 posts]; 2014-2017). To restrict the suicide-related data to US activity, the Google and YouTube trends data were limited to queries made in the US; however, for Reddit, Twitter, and Tumblr, we did not impose this constraint, because country-specific use statistics for these platforms are not available. We do not consider this to be a significant issue, because most users of these platforms are from the US. Detailed information on the acquisition and processing of each data source, including descriptive statistics, can be found in eMethods 1 and eTables 1-3 in the Supplement. The time-series values of the individual data sets included in and excluded from the proposed model are plotted in eFigures 3 and 4 in the Supplement, respectively.
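To make the weekly normalization concrete, the following is a minimal sketch that computes a weekly normalized fraction of keyword-matching posts from a corpus of timestamped public posts. The column names, placeholder keywords, and input format are our own illustrative assumptions, not the study's actual implementation.

```python
import pandas as pd

# Placeholder keywords; the study used platform-specific lists of
# suicide-related terms (eg, 42 terms for Tumblr).
KEYWORDS = ["suicide", "self harm"]

def weekly_keyword_fraction(posts: pd.DataFrame) -> pd.Series:
    """Weekly fraction of posts matching any keyword.

    `posts` is assumed to have 'timestamp' and 'text' columns. Dividing by
    the weekly post total normalizes away overall platform growth.
    """
    df = posts.copy()
    df["timestamp"] = pd.to_datetime(df["timestamp"])
    pattern = "|".join(KEYWORDS)
    df["match"] = df["text"].str.contains(pattern, case=False, na=False)
    # Mean of a boolean column per week = fraction of matching posts.
    return df.set_index("timestamp")["match"].resample("W").mean()
```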

Our primary outcome of interest was weekly counts of suicide fatalities in the US. These data were available from 2010 through 2017. Suicide deaths were identified from death certificate data from the National Vital Statistics System using the International Statistical Classification of Diseases and Related Health Problems, Tenth Revision, underlying cause of death codes U03, X60-X84, and Y87.0.
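As a sketch of how this outcome series could be constructed from line-level death records, the snippet below counts weekly deaths whose underlying cause of death falls in the ICD-10 code set named above; the column names are assumed for illustration, and actual National Vital Statistics System files have their own layout.

```python
import pandas as pd

# ICD-10 underlying cause of death codes for suicide: U03, X60-X84, Y87.0
# (dots stripped, so Y87.0 becomes Y870).
SUICIDE_UCOD = r"^(U03|X6\d|X7\d|X8[0-4]|Y870)"

def weekly_suicide_counts(deaths: pd.DataFrame) -> pd.Series:
    """deaths: one row per death record, with assumed 'death_date' and
    'ucod' (underlying cause of death) columns."""
    df = deaths.copy()
    df["death_date"] = pd.to_datetime(df["death_date"])
    codes = df["ucod"].str.replace(".", "", regex=False).str.upper()
    is_suicide = codes.str.match(SUICIDE_UCOD, na=False)
    # Count qualifying deaths per calendar week.
    return df.loc[is_suicide].set_index("death_date").resample("W").size()
```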

Machine Learning Approach

We developed a novel 2-phase machine learning pipeline to leverage these data sets for near–real-time estimation of weekly suicide fatalities (Figure 1). In the first phase, we developed the best model corresponding to each data source. We set aside weekly suicide fatalities data from 2017 as our held-out test data for evaluating final performance, 2016 fatalities data for validation, and data from 2015 and before (if available) for training the prediction models. Predictive features for training, validation, and testing for each of the clinical and online data sources were constructed using a lagged (value at a previous time step or previous week) sliding-window approach,45 wherein we used weekly data from the various data sources as features for estimating suicide fatalities in the same week, because these data sets can be gathered in near real time. In addition, we leveraged historical data on weekly suicide fatalities as an additional data source for our estimation task, both to augment the estimates given by the real-time signals and to develop a baseline model for comparison. We trained and validated a number of leading machine learning models: linear regression, LASSO (least absolute shrinkage and selection operator), ridge, elastic net,46 random forest regression, and support vector regression.47 In the second phase, using an ensemble machine learning modeling approach, we combined the model outputs from the first phase (the best predicted estimates of weekly suicide fatalities given by each single data source) in an automated, harmonized manner via a neural network model to obtain the final estimates. The rationale of this approach is in line with the super learner algorithm,48 which finds the best combination among different models. eMethods 2, eFigure 1, and eTable 4 in the Supplement provide detailed information on this model development approach.
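For concreteness, the sketch below illustrates the first phase under assumed settings: lagged sliding-window features built from one data source, the candidate regressors named above fit on training years, and the model with the highest Pearson correlation on the 2016 validation year retained. The window length, hyperparameters, and split sizes are illustrative assumptions, not the study's exact configuration (which is detailed in its supplement).

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge
from sklearn.svm import SVR

def sliding_window(series: np.ndarray, window: int = 4) -> np.ndarray:
    """Each row holds the `window` most recent weekly values of one source."""
    return np.array([series[t - window:t] for t in range(window, len(series))])

def best_model_for_source(source, fatalities, n_val: int = 52, window: int = 4):
    """Fit all candidates on the training years; keep the model with the
    highest Pearson correlation on the validation year (2016 in the study)."""
    X = sliding_window(np.asarray(source, dtype=float), window)
    y = np.asarray(fatalities, dtype=float)[window:]
    X_tr, y_tr = X[:-n_val], y[:-n_val]   # training years
    X_va, y_va = X[-n_val:], y[-n_val:]   # validation year
    candidates = [
        LinearRegression(), Lasso(alpha=0.1), Ridge(alpha=1.0),
        ElasticNet(alpha=0.1),
        RandomForestRegressor(n_estimators=200, random_state=0),
        SVR(kernel="rbf", C=10.0),
    ]
    best, best_r = None, -np.inf
    for model in candidates:
        model.fit(X_tr, y_tr)
        r, _ = pearsonr(y_va, model.predict(X_va))
        if r > best_r:
            best, best_r = model, r
    return best, best_r
```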

Results

Table 1 shows the performance of the individual models on the held-out 2017 test data at the conclusion of the first stage (see eMethods 3 and eTable 6 in the Supplement for extended results). We do not include the performance of the models built using the economic and meteorological data sets herein because these data sources degraded performance in the ensemble step; we therefore excluded them from the subsequent discussion (see eMethods 3 and eTable 5 in the Supplement).

In addition, as a baseline, we also report the results from the best model trained using only historical suicide fatality data (as baseline models, we considered linear regression, support vector regression, and Holt-Winters models, the last chosen to factor in seasonality). Although the baseline model has the highest Pearson correlation coefficient (0.761; P < .001) of all individual models owing to the strong seasonality of suicide (the maximum difference in correlation coefficients across individual models is 0.372), such a model built using only lagged historical data cannot detect real-time changes in suicide rates and significantly underestimated the growth in suicide rates in 2017.
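As an illustration of the seasonality-aware baseline, a Holt-Winters (triple exponential smoothing) model fit to weekly historical counts might look like the following sketch; the additive trend and seasonality settings and the 52-week season are assumptions on our part.

```python
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

def holt_winters_baseline(weekly_fatalities: pd.Series, horizon: int = 52):
    """weekly_fatalities: historical training counts indexed by week;
    at least 2 full years are needed to initialize the seasonal component."""
    model = ExponentialSmoothing(
        weekly_fatalities,
        trend="add",           # additive long-term trend
        seasonal="add",        # additive yearly seasonality
        seasonal_periods=52,   # 52 weeks per seasonal cycle
    ).fit()
    return model.forecast(horizon)  # eg, estimates for the 2017 test year
```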

Examining the other data sources, we found each individual data set to have distinct strengths and weaknesses (eTable 7 in the Supplement) in (1) tracking seasonal (week-to-week) trends, as measured by the Pearson correlation coefficient; (2) minimizing the error of weekly estimates, as measured by root mean squared error [RMSE], mean absolute percentage error [MAPE], and symmetric MAPE [SMAPE]; and (3) estimating the total number of suicide fatalities during an entire year, as measured by the percentage error in the annual estimated rate of suicide per 100 000 people. For example, the Pearson correlation coefficients of the predictions made from both the poison and Google data with actual weekly suicide fatality counts were high (0.702 and 0.721, respectively; P < .001 for both); however, the RMSE and MAPE for these data sources are higher than for some other sources (the largest differences in RMSE and MAPE, 107.666 and 12.209, respectively, are between poison control center calls and emergency department visits), indicating higher week-to-week variance. Conversely, although the emergency department visit data show a slightly more modest Pearson correlation coefficient (0.511; P < .001), their mean weekly percentage error estimates are low (MAPE, 4.894%). These results show that each individual data source can contribute unique and complementary advantages to a model estimating suicide fatalities.
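The evaluation metrics listed in (1) through (3) can be computed as in the brief sketch below; the MAPE and SMAPE formulas follow their standard definitions, which we assume match those used in the study, and `population` (the US population used as the rate denominator) is an input we introduce for illustration.

```python
import numpy as np
from scipy.stats import pearsonr

def evaluate(y_true: np.ndarray, y_pred: np.ndarray, population: float) -> dict:
    """Metrics (1)-(3) for weekly estimates against actual weekly counts."""
    err = y_pred - y_true
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mape = float(100 * np.mean(np.abs(err) / y_true))
    smape = float(100 * np.mean(2 * np.abs(err)
                                / (np.abs(y_true) + np.abs(y_pred))))
    # Percentage error in the annual estimated suicide rate per 100 000.
    rate_true = 1e5 * y_true.sum() / population
    rate_pred = 1e5 * y_pred.sum() / population
    annual_pct_err = float(100 * abs(rate_pred - rate_true) / rate_true)
    return {"pearson_r": pearsonr(y_true, y_pred)[0], "rmse": rmse,
            "mape_pct": mape, "smape_pct": smape,
            "annual_rate_err_pct": annual_pct_err}
```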

Table 2 shows the performance of the 6 ensemble models that combine predictions via a neural network model. As expected, all of the ensembles improve over models that use individual data sources alone; the Pearson correlation coefficient rises to 0.811 (P < .001) for the all data sources ensemble, with a corresponding error of only 0.55% in estimating the annual rate of suicide fatalities, compared with a maximum Pearson correlation coefficient of 0.761 (and a corresponding error of 7.80%) among the individual models. The all data sources ensemble also outperforms the best performing baseline model, which makes estimates from historical suicide fatality data alone (an increase of 0.05 in the correlation coefficient and decreases of 35.351 in RMSE and 7.25 percentage points in the error of the annual estimated rate; P < .001). Specifically, the all data sources ensemble model improves the correlation while also reducing the error around weekly estimates by approximately half and the error for the annual estimate to less than one-tenth of that of the baseline model.
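A sketch of the second (ensemble) phase follows: the per-source weekly estimates from the first phase become the inputs of a small feed-forward neural network that produces the final weekly estimate. The architecture and training settings here are assumptions; the study's exact network is described in its supplement.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def fit_ensemble(first_phase_preds: np.ndarray, fatalities: np.ndarray):
    """first_phase_preds: shape (n_weeks, n_sources), one column of weekly
    estimates per data source's best intermediate model;
    fatalities: shape (n_weeks,), actual weekly counts."""
    net = MLPRegressor(hidden_layer_sizes=(8,), activation="relu",
                       max_iter=5000, random_state=0)
    net.fit(first_phase_preds, fatalities)
    return net

# Usage sketch: final 2017 estimates from per-source 2017 predictions.
# ensemble = fit_ensemble(preds_train, y_train)
# estimates_2017 = ensemble.predict(preds_2017)
```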

It is also important to note that although the ensemble of the health services data sources outperforms the online data sources ensemble in terms of correlation coefficient, the online data ensemble has a relatively low percentage error when estimating suicide fatalities: the Pearson correlation coefficient and percentage error of the annual suicide rate are 0.802 and 4.14% (P < .001), respectively, for the health services data sources and 0.633 and 2.49% (P < .001), respectively, for the online data sources. Combining these sources helps to reduce the variance of the all data sources ensemble, lowering the error rate by 3.59 percentage points relative to the health services ensemble.

Finally, the weekly counts of estimated suicide fatalities are plotted alongside actual weekly suicide fatalities in Figure 2. Additional plots showing the values of suicide fatalities estimated by the other ensembles can be found in eFigure 2 in the Supplement. The baseline univariate time series model underestimated suicide counts in our test year: suicides increased at a rate greater than that observed in prior years, and this shift was not learned by the baseline model. By contrast, our final ensemble model, which used near–real-time data, not only followed the trend but also gave estimated values of suicide fatalities much closer to the actual values.

Discussion
Implications

Our work bears several promising implications for public health. First, as mentioned above, there is currently no established way to gather real-time national information on suicide trends, which is essential for timely suicide prevention efforts. Lagged data often do not reflect unanticipated shifts, in particular, the rise in suicide fatalities in recent years. By combining information from both online, big-data sources and more traditional health data sources, we are able to achieve a fairly accurate estimation of suicide fatalities in a near–real-time fashion that is less prone to the underlying biases, idiosyncrasies, and unique characteristics of any single data source.

Second, although combining multiple streams of data via machine learning models to produce disease forecasts has emerged as a leading quantitative approach in the study of influenzalike illness and a limited number of infectious diseases,49 it has been unclear whether such approaches are translatable to noninfectious diseases such as suicide. For example, the lag between when historical data are available and when estimates are being made is often much greater for noninfectious diseases (ie, criterion-standard influenza estimates are delayed by 1-2 weeks, whereas criterion-standard estimates of suicide are delayed by 1 or more years, as noted above). Our work demonstrates a robust quantitative approach and elucidates novel data sources in a setting where criterion-standard historical data lag by a year or more, suggesting promise for real-time estimation of such noninfectious diseases.

Third, unlike some limited prior efforts that have attempted to combine data sets for suicide risk estimation,11 our framework uses ensemble modeling that allows us to go beyond simplistic time series forecasting approaches of integrating disparate data sets (eg, in linear combinations). Our machine learning framework also uses state-of-the-art machine learning methods, such as neural networks, which are able to glean those patterns (or features) embedded in the data that may otherwise be latent or not apparent in linear or polynomial regression models. Essentially, using these techniques makes our pipeline highly flexible—should new, appropriate data streams become available or existing ones cease to be useful—as well as amenable to audits and deployment.

Finally, the development and testing of our suicide fatalities estimation model mirrored real-world constraints in public health monitoring and surveillance of suicide. For this reason, we chose the Pearson correlation coefficient as the metric for selecting the best model at the intermediate stage and for reporting the eventual best models. Policy makers and public health stakeholders focus on projected increases and decreases in suicide fatalities for decision-making, and these trends therefore constitute more informative and actionable signals than metrics that optimize for overall best means (such as RMSE). In addition, for stakeholders who are not experts in machine learning, correlation coefficients can be more interpretable than metrics such as MAPE. That said, our work highlights the tradeoffs between different performance metrics in the final set of ensemble models, which may serve as helpful resources and considerations for real-world use.

Limitations

Despite the implications described above, this work has some limitations. First, caution is needed when generalizing the data sources and estimation results to other countries, to smaller geographic units, or to specific sociodemographic groups. Second, because the use patterns and norms of social media and relevant keywords change over time, models built from social media or web services may be less helpful for estimating suicide fatalities in years beyond 2017, and predictive signals from newer social media platforms may become more important going forward. Consequently, real-time, real-world deployment will require periodic retraining and incorporation of new and changing data sources that emerge as leading proxy signals for mental health.

Conclusions

Our use of multiple novel data sources within a 2-stage machine learning pipeline that uses a neural network to identify and combine features demonstrated excellent performance. Indeed, the performance improvement beyond current standard practice is considerable. The historical fatalities model, which uses only historical data to estimate future trends and which represents contemporary public health practice, would have resulted in an overall error of 7.80% if applied to estimate suicides throughout 2017, whereas the ensemble approach we present would have resulted in a 0.55% error, a roughly 14-fold improvement beyond current modeling practice. Risk factors for and drivers of suicide rates are multifactorial, which necessitates a coordinated and comprehensive public health approach. Examining suicide from the perspective of multiple data sources, each representing a unique aspect of the problem, can help inform federal support of appropriate programs and policies to prevent suicide. For example, more timely information about rapidly increasing suicide trends could enable governmental funding and support for programs to prevent suicide in a way that better keeps pace with the growing magnitude of the problem. Such efforts might include more rapidly addressing clinician shortages in mental health care; expanding crisis intervention programs, such as hotline, chat, or text services; or strengthening policies and programs that address underlying risk factors, such as economic or housing instability.9 This work advances the very first efforts toward achieving a near real-time understanding of suicide in the United States.

Article Information

Accepted for Publication: November 2, 2020.

Published: December 23, 2020. doi:10.1001/jamanetworkopen.2020.30932

Open Access: This is an open access article distributed under the terms of the CC-BY License. © 2020 Choi D et al. JAMA Network Open.

Corresponding Author: Munmun De Choudhury, PhD, School of Interactive Computing, Georgia Institute of Technology, 756 W Peachtree St NW, Atlanta, GA 30308 (munmund@gatech.edu).

Author Contributions: Drs Choi and De Choudhury had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.

Concept and design: Choi, Sumner, Holland, Bowen, Taylor, De Choudhury.

Acquisition, analysis, or interpretation of data: Choi, Sumner, Holland, Draper, Murphy, Bowen, Zwald, Wang, Law, Taylor, Konjeti.

Drafting of the manuscript: Choi, Sumner, Holland, De Choudhury.

Critical revision of the manuscript for important intellectual content: Sumner, Holland, Draper, Murphy, Bowen, Zwald, Wang, Law, Taylor, Konjeti, De Choudhury.

Statistical analysis: Choi, Sumner, Bowen, Wang, Law, De Choudhury.

Obtained funding: Sumner, De Choudhury.

Administrative, technical, or material support: Choi, Sumner, Holland, Draper, Taylor, De Choudhury.

Supervision: Sumner, Holland, Bowen, Law, De Choudhury.

Conflict of Interest Disclosures: Dr De Choudhury reported receiving grants from Centers for Disease Control and Prevention (CDC) during the conduct of the study and grants from the National Science Foundation, National Institutes of Health, Intelligence Advanced Research Projects Activity, Microsoft Corporation, Mozilla Corporation, Everytown for Gun Safety, and Northwell Health outside the submitted work. No other disclosures were reported.

Funding/Support: This study was supported through purchase order 75D30118P01967 from the CDC via the Department of Health and Human Services (Georgia Institute of Technology; principal investigator, Dr De Choudhury).

Role of the Funder/Sponsor: Investigators from the CDC participated in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

Disclaimer: The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the CDC.

References
1.
World Health Organization. World health statistics 2019: monitoring health for the SDGs, sustainable development goals. Published May 21, 2019. Accessed January 21, 2020. https://www.who.int/publications/i/item/world-health-statistics-2019-monitoring-health-for-the-sdgs-sustainable-development-goals
2.
Centers for Disease Control and Prevention. National Vital Statistics System. Death rates for selected causes by 10-year age groups, race, and sex: death registration states, 1900-32, and United States, 1933-98. Published June 24, 2019. Accessed September 4, 2019. https://wonder.cdc.gov/
3.
Hedegaard H, Curtin SC, Warner M. Increase in suicide mortality in the United States, 1999-2018. NCHS Data Brief. 2020;(362).
4.
Spencer M, Ahmad F. Timeliness of death certificate data for mortality surveillance and provisional estimates. Report 001. Published December 2016. Accessed January 31, 2020. https://www.cdc.gov/nchs/data/vsrr/report001.pdf
5.
Centers for Disease Control and Prevention, National Center for Injury Prevention and Control. WISQARS: Web-based Injury Statistics Query and Reporting System. Reviewed July 1, 2020. Accessed September 4, 2019. https://www.cdc.gov/injury/wisqars/index.html
6.
Hanzlick R. Medical examiners, coroners, and public health: a review and update. Arch Pathol Lab Med. 2006;130(9):1274-1282.
7.
National Research Council. Strengthening Forensic Science in the United States: A Path Forward. National Academies Press; 2009.
8.
Stone DM, Holland KM, Bartholow B, et al. Deciphering suicide and other manners of death associated with drug intoxication: a Centers for Disease Control and Prevention consultation meeting summary. Am J Public Health. 2017;107(8):1233-1239. doi:10.2105/AJPH.2017.303863
9.
Stone D, Holland K, Bartholow B, Crosby A, Davis S, Wilkins N. Preventing suicide: a technical package of policies, programs, and practice. Published 2017. Accessed January 31, 2020. https://www.cdc.gov/violenceprevention/pdf/suicideTechnicalPackage.pdf
10.
Durkheim E; Spaulding JA, Simpson G, trans. Suicide: A Study in Sociology. Free Press; 1951.
11.
Won HH, Myung W, Song GY, et al. Predicting national suicide numbers with social media data. PLoS One. 2013;8(4):e61809. doi:10.1371/journal.pone.0061809
12.
Jashinsky J, Burton SH, Hanson CL, et al. Tracking suicide risk factors through Twitter in the US. Crisis. 2014;35(1):51-59. doi:10.1027/0227-5910/a000234
13.
Coppersmith GA, Harman CT, Dredze MH. Measuring post traumatic stress disorder in Twitter. International Conference on Weblogs and Social Media. Published 2014. Accessed January 31, 2020. https://www.qntfy.com/static/papers/ptsd_in_twitter.pdf
14.
O'Dea B, Wan S, Batterham PJ, Calear AL, Paris C, Christensen H. Detecting suicidality on Twitter. Internet Interv. 2015;2(2):183-188. doi:10.1016/j.invent.2015.03.005
15.
Amir S, Dredze M, Ayers JW. Population level mental health surveillance over social media with digital cohorts. NAACL Workshop on Computational Linguistics and Clinical Psychology. Published 2019. Accessed January 31, 2020. https://pdfs.semanticscholar.org/0e40/f5c3f67912d78b3a38396c5692b7e5ba1220.pdf
16.
De Choudhury M, Gamon M, Counts S, Horvitz E. Predicting depression via social media. Seventh International AAAI Conference on Weblogs and Social Media. Published 2013. Accessed January 31, 2020. https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/icwsm_13.pdf
17.
De Choudhury M, Counts S, Horvitz E. Social media as a measurement tool of depression in populations. In: Proceedings of the 5th Annual ACM Web Science Conference; 2013:47-56.
18.
De Choudhury M, Kiciman E, Dredze M, Coppersmith G, Kumar M. Discovering shifts to suicidal ideation from mental health content in social media. In: Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems; 2016:2098-2110.
19.
Burnap P, Colombo W, Scourfield J. Machine classification and analysis of suicide-related communication on Twitter. In: Proceedings of the 26th ACM Conference on Hypertext and Social Media; 2015:75-84.
20.
Robinson J, Cox G, Bailey E, et al. Social media and suicide prevention: a systematic review. Early Interv Psychiatry. 2016;10(2):103-121. doi:10.1111/eip.12229
21.
Reece AG, Danforth CM. Instagram photos reveal predictive markers of depression. EPJ Data Sci. 2017;6:15. doi:10.1140/epjds/s13688-017-0110-z
22.
Coppersmith G, Leary R, Crutchley P, Fine A. Natural language processing of social media as screening for suicide risk. Biomed Inform Insights. 2018;10:1178222618792860. doi:10.1177/1178222618792860
23.
Eichstaedt JC, Smith RJ, Merchant RM, et al. Facebook language predicts depression in medical records. Proc Natl Acad Sci U S A. 2018;115(44):11203-11208. doi:10.1073/pnas.1802331115
24.
Birnbaum ML, Ernala SK, Rizvi AF, et al. Detecting relapse in youth with psychotic disorders utilizing patient-generated and patient-contributed digital data from Facebook. NPJ Schizophr. 2019;5(1):17. doi:10.1038/s41537-019-0085-9
25.
Olteanu A, Castillo C, Diaz F, Kiciman E. Social data: biases, methodological pitfalls, and ethical boundaries. Front Big Data. 2019;2:13. doi:10.3389/fdata.2019.00013
26.
Ruths D, Pfeffer J. Social sciences: social media for large studies of behavior. Science. 2014;346(6213):1063-1064. doi:10.1126/science.346.6213.1063
27.
Tufekci Z. Big questions for social media big data: representativeness, validity and other methodological pitfalls. In: Proceedings of the 8th International AAAI Conference on Weblogs and Social Media; 2014.
28.
Boyd D, Crawford K. Critical questions for big data: provocations for a cultural, technological, and scholarly phenomenon. Inf Commun Soc. 2012;15(5):662-679. doi:10.1080/1369118X.2012.678878
29.
Yang AC, Tsai SJ, Huang NE, Peng CK. Association of internet search trends with suicide death in Taipei City, Taiwan, 2004-2009. J Affect Disord. 2011;132(1-2):179-184. doi:10.1016/j.jad.2011.01.019
30.
Arora VS, Stuckler D, McKee M. Tracking search engine queries for suicide in the United Kingdom, 2004-2013. Public Health. 2016;137:147-153. doi:10.1016/j.puhe.2015.10.015
31.
McCarthy MJ. Internet monitoring of suicide risk in the population. J Affect Disord. 2010;122(3):277-279. doi:10.1016/j.jad.2009.08.015
32.
Lazer D, Kennedy R, King G, Vespignani A. Big data: the parable of Google flu: traps in big data analysis. Science. 2014;343(6176):1203-1205. doi:10.1126/science.1248506
33.
Olson DR, Konty KJ, Paladini M, Viboud C, Simonsen L. Reassessing Google Flu Trends data for detection of seasonal and pandemic influenza: a comparative epidemiological study at three geographic scales. PLoS Comput Biol. 2013;9(10):e1003256. doi:10.1371/journal.pcbi.1003256
34.
Lee KS, Lee H, Myung W, et al. Advanced daily prediction model for national suicide numbers with social media data. Psychiatry Investig. 2018;15(4):344-354. doi:10.30773/pi.2017.10.15
35.
Kessler RC, Borges G, Walters EE. Prevalence of and risk factors for lifetime suicide attempts in the National Comorbidity Survey. Arch Gen Psychiatry. 1999;56(7):617-626. doi:10.1001/archpsyc.56.7.617
36.
Reeves A, Stuckler D, McKee M, Gunnell D, Chang SS, Basu S. Increase in state suicide rates in the USA during economic recession. Lancet. 2012;380(9856):1813-1814. doi:10.1016/S0140-6736(12)61910-2
37.
Chou WY, Hunt YM, Beckjord EB, Moser RP, Hesse BW. Social media use in the United States: implications for health communication. J Med Internet Res. 2009;11(4):e48. doi:10.2196/jmir.1249
38.
De Choudhury M, Morris MR, White RW. Seeking and sharing health information online: comparing search engines and social media. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems; 2014:1365-1376.
39.
Zwald ML, Holland KM, Annor FB, et al. Syndromic surveillance of suicidal ideation and self-directed violence—United States, January 2017–December 2018. MMWR Morb Mortal Wkly Rep. 2020;69(4):103. doi:10.15585/mmwr.mm6904a3
40.
National Syndromic Surveillance Program (NSSP). Sending early warning signals from emergency departments to public health. Reviewed August 12, 2020. Accessed September 4, 2019. https://www.cdc.gov/nssp/index.html
41.
National Suicide Prevention Lifeline. Published December 6, 2004. Accessed September 4, 2019. https://suicidepreventionlifeline.org/
42.
American Association of Poison Control Centers. National Poison Data System (NPDS). Accessed January 31, 2020. https://aapcc.org/national-poison-data-system
43.
Federal Reserve Bank of St. Louis. FRED: economic research. Published 2014. Accessed September 4, 2019. https://fred.stlouisfed.org/
44.
Sunrise and Sunset API. Published 1991. Accessed September 4, 2019. https://sunrise-sunset.org/api
45.
Chu C-SJ. Time series segmentation: a sliding window approach. Inf Sci. 1995;85(1):147-173. doi:10.1016/0020-0255(95)00021-G
46.
Zou H, Hastie T. Regularization and variable selection via the elastic net. J R Stat Soc B. 2005;67:301-320. doi:10.1111/j.1467-9868.2005.00503.x
47.
Smola AJ, Scholkopf B. A tutorial on support vector regression. Stat Comput. 2004;14(3):199-222. doi:10.1023/B:STCO.0000035301.49549.88
48.
van der Laan MJ, Polley EC, Hubbard AE. Super learner. Stat Appl Genet Mol Biol. Published online September 16, 2007. doi:10.2202/1544-6115.1309
49.
Dugas AF, Jalalpour M, Gel Y, et al. Influenza forecasting with Google flu trends. PLoS One. 2013;8(2):e56176. doi:10.1371/journal.pone.0056176