Trends and Focus of Machine Learning Applications for Health Research
    Original Investigation
    Health Informatics
    October 25, 2019

    Trends and Focus of Machine Learning Applications for Health Research

    Author Affiliations
    • 1 Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts
    • 2 Harvard Medical School, Boston, Massachusetts
    • 3 Predictive Health Care Group, University of Pennsylvania Health System, Philadelphia
    • 4 MIT Computer Science and Artificial Intelligence Lab, Boston, Massachusetts
    • 5 Department of Medicine, Imperial College London, London, United Kingdom
    • 6 Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
    • 7 College of Information and Computer Sciences, University of Massachusetts, Amherst
    • 8 Microsoft Research, Redmond, Washington
    JAMA Netw Open. 2019;2(10):e1914051. doi:10.1001/jamanetworkopen.2019.14051
    Key Points

    Question  What topics are researchers in machine learning focused on and what methods and data sets do they use?

    Findings  This qualitative analysis of 166 accepted manuscript submissions to the Third Annual Machine Learning for Health workshop at the 32nd Conference on Neural Information Processing Systems found that easy-to-access, well-annotated data increased machine learning research within specific health domains (58.4% of submissions). Clinicians were involved in only a minority of submissions (34.9%).

    Meaning  This analysis suggests that the interdisciplinary field of machine learning for health may be accelerated by easy-to-access, well-annotated data and would benefit from greater clinician involvement to develop into translational applications.


    Importance  The use of machine learning applications related to health is rapidly increasing and may have the potential to profoundly affect the field of health care.

    Objective  To analyze submissions to a popular machine learning for health venue to assess the current state of research, including areas of methodologic and clinical focus, limitations, and underexplored areas.

    Design, Setting, and Participants  In this data-driven qualitative analysis, 166 accepted manuscript submissions to the Third Annual Machine Learning for Health workshop at the 32nd Conference on Neural Information Processing Systems on December 8, 2018, were analyzed to understand research focus, progress, and trends. Experts reviewed each submission against a rubric to identify key data points, statistical modeling and analysis of submitting authors was performed, and research topics were quantitatively modeled. Finally, an iterative discussion of topics common in submissions and invited speakers at the workshop was held to identify key trends.

    Main Outcomes and Measures  Frequency and statistical measures of methods, topics, goals, and author attributes were derived from an expert review of submissions guided by a rubric.

    Results  Of the 166 accepted submissions, 58 (34.9%) had clinician involvement and 83 submissions (50.0%) that focused on clinical practice included clinical collaborators. A total of 97 data sets (58.4%) used in submissions were publicly available or required a standard registration process. Clinical practice was the most common application area (70 manuscripts [42.2%]), with brain and mental health (25 [15.1%]), oncology (21 [12.7%]), and cardiovascular (19 [11.4%]) being the most common specialties.
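
    The reported percentages can be checked against the reported counts. The sketch below is illustrative only and is not the authors' analysis code. It assumes every percentage uses the 166 accepted submissions as its denominator, which the abstract implies but does not state explicitly for the data set figure. The names `reported` and `percent` are hypothetical.

```python
# Counts and percentages as stated in the Results paragraph.
# Assumption: the denominator is always the 166 accepted submissions.
TOTAL_SUBMISSIONS = 166

reported = {
    "clinician involvement": (58, "34.9"),
    "clinical collaborators": (83, "50.0"),
    "public or standard-registration data sets": (97, "58.4"),
    "clinical practice application area": (70, "42.2"),
    "brain and mental health": (25, "15.1"),
    "oncology": (21, "12.7"),
    "cardiovascular": (19, "11.4"),
}


def percent(count, total=TOTAL_SUBMISSIONS):
    """Return count/total as a percentage string with one decimal place."""
    return f"{100 * count / total:.1f}"


for label, (count, stated) in reported.items():
    computed = percent(count)
    status = "matches" if computed == stated else "differs from"
    print(f"{label}: {count}/{TOTAL_SUBMISSIONS} = {computed}% ({status} reported {stated}%)")
```

    Under that assumption, each computed value agrees with the percentage given in the text.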

    Conclusions and Relevance  Trends in machine learning for health research indicate the importance of well-annotated, easily accessed data and the benefit from greater clinician involvement in the development of translational applications.