March 20, 2023

Machine Learning and Statistics in Clinical Research Articles—Moving Past the False Dichotomy

Author Affiliations
  • 1Department of Pediatrics, Seattle Children’s Hospital, Seattle, Washington
  • 2Department of Genetics, University of Washington, Seattle
  • 3Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
  • 4Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
JAMA Pediatr. 2023;177(5):448-450. doi:10.1001/jamapediatrics.2023.0034

Medical artificial intelligence (AI) and machine learning have progressed rapidly over the past decade, yielding many new products that clinicians must increasingly learn to integrate into clinical practice.1 A common question is, how do AI and machine learning relate to more familiar work from medical statistics?

In the summer of 1956, a group of computer scientists gathered at Dartmouth for a 2-month workshop to discuss what organizer John McCarthy termed artificial intelligence: “the science and engineering of making intelligent machines.”2 From the outset, AI attracted researchers from diverse backgrounds including neuroscience, telecommunications, and formal logic. The field was defined not by any specific methodologic approach but rather by the shared goal of enabling computers to solve new tasks.3 Machine learning is the subfield involving a data-driven approach to AI and received its name from Dartmouth workshop attendee Arthur Samuel, who is credited with coining the term machine learning while discussing his work at IBM building a computer that plays checkers.4 The core premise of machine learning is that a feasible path toward an intelligent computer is to build a learning computer: a machine that improves from experience and exposure to data.

1 Comment for this article
Bridging the divide
A Wilson, PhD, MStat | University of Utah and Parexel
This article perfectly neutralizes the criticism that statistics and machine learning are "non-overlapping magisteria" and should be treated as somehow categorically different. As the authors describe, moving beyond this false dichotomy is essential to brokering a peace agreement among the methods. But this bridge is also necessary to move forward with an emerging methodology crucial to progress in modern medical research: causal inference. Causal inference focuses on establishing causal relationships between variables by using a combination of statistical methods and domain knowledge. It has greatly benefited from integrating machine learning techniques, allowing for more complex modeling of causal relationships, handling high-dimensional data, and addressing issues such as confounding and endogeneity. Causal inference has become increasingly important in medical AI and machine learning, as it helps researchers understand the underlying mechanisms of observed associations and make informed decisions based on causal effects.

So, more than bridging the divide, we may need to encourage a marriage between machine learning and statistical inference to move forward in a meaningful way.
CONFLICT OF INTEREST: I work for Parexel, a large CRO. However, I have no financial interests that would present a conflict.