Table 1.  Demographic Characteristics of Patients
Table 2.  Permutation Feature Scores
Original Investigation
May 2, 2019

Development and Assessment of a Machine Learning Model to Help Predict Survival Among Patients With Oral Squamous Cell Carcinoma

Author Affiliations
  • 1Department of Otolaryngology–Head and Neck Surgery, University of Kansas Medical Center, Kansas City
  • 2University of Kansas Medical Center, School of Medicine, Kansas City
JAMA Otolaryngol Head Neck Surg. 2019;145(12):1115-1120. doi:10.1001/jamaoto.2019.0981
Key Points

Question  How can machine learning be used to further our ability to create prediction models for survival of oral cancer?

Findings  In this cohort study of more than 30 000 patients, a prediction model incorporating a variety of patient, tumor, treatment facility, and treatment characteristics predicted 5-year overall survival with an accuracy of 71%, precision of 71%, and recall of 68%.

Meaning  Novel forms of machine learning analysis may aid the creation of prediction models from large data registries, and including several aspects of the health status of a patient with cancer can yield more accurate survival predictions.

Abstract

Importance  Prediction modeling for survival of oral squamous cell carcinoma has been underused, and the development of prediction models would augment clinicians’ ability to provide absolute risk estimates for individual patients.

Objectives  To develop a prediction model using machine learning for 5-year overall survival among patients with oral squamous cell carcinoma and compare this model with a prediction model created from the TNM (Tumor, Node, Metastasis) clinical and pathologic stage.

Design, Setting, and Participants  A retrospective cohort study was conducted of 33 065 patients with oral squamous cell carcinoma from the National Cancer Data Base between January 1, 2004, and December 31, 2011. Patients were excluded if the treatment was considered palliative, staging demonstrated T0 or Tis, or survival or staging data were missing. Patient, tumor, treatment, and outcome information were obtained from the National Cancer Data Base. The data were split into a distribution of 80% for training and 20% for testing. The model was created using 2-class decision forest architecture. Permutation feature importance scores were used to determine the variables that were used in the model’s prediction and their order of significance. Statistical analysis was conducted from August 1, 2018, to January 10, 2019.

Main Outcomes and Measures  Ability to predict 5-year overall survival assessed through area under the curve, accuracy, precision, and recall.

Results  Among the 33 065 patients in the study, the mean (SD) age was 64.6 (14.0) years, 19 791 were men (59.9%), 13 274 were women (40.1%), and 29 783 (90.1%) were white. At 60 months, there were 16 745 deaths (50.6%). The median time of follow-up was 56.8 months (range, 0-155.6 months). Age, pathologic T stage, positive margins at the time of surgery, lymph node size, and institutional identification were identified among the most significant variables. The calculated area under the curve for this machine learning model was 0.80 (95% CI, 0.79-0.81), accuracy was 71%, precision was 71%, and recall was 68%. In comparison, the calculated area under the curve of the TNM staging system was 0.68 (95% CI, 0.67-0.70), accuracy was 65%, precision was 69%, and recall was 52%.

Conclusions and Relevance  Using machine learning algorithms, a prediction model was created based on patient social, demographic, clinical, and pathologic features. The developed prediction model proved to be better than a prediction model that exclusively used TNM pathologic and clinical stage according to all performance metrics. This study highlights the role that machine learning may play in individual patient risk estimation in the era of big data.

Introduction

Oral squamous cell carcinoma (OSCC) is a significant health problem globally.1,2 Prognostic indicators for OSCC have been investigated, and nodal metastasis has frequently been found to be the most significant prognostic indicator.3,4 Other investigations have suggested that female sex, metastasis, advanced age, extracapsular spread, perineural invasion, and tobacco use are additional significant clinicopathologic variables associated with survival.3,5-9 Despite the identification of prognostically significant variables, absolute risk estimates for individual patients remain understudied and are not commonly used in counseling patients.10,11

Development of clinical prediction models is crucial to enhance a clinician’s ability to provide absolute risk estimates.10,11 Clinicians rely on the Tumor, Node, Metastasis (TNM) classification to convey the gravity of cancer diagnosis and prognosis.12 With the increasing availability of large national databases and computing power, the amount of data input has increased, allowing for novel approaches to data analysis.13 One such method includes machine learning, a subdiscipline of artificial intelligence that helps researchers analyze large amounts of data to find patterns to better solve problems by making predictions.14 The use of machine learning in health care has been rapidly growing during the past decade.15,16 In clinical research, machine learning is being used to enhance current prediction modeling by providing more accurate and precise predictions for outcomes of interest.13,17-20 As the availability of patient, pathologic, and genetic information increases, machine learning may prove to be a novel tool for predicting survival.21,22

The goal of this study was to use machine learning to develop a model that predicts 5-year overall survival among patients with OSCC. With machine learning, we anticipate the ability to incorporate multiple clinical and pathologic variables to create an improved and personalized prediction model to better counsel patients.

Methods
Data Source

The National Cancer Data Base (NCDB) is a clinical oncology database sourced from hospital registry data and jointly sponsored by the American College of Surgeons and the American Cancer Society. Data are collected from more than 1500 Commission on Cancer–accredited facilities and capture more than 70% of newly diagnosed cancers nationwide.23,24 The study was granted exemption by the Kansas University Medical Center Institutional Review Board because the database is publicly available to participating sites and all patient information is deidentified.

Study Population

Patients with OSCC diagnosed between January 1, 2004, and December 31, 2011, were identified in the NCDB, allowing a follow-up time of at least 5 years. The included anatomical subsites of the oral cavity were oral lip, floor of mouth, gum of mouth, anterior two-thirds of tongue, hard palate, and buccal mucosa. Patients were excluded if the tumor was staged T0 or Tis, if treatment was palliative, or if critical information, such as survival or staging data, was missing.

Covariates of Interest

Variables studied were divided into 4 categories: patient, facility, tumor, and treatment characteristics. Patient characteristics included age, sex, race/ethnicity, and comorbid disease as quantified by the Charlson-Deyo score.25 Insurance status, educational level, median household income, rural or urban residence, and distance from the hospital were also collected and included in the analysis. Facility characteristics included whether the treating hospital was a community program, academic program, integrated network cancer program, or other, and its geographic location, which the NCDB divides into New England, Middle Atlantic, South Atlantic, East North Central, East South Central, West North Central, West South Central, Mountain, and Pacific. Tumor characteristics included the T, N, and M scores and stage group, determined both clinically and pathologically, with the TNM edition number taken into consideration. Additional tumor variables analyzed included tumor grade, extracapsular spread, and perineural invasion. Because information regarding metastatic disease was dispersed across multiple variables, a single new variable encompassing the overlapping variables was created. Treatment characteristics were limited to primary course information for the surgical and radiotherapy treatment modalities. Although the NCDB captures several additional aspects of treatment, limiting the treatment characteristics included allows greater clinical utility of the prediction model. The primary study end point was 5-year overall survival, calculated from the vital status variable and the date of last contact.
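As a sketch of how such an end point might be derived (the variable names and the handling of censoring here are illustrative assumptions, not the authors' exact NCDB derivation):

```python
def five_year_os_label(vital_status, months_of_followup):
    """Binary 5-year overall-survival label from vital status and follow-up.

    Returns 1 (survived 5 years), 0 (died within 5 years), or None for
    patients alive but censored before 60 months of follow-up.
    """
    if vital_status == "dead" and months_of_followup <= 60:
        return 0  # death observed within the 5-year window
    if months_of_followup >= 60:
        return 1  # alive (or died) beyond the 5-year mark
    return None   # alive, but follow-up too short to assign a label
```

Patients for whom no label can be assigned (alive with short follow-up) would be among those excluded for missing survival data.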

Missing Data

Missing data for covariates of interest were explored and categorized as missing completely at random, missing at random, or missing not at random.26,27 Variables determined to be missing at random were handled with single-value imputation of the median. No imputation was performed for variables determined to be missing not at random or for variables with more than 40% of values missing.
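The two rules above (drop heavily missing variables, median-impute the rest) can be sketched in pandas; the column names in the example are invented for illustration:

```python
import pandas as pd

def prepare_covariates(df, max_missing_frac=0.40):
    """Drop columns with more than 40% missing values (no imputation),
    then median-impute the remaining numeric columns, which are assumed
    to be missing at random."""
    out = df.loc[:, df.isna().mean() <= max_missing_frac].copy()
    for col in out.select_dtypes(include="number").columns:
        out[col] = out[col].fillna(out[col].median())
    return out
```

Note that this sketch applies one policy uniformly; the study additionally withheld imputation from variables judged missing not at random, a determination that requires domain review rather than code.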

Machine Learning Prediction Modeling

The supervised machine learning classification model was constructed using Azure Machine Learning Studio (Microsoft Corp). We randomly split the data into 80% for a training set and the remaining 20% for a test set. We considered multiple 2-class models, including decision forest, decision jungle, logistic regression, and neural network. To optimize the models, a parameter range was established that allowed the machine learning pipeline to select the best parameters from multiple combinations. Bootstrap aggregation was built into model generation for each of the fitted models. For model construction, we created 2 separate experiments: the first included all variables for model creation and the second included only the pathologic TNM stage, or the clinical stage if the pathologic stage was unavailable. The outcome of interest, 5-year overall survival, was identified as the label.
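The Azure ML Studio pipeline itself is not reproducible from the text, but the workflow (80/20 split, a bagged decision-forest classifier, and an all-variables model compared against a stage-only model) can be sketched with scikit-learn on synthetic data; the features and outcome below are invented stand-ins:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 2000
X = rng.normal(size=(n, 6))  # stand-in covariates; column 0 plays the role of "stage"
# Synthetic survival label driven by two of the six features.
y = (X[:, 0] + X[:, 1] + 0.5 * rng.normal(size=n) > 0).astype(int)

# 80% training, 20% test, mirroring the study's split.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.20, random_state=0)

# Random forests use bootstrap aggregation by default (bootstrap=True).
all_vars = RandomForestClassifier(n_estimators=32, random_state=0).fit(X_tr, y_tr)
stage_only = RandomForestClassifier(n_estimators=32, random_state=0).fit(X_tr[:, :1], y_tr)

auc_all = roc_auc_score(y_te, all_vars.predict_proba(X_te)[:, 1])
auc_stage = roc_auc_score(y_te, stage_only.predict_proba(X_te[:, :1])[:, 1])
```

On data like these, the all-variables forest outperforms the single-feature forest, which is the comparison the two experiments were designed to make.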

Performance and Validation of the Prediction Model

After the models were built, they were scored and evaluated using the test set. Model performance was measured using area under the curve (AUC), accuracy, precision, and recall, in accordance with previous recommendations for reporting results of clinical prediction models.10 The permutation feature importance scores obtained from the model on the test data provided insight into the variables most influential in the model’s prediction. Each score is calculated as the difference in model performance, measured by the AUC, before and after permutation of a given predictor variable; this process is repeated for each variable included in the model. Thus, a large absolute permutation feature importance score identifies a feature with a large effect on model performance. Finally, a prediction model using only pathologic and clinical TNM stage was created to compare the performance of the 2 models.
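The permutation procedure described here (AUC before minus AUC after shuffling one feature) can be implemented generically; the following is a minimal sketch, not Azure ML Studio's implementation:

```python
import numpy as np

def rank_auc(y_true, scores):
    """AUC as the probability that a positive case outranks a negative one."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = y_true == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def permutation_importance(predict, X, y, seed=0):
    """Per-feature drop in test-set AUC after shuffling that feature:
    a larger drop means the model leans more heavily on the feature."""
    rng = np.random.default_rng(seed)
    base = rank_auc(y, predict(X))
    drops = []
    for j in range(X.shape[1]):
        X_perm = X.copy()
        rng.shuffle(X_perm[:, j])  # sever the feature-outcome association
        drops.append(base - rank_auc(y, predict(X_perm)))
    return drops
```

A feature the model ignores yields a drop of zero, because shuffling it leaves the predictions unchanged.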

Statistical Analysis

Statistical analysis was conducted from August 1, 2018, to January 10, 2019. Data preparation was performed using SPSS, version 25 (IBM Corp), and the analysis was conducted using both SPSS and Azure Machine Learning Studio. Basic descriptive statistics were used.

Results
Study Population Characteristics and Survival Information

A total of 38 477 patients were eligible participants for this study. A total of 33 065 patients were included for analysis after excluding 4917 patients because of missing survival data and 495 patients because of missing staging information. The mean (SD) age of participants was 64.6 (14.0) years. Men comprised 59.9% of the cohort (n = 19 791). A total of 90.1% of the patients were white (n = 29 783). Full patient demographics are summarized in Table 1. Median follow-up time was 56.8 months (range, 0-155.6 months). At 60 months, 16 745 patients (50.6%) died, resulting in a 49.4% overall survival rate at 5 years.

Prediction Model Development and Validation

For model development, 80.0% (n = 26 452) of the data were randomly selected and used. The classification models explored included 2-class decision forest, 2-class decision jungle, 2-class logistic regression, and 2-class neural network. For the development of the first clinical prediction model, all variables available in the NCDB were made available for use to the machine learning model. On completion, the prediction model was applied to the test data set where performance metrics were measured. The decision forest classification was the most robust, with an AUC of 0.80 (95% CI, 0.79-0.81), accuracy of 71%, precision of 71%, and recall of 68%. In comparison, the 2-class decision forest machine learning model using only pathologic and clinical TNM staging was less accurate, with an AUC of 0.68 (95% CI, 0.67-0.70), an accuracy of 65%, precision of 69%, and recall of 52%.
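For context, the accuracy, precision, and recall reported above all derive from the test-set confusion matrix; a minimal sketch, with the counts below invented for illustration:

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, and recall from 2x2 confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),  # all correct / all cases
        "precision": tp / (tp + fp),                  # fraction of positive calls that are correct
        "recall": tp / (tp + fn),                     # fraction of true positives recovered
    }
```

For example, `classification_metrics(68, 28, 32, 72)` yields an accuracy of 0.70, a precision of about 0.71, and a recall of 0.68.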

Prediction Performance

The permutation feature importance scores provide insight into how the machine learning model weights different factors in creating its algorithm. The results are displayed in Table 2, which lists the most important features in order of importance along with their corresponding scores. In the creation of the clinical prediction model, the most important variable was patient age, followed closely by several clinical and pathologic variables, including pathologic T stage, insurance status, lymph node size, institutional identification, and positive margins at the time of surgery. The ideal parameters determined by the model were a minimum of 1 sample per leaf node, 1024 random splits per node, a maximum decision tree depth of 64, and a limit of 32 decision trees.
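As a rough illustration, these parameters map onto a scikit-learn random forest as follows; this is an approximation, not the authors' configuration, and Azure ML Studio's "random splits per node" has no direct scikit-learn equivalent:

```python
from sklearn.ensemble import RandomForestClassifier

forest = RandomForestClassifier(
    n_estimators=32,     # limit of 32 decision trees
    max_depth=64,        # maximum decision tree depth of 64
    min_samples_leaf=1,  # minimum of 1 sample per leaf node
    random_state=0,
)
```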

Discussion

In this study, a model predicting 5-year overall survival among patients diagnosed with OSCC was constructed using machine learning. To our knowledge, this is one of the largest studies using machine learning to examine survival among patients with head and neck cancer. We demonstrate that machine learning offers a novel solution to improve and personalize patient care through the development of absolute risk estimates for individual patients. By incorporating multiple factors that go beyond TNM staging, we also found improved accuracy, precision, and recall in predicting overall survival among patients with OSCC.

The use of TNM characteristics has been a cornerstone of clinical practice given their simplicity and relative accuracy; however, the use of TNM characteristics can be improved by incorporating both clinical and pathologic variables. In an era in which vast amounts of electronic patient health data are available, machine learning could be incorporated into electronic health records to provide clinicians with valuable evidence-based prognostic information. In this study, the top variables used by the machine learning model to predict survival were largely a combination of demographic, pathologic, and treatment variables. This finding signifies the importance of incorporating variables that can more holistically describe patients in future efforts to improve prediction or prognostic modeling. This finding is consistent with several studies that have evaluated the clinical importance of demographic, socioeconomic, and tumor-specific factors for patients with head and neck cancer.5,6,28-31

The rapid growth of large registries that capture several dimensions of health data has been beneficial in the attempts to further improve prediction modeling. However, strict adherence to more traditional statistical methods, such as Cox proportional hazards regression, logistic regression, and Kaplan-Meier estimates, may slow the progress of prediction models. This possibility is demonstrated by the inability of the aforementioned methods to handle medical data with high variability, nonlinear interactions, and heterogeneous distributions.15,32 Machine learning techniques may be a more suitable form of analysis in this context, as they have been demonstrated to handle large data sets with complex, nonlinear, heterogeneous distributions.15,18,32 The unique characteristic of machine learning that allows for this advantage is its ability to apply Boolean logic, absolute conditionality, conditional probabilities, and other unconventional strategies to model data while still drawing heavily from statistics and probability.15

Limitations

Machine learning is not without its drawbacks. Data that are incorrectly or poorly classified will affect the quality of the model.21 Our ability to capture patient health information, and to assess its effect comprehensively and efficiently, is constantly improving. One such example is the method by which cancer registries capture comorbidity. The NCDB uses the Deyo adaptation of the Charlson Comorbidity Index to quantify the extent of comorbidity burden.25 Newer tools that better capture comorbidity have been developed, such as the Adult Comorbidity Evaluation-27,33 and the use of better clinical assessment tools leads to more accurate predictions.34 As in traditional statistics, modeling data with too few events relative to the number of predictor variables is also a pitfall that must be appreciated.21 Another limitation of machine learning is the lack of transparency in the analysis: machine learning involves multiple layers of analysis to make a meaningful prediction,13 and these layers often cannot be meaningfully interpreted.35

In this study, there were several limitations to be considered. Primarily, a common issue in our data was inconsistent or inaccurate reporting. Specifically, there were several instances in which 2 or more variables measuring related aspects of a patient contained seemingly contradictory information, such as descriptions of tumor or lymph node size, presence of metastatic disease, and treatment type. Furthermore, several variables, such as lymphovascular invasion and human papillomavirus status, had missing data in more than 50% of cases, which limited our ability to assess the potential effect of these variables on the prediction of survival. Future efforts to ensure capture of such variables are needed to further our understanding of their potential effect.

In addition, use of machine learning to predict overall survival among patients with cancer raises several ethical questions that must be answered prior to widespread implementation.14 First, will there be unintended consequences of accurate prognostication about individuals’ survival? Are clinicians and, in turn, patients less likely to pursue curative treatment in the face of a low predicted probability of survival? Who is responsible when predictive models are incorrect: the clinician who uses the model or the developer of the predictive model? Would patients even want to know their likelihood of being alive or dead in 5 years? Many important ethical questions remain unanswered about the increasing role of automated algorithms in health care.

There are numerous ways in which machine learning algorithms could be used in a real-world clinical setting. First, an interface to the machine learning model could be published on a website; this feature is supported by Azure Machine Learning Studio and would allow clinicians to enter individual patient data into a web-based form, from which the decision forest created in this analysis would predict 5-year overall survival. Probably the most effective approach, and the most difficult to achieve, would be direct integration of machine learning into the electronic medical record (EMR).36

A model developed using machine learning presents interesting and novel challenges to its incorporation in clinical practice. Key obstacles to the widespread adoption of this type of algorithm include convenience, regulation, and financial considerations. Multivariable models can be burdensome to use and are unlikely to gain popularity if they require manual entry of patient data. Multivariable models developed using machine learning differ in that artificial intelligence may allow automatic extrapolation of variables from the EMR. If included in the EMR, a new feature allowing artificial intelligence to interface with patient records may be the most convenient way to incorporate machine learning algorithms into clinical practice. This scenario would present significant regulatory challenges to ensure the compliance of machine learning with the Health Insurance Portability and Accountability Act. Furthermore, the incorporation of machine learning into different EMR products would require a clear financial benefit to EMR developers, who may currently be disincentivized from adopting these technologies.36

Conclusions

Using a machine learning algorithm, a survival prediction model was created implementing a variety of patient, clinical, tumor, facility, and treatment variables as collected by the NCDB. The created prediction model was compared with a prediction model that used only clinical and pathologic TNM stage and was found to better predict overall survival. This study highlights the importance of a more holistic approach of describing patients and how, in the era of large databases, machine learning and artificial intelligence may stand to play a significant role in improving health care by furthering our ability to quantify individual patient risk estimates.

Article Information

Accepted for Publication: April 1, 2019.

Corresponding Author: Omar A. Karadaghy, MD, Department of Otolaryngology–Head and Neck Surgery, University of Kansas Medical Center, 3901 Rainbow Blvd, Mail Stop 3010, Kansas City, KS 66160 (omar.karadaghy@gmail.com).

Published Online: May 2, 2019. doi:10.1001/jamaoto.2019.0981

Author Contributions: Drs Karadaghy and Bur had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.

Concept and design: All authors.

Acquisition, analysis, or interpretation of data: All authors.

Drafting of the manuscript: Karadaghy, Shew, Bur.

Critical revision of the manuscript for important intellectual content: All authors.

Statistical analysis: Karadaghy, Shew, New.

Administrative, technical, or material support: Karadaghy, New.

Supervision: Shew, Bur.

Conflict of Interest Disclosures: None reported.

Meeting Presentation: This paper was presented at the Annual Meeting of the American Head & Neck Society; May 2, 2019; Austin, Texas.

References
1.
Funk GF, Karnell LH, Robinson RA, Zhen WK, Trask DK, Hoffman HT. Presentation, treatment, and outcome of oral cavity cancer: a National Cancer Data Base report. Head Neck. 2002;24(2):165-180. doi:10.1002/hed.10004
2.
Rivera C. Essentials of oral cancer. Int J Clin Exp Pathol. 2015;8(9):11884-11894.
3.
Torrecillas V, Shepherd HM, Francis S, et al. Adjuvant radiation for T1-2N1 oral cavity cancer survival outcomes and utilization treatment trends: analysis of the SEER database. Oral Oncol. 2018;85:1-7. doi:10.1016/j.oraloncology.2018.07.019
4.
Camisasca DR, Honorato J, Bernardo V, et al. Expression of Bcl-2 family proteins and associated clinicopathologic factors predict survival outcome in patients with oral squamous cell carcinoma. Oral Oncol. 2009;45(3):225-233. doi:10.1016/j.oraloncology.2008.05.021
5.
Girod A, Mosseri V, Jouffroy T, Point D, Rodriguez J. Women and squamous cell carcinomas of the oral cavity and oropharynx: is there something new? J Oral Maxillofac Surg. 2009;67(9):1914-1920. doi:10.1016/j.joms.2009.04.031
6.
Thiagarajan S, Nair S, Nair D, et al. Predictors of prognosis for squamous cell carcinoma of oral tongue. J Surg Oncol. 2014;109(7):639-644. doi:10.1002/jso.23583
7.
Liu F, Chen F, Huang J, et al. Prospective study on factors affecting the prognosis of oral cancer in a Chinese population. Oncotarget. 2017;8(3):4352-4359. doi:10.18632/oncotarget.13842
8.
Jadhav KB, Gupta N. Clinicopathological prognostic implicators of oral squamous cell carcinoma: need to understand and revise. N Am J Med Sci. 2013;5(12):671-679. doi:10.4103/1947-2714.123239
9.
Bur AM, Lin A, Weinstein GS. Adjuvant radiotherapy for early head and neck squamous cell carcinoma with perineural invasion: a systematic review. Head Neck. 2016;38(suppl 1):E2350-E2357. doi:10.1002/hed.24295
10.
Steyerberg EW, Vergouwe Y. Towards better clinical prediction models: seven steps for development and an ABCD for validation. Eur Heart J. 2014;35(29):1925-1931. doi:10.1093/eurheartj/ehu207
11.
Moons KG, Altman DG, Vergouwe Y, Royston P. Prognosis and prognostic research: application and impact of prognostic models in clinical practice. BMJ. 2009;338:b606. doi:10.1136/bmj.b606
12.
Piccirillo  JF.  Purposes, problems, and proposals for progress in cancer staging.  Arch Otolaryngol Head Neck Surg. 1995;121(2):145-149. doi:10.1001/archotol.1995.01890020009003PubMedGoogle ScholarCrossref
13.
Kourou  K, Exarchos  TP, Exarchos  KP, Karamouzis  MV, Fotiadis  DI.  Machine learning applications in cancer prognosis and prediction.  Comput Struct Biotechnol J. 2014;13:8-17. doi:10.1016/j.csbj.2014.11.005PubMedGoogle ScholarCrossref
14.
Bur  AM, Shew  M, New  J.  Artificial intelligence for the otolaryngologist: a state of the art review.  Otolaryngol Head Neck Surg. 2019;160(4):603-611. doi:10.1177/0194599819827507PubMedGoogle ScholarCrossref
15.
Cruz  JA, Wishart  DS.  Applications of machine learning in cancer prediction and prognosis.  Cancer Inform. 2007;2:59-77.PubMedGoogle Scholar
16.
Thrall  JH, Li  X, Li  Q,  et al.  Artificial intelligence and machine learning in radiology: opportunities, challenges, pitfalls, and criteria for success.  J Am Coll Radiol. 2018;15(3, pt B):504-508. doi:10.1016/j.jacr.2017.12.026PubMedGoogle ScholarCrossref
17.
Gupta  S, Tran  T, Luo  W,  et al.  Machine-learning prediction of cancer survival: a retrospective study using electronic administrative records and a cancer registry.  BMJ Open. 2014;4(3):e004007. doi:10.1136/bmjopen-2013-004007PubMedGoogle ScholarCrossref
18.
McCarthy  JF, Marx  KA, Hoffman  PE,  et al.  Applications of machine learning and high-dimensional visualization in cancer detection, diagnosis, and management.  Ann N Y Acad Sci. 2004;1020:239-262. doi:10.1196/annals.1310.020PubMedGoogle ScholarCrossref
19.
Obermeyer  Z, Emanuel  EJ.  Predicting the future—big data, machine learning, and clinical medicine.  N Engl J Med. 2016;375(13):1216-1219. doi:10.1056/NEJMp1606181PubMedGoogle ScholarCrossref
20.
Shew  M, New  J, Bur  AM.  Machine learning to predict delays in adjuvant radiation following surgery for head and neck cancer  [published online January 29, 2019].  Otolaryngol Head Neck Surg. doi:10.1177/0194599818823200PubMedGoogle Scholar
21.
Shah  ND, Steyerberg  EW, Kent  DM.  Big data and predictive analytics: recalibrating expectations.  JAMA. 2018;320(1):27-28. doi:10.1001/jama.2018.5602PubMedGoogle ScholarCrossref
22.
Hinton  G.  Deep learning—a technology with the potential to transform health care.  JAMA. 2018;320(11):1101-1102. doi:10.1001/jama.2018.11100PubMedGoogle ScholarCrossref
23.
Bilimoria  KY, Stewart  AK, Winchester  DP, Ko  CY.  The National Cancer Data Base: a powerful initiative to improve cancer care in the United States.  Ann Surg Oncol. 2008;15(3):683-690. doi:10.1245/s10434-007-9747-3PubMedGoogle ScholarCrossref
24.
Bilimoria  KY, Bentrem  DJ, Stewart  AK, Winchester  DP, Ko  CY.  Comparison of Commission on Cancer–approved and –nonapproved hospitals in the United States: implications for studies that use the National Cancer Data Base.  J Clin Oncol. 2009;27(25):4177-4181. doi:10.1200/JCO.2008.21.7018PubMedGoogle ScholarCrossref
25.
Deyo  RA, Cherkin  DC, Ciol  MA.  Adapting a clinical comorbidity index for use with ICD-9-CM administrative databases.  J Clin Epidemiol. 1992;45(6):613-619. doi:10.1016/0895-4356(92)90133-8PubMedGoogle ScholarCrossref
26.
Graham  JW.  Missing data analysis: making it work in the real world.  Annu Rev Psychol. 2009;60:549-576. doi:10.1146/annurev.psych.58.110405.085530PubMedGoogle ScholarCrossref
27.
Pedersen  AB, Mikkelsen  EM, Cronin-Fenton  D,  et al.  Missing data and multiple imputation in clinical epidemiological research.  Clin Epidemiol. 2017;9:157-166. doi:10.2147/CLEP.S129785PubMedGoogle ScholarCrossref
28.
Karadaghy  OA, Kallogjeri  D, Piccirillo  JF.  Development of a new clinical severity staging system for patients with nonmetastatic papillary thyroid carcinoma.  JAMA Otolaryngol Head Neck Surg. 2017;143(12):1173-1180. doi:10.1001/jamaoto.2017.0550PubMedGoogle ScholarCrossref
29.
Feinstein  AR, Wells  CK.  A clinical-severity staging system for patients with lung cancer.  Medicine (Baltimore). 1990;69(1):1-33. doi:10.1097/00005792-199001000-00001PubMedGoogle ScholarCrossref
30.
Piccirillo  JF.  Importance of comorbidity in head and neck cancer.  Laryngoscope. 2000;110(4):593-602. doi:10.1097/00005537-200004000-00011PubMedGoogle ScholarCrossref
31.
Piccirillo  JF, Feinstein  AR.  Clinical symptoms and comorbidity: significance for the prognostic classification of cancer.  Cancer. 1996;77(5):834-842. doi:10.1002/(SICI)1097-0142(19960301)77:5<834::AID-CNCR5>3.0.CO;2-EPubMedGoogle ScholarCrossref
32.
Wang  G, Lam  KM, Deng  Z, Choi  KS.  Prediction of mortality after radical cystectomy for bladder cancer by machine learning techniques.  Comput Biol Med. 2015;63:124-132. doi:10.1016/j.compbiomed.2015.05.015PubMedGoogle ScholarCrossref
33.
Kallogjeri  D, Piccirillo  JF, Spitznagel  EL  Jr, Steyerberg  EW.  Comparison of scoring methods for ACE-27: simpler is better.  J Geriatr Oncol. 2012;3(3):238-245. doi:10.1016/j.jgo.2012.01.006PubMedGoogle ScholarCrossref
34.
Kallogjeri  D, Gaynor  SM, Piccirillo  ML, Jean  RA, Spitznagel  EL  Jr, Piccirillo  JF.  Comparison of comorbidity collection methods.  J Am Coll Surg. 2014;219(2):245-255. doi:10.1016/j.jamcollsurg.2014.01.059PubMedGoogle ScholarCrossref
35.
Yu  MK, Ma  J, Fisher  J, Kreisberg  JF, Raphael  BJ, Ideker  T.  Visible machine learning for biomedicine.  Cell. 2018;173(7):1562-1565. doi:10.1016/j.cell.2018.05.056PubMedGoogle ScholarCrossref
36.
Bur  AM, Holcomb  A, Goodwin  S,  et al.  Machine learning to predict occult nodal metastasis in early oral squamous cell carcinoma.  Oral Oncol. 2019;92:20-25. doi:10.1016/j.oraloncology.2019.03.011Google ScholarCrossref
×