In Reply In our study,1 we performed an exhaustive cross-validated grid search to identify the optimal hyperparameters for the extreme gradient boosting model (XGBoost), a standard approach to hyperparameter selection that searched over the learning rate, the number of trees trained, the maximum tree depth, and the minimum loss reduction required to partition a leaf node of a tree.2 To permit comparison of areas under the receiver operating characteristic curve (AUROCs), we estimated their variance across iterative cross-validation and reported it as a 95% CI. This approach also allowed comparison of other metrics, such as precision and recall, with a consistent method for reporting confidence intervals. As reported in the study, XGBoost did not discriminate in-hospital mortality in acute myocardial infarction (AMI) better than a logistic regression model (XGBoost: AUROC, 0.89; 95% CI, 0.88-0.89; logistic regression: AUROC, 0.88; 95% CI, 0.88-0.88), despite the large sample size and the selection of optimal hyperparameters.1
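The workflow described above (a cross-validated grid search over the named hyperparameters, then iterative cross-validation to obtain an AUROC distribution and a 95% CI, with a logistic regression comparator) can be sketched as follows. This is a minimal illustration, not the authors' code: it uses synthetic data and scikit-learn's GradientBoostingClassifier as a stand-in for XGBoost (which exposes the same learning rate, tree count, and depth parameters, though not XGBoost's gamma, i.e., the minimum loss reduction for a split), and it summarizes repeated cross-validation scores with an empirical percentile interval.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, RepeatedStratifiedKFold, cross_val_score

# Synthetic, imbalanced binary-outcome data standing in for the AMI cohort
# (hypothetical; the real study used registry data).
X, y = make_classification(n_samples=1000, n_features=20, weights=[0.9],
                           random_state=0)

# Grid over analogues of the hyperparameters named in the letter:
# learning rate, number of trees, and maximum tree depth.
param_grid = {
    "learning_rate": [0.05, 0.1],
    "n_estimators": [50, 100],
    "max_depth": [2, 3],
}
search = GridSearchCV(GradientBoostingClassifier(random_state=0),
                      param_grid, scoring="roc_auc", cv=5)
search.fit(X, y)

# Iterative (repeated) cross-validation yields a distribution of AUROCs;
# a 95% CI is taken from its 2.5th and 97.5th percentiles.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=5, random_state=0)
aucs = cross_val_score(search.best_estimator_, X, y, scoring="roc_auc", cv=cv)
lo, hi = np.percentile(aucs, [2.5, 97.5])

# Same resampling scheme for the logistic regression comparator, so the
# two models' intervals are directly comparable.
logit_aucs = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                             scoring="roc_auc", cv=cv)

print(f"boosted trees: AUROC {aucs.mean():.3f} (95% CI, {lo:.3f}-{hi:.3f})")
print(f"logistic regression: AUROC {logit_aucs.mean():.3f}")
```

Because both models are scored on identical resampling splits, the same percentile machinery extends unchanged to precision, recall, or any other scorer accepted by `cross_val_score`.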
Khera R, Mortazavi BJ, Krumholz HM. Assessing Performance of Machine Learning—Reply. JAMA Cardiol. 2021;6(12):1466. doi:10.1001/jamacardio.2021.3715