The flow diagram denotes the number of participants for each data set prior to the removal of missing covariate data. The data sets were censored for both deaths occurring before the time points and missing data. Ineligible participants included those who were enrolled early and later found not to be pregnant and those who were residing outside the study clusters.
502 648/578 633 deliveries had outcome data, 487 642/502 648 deliveries had complete predictor data for prenatal scenario, 487 537/502 648 deliveries had complete predictor data for predelivery scenario, 469 516/487 326 neonates alive on day 2 had complete predictor data for delivery/day 1 scenario, and 468 356/485 966 neonates alive on day 3 had complete predictor data for postdelivery/day 2 scenario.
EN indicates logistic elastic net; GBE, gradient boosted ensemble; NN, neural network; RF, random forest; and SVM, support vector machine with radial basis function kernel.
eTable 1. Subset Sample Sizes
eTable 2. Top Predictors by Scenario (Fresh Stillbirth Outcome)
eTable 3. Risk Score Model Coefficients and Calculation (Neonatal Mortality Outcome)
eTable 4. Mortality Probability Functions for the Total Risk Score (Neonatal Mortality Outcome)
eFigure 1. Mean (95%CI) for Validation AUC by Scenario for Outcomes of Fresh Stillbirth
eFigure 2. Probability of Mortality as a Function of Birthweight, Post-delivery/Day-2 Scenario
Shukla VV, Eggleston B, Ambalavanan N, et al. Predictive Modeling for Perinatal Mortality in Resource-Limited Settings. JAMA Netw Open. 2020;3(11):e2026750. doi:10.1001/jamanetworkopen.2020.26750
Can prenatal and postdelivery variables accurately predict the risk of stillbirth and neonatal deaths in resource-limited settings of low- and middle-income countries?
Using advanced machine learning–based modeling techniques on a large multicountry prospective maternal and neonatal database, this cohort study found that the prediction accuracy of models for risk of stillbirth and neonatal death using variables before delivery is low, but the prediction accuracy for neonatal death can be improved by including postdelivery variables. Birth weight was the most important predictor of neonatal mortality.
Models that include postdelivery variables have good prediction accuracy for neonatal deaths.
The overwhelming majority of fetal and neonatal deaths occur in low- and middle-income countries. Fetal and neonatal risk assessment tools may be useful to predict the risk of death.
To develop risk prediction models for intrapartum stillbirth and neonatal death.
Design, Setting, and Participants
This cohort study used data from the Eunice Kennedy Shriver National Institute of Child Health and Human Development Global Network for Women’s and Children’s Health Research population-based vital registry, including clinical sites in South Asia (India and Pakistan), Africa (Democratic Republic of Congo, Zambia, and Kenya), and Latin America (Guatemala). A total of 502 648 pregnancies were prospectively enrolled in the registry.
Risk factors were added sequentially into the data set in 4 scenarios: (1) prenatal, (2) predelivery, (3) delivery and day 1, and (4) postdelivery through day 2.
Main Outcomes and Measures
Data sets were randomly divided into 10 groups of 3 analysis data sets including training (60%), test (20%), and validation (20%). Conventional and advanced machine learning modeling techniques were applied to assess predictive abilities using area under the curve (AUC) for intrapartum stillbirth and neonatal mortality.
All prenatal and predelivery models had low predictive accuracy for both intrapartum stillbirth and neonatal mortality, with AUC values of 0.71 or less. Five of 6 models for neonatal mortality based on delivery/day 1 and postdelivery/day 2 had increased predictive accuracy, with AUC values greater than 0.80. Birth weight was the most important predictor for neonatal death in both postdelivery scenarios, with independent predictive ability (AUC values of 0.78 and 0.76, respectively). The addition of 4 other top predictors increased AUC to 0.83 and 0.87 for the postdelivery scenarios, respectively.
Conclusions and Relevance
Models based on prenatal or predelivery data had low predictive accuracy for intrapartum stillbirths and neonatal mortality (AUC values of 0.71 or less). Models that incorporated delivery data had good predictive accuracy for risk of neonatal mortality. Birth weight was the most important predictor for neonatal mortality.
The neonatal period is the period in life with the highest risk for mortality.1 Annually, 2.5 million neonatal deaths and 2.6 million stillbirths occur globally, of which 1.3 million are intrapartum stillbirths.2 It is estimated that approximately 98% of all neonatal and perinatal deaths occur in low- and middle-income countries.3-5 However, almost all the published literature on identifying predictors of fetal and neonatal mortality and on risk scoring tools is based on data from high-income countries. Data from low- and middle-income countries are limited to small sample size studies that lack validation with an independent sample. Additionally, machine learning prediction models may perform better than conventional models when applied to large data sets given their ability to delineate complex relationships and identify novel interactions between variables.6-11 Although machine learning–based prediction models are expected to perform better with large data sets, this hypothesis has not been convincingly tested with a good quality prospectively collected population database.6-8,11
We aimed to develop a risk assessment tool for intrapartum stillbirth and neonatal mortality that would include maternal and neonatal variables from a prospective multicountry maternal and neonatal database. We compared various conventional and advanced machine learning–based, analytical modeling methods at specific time points to establish individual predictive accuracies of the models. We tested the hypothesis that intrapartum stillbirth and neonatal mortality risk prediction models that include antenatal and delivery variables provide a high accuracy. Additionally, we also tested whether advanced machine learning–based models have higher predictive accuracy than a conventional logistic regression model.
The study was conducted in the Eunice Kennedy Shriver National Institute of Child Health and Human Development Global Network for Women’s and Children’s Health Research, which includes clinical sites in resource-limited settings in South Asia (India and Pakistan), Africa (Democratic Republic of Congo, Zambia, and Kenya), and Latin America (Guatemala). A population-based Global Network Maternal Newborn Health Registry (GN-MNHR) vital registry was established in 2009.12 Pregnant women and their 502 648 offspring participating in the GN-MNHR database from January 1, 2010, to December 31, 2018, were included. The description of the sites and the cluster settings has been reported.13 The GN-MNHR database includes data starting with the initial prenatal visit and up to 42 days after delivery of study participants. The GN-MNHR data processes include close quality monitoring and quality improvement interventions both at local and central levels to ensure data completeness and quality.12 The GN-MNHR definitions were used to define variables and outcomes as reported previously.13 The GN-MNHR database has been reviewed and approved by all sites’ ethics review committees and the institutional review boards at each US partner university and at the data coordinating center (RTI International). All women provided written informed consent for participation in the GN-MNHR database, including data collection and the follow-up visits. The study is reported as per the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) reporting guideline for multivariable prediction modeling reporting.14 The NICHD Global Network Maternal Newborn Health Registry is registered with ClinicalTrials.gov (NCT01073475).
The outcome variables were intrapartum stillbirth and neonatal mortality. Intrapartum stillbirth was defined as nonmacerated stillbirth presumably occurring during labor. Neonatal mortality was defined as death up to 28 days after birth. Potential risk factors (variables) for the outcome were selected from the database based on the existing literature and relevancy to the outcomes.
The risk factors were added sequentially into 4 scenario data sets: (1) prenatal (variables until first prenatal care visit), (2) predelivery (variables until just before delivery), (3) delivery and day 1 (delivery/day 1), and (4) postdelivery through day 2 (postdelivery/day 2). Day 0 was defined as the calendar day of birth. Day 1 and day 2 were defined as the subsequent calendar days. We evaluated mortality outcomes using potential risk factor sets with sequentially additional variables to determine whether additional potential risk factors improved outcome predictive accuracy.15 The first 2 scenario models (prenatal and predelivery) evaluated the outcome of intrapartum stillbirth and neonatal mortality. The third scenario model (delivery/day 1) evaluated the outcome of neonatal mortality on days 2 through 27. The fourth scenario model (postdelivery/day 2) evaluated the outcome of neonatal mortality on days 3 through 27. The delivery and postdelivery data sets were censored for deaths occurring prior to grouping time points and missing data so that only surviving neonates with complete data were included.
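The censoring rule for the delivery and postdelivery data sets can be illustrated with a minimal sketch. The record layout (a `death_day` field that is `None` for survivors) and the function name `scenario_cohort` are hypothetical and are not taken from the GN-MNHR database; the sketch assumes all records are live births and only shows how neonates who died before the grouping time point, or who have missing predictor data, would be excluded.

```python
def scenario_cohort(records, predictors, alive_from_day):
    """Keep neonates alive at `alive_from_day` who have complete predictor data.

    Each record is a dict; `death_day` (hypothetical field) is the calendar day
    of death counted from day 0 (day of birth), or None if the neonate survived.
    The outcome is death from `alive_from_day` through day 27.
    """
    cohort = []
    for rec in records:
        death_day = rec.get("death_day")
        # Censor deaths that occurred before the grouping time point.
        died_before = death_day is not None and death_day < alive_from_day
        # Censor records with any missing predictor value.
        complete = all(rec.get(name) is not None for name in predictors)
        if died_before or not complete:
            continue
        outcome = int(death_day is not None and alive_from_day <= death_day <= 27)
        cohort.append({**rec, "outcome": outcome})
    return cohort
```

Under these assumptions, the delivery/day 1 scenario would correspond to `alive_from_day=2` and the postdelivery/day 2 scenario to `alive_from_day=3`.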
To build and validate predictive models, split sampling was used to set aside sections of the data for uncertainty estimation and model validation of predictive accuracy. The models considered were logistic regression and 5 machine learning models (SVM [support vector machine with radial basis function kernel], EN [logistic elastic net], NN [neural network], GBE [gradient boosted ensemble], and RF [random forest]). Data management was completed using SAS 9.4 software (SAS Institute Inc), and the model building was completed using the scikit-learn Python module. Graphics were completed using R 4.0.2 (R Project for Statistical Computing). All models except logistic regression were tuned using 10-fold cross-validation on training data, and then each tuned model was applied to the test data for a predictive accuracy assessment. For assessment of the consistency of accuracy, the tuning was repeated on training plus test data, and tuned models were applied to the validation data. The predictive accuracy was assessed using the area under the curve (AUC) of the receiver operating characteristic (ROC) curves. Because the result from any accuracy assessment using randomly split data is random, the entire analysis was repeated within each of 10 mutually exclusive data subsets for each scenario. This enabled us to have 10 assessments of accuracy for each model within each scenario, allowing an assessment of the uncertainty in the estimated accuracy for all models. Paired t tests were used on these 10 estimates of accuracy to compare models in order to descriptively assess whether or not the models within a scenario were discernably different in light of the uncertainty. The process of building and validating the best predictive machine learning model and using the results to build a modified logistic regression model for mortality risk scoring is described in the eAppendix in the Supplement.
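The split-sampling scheme and the accuracy metric described above can be sketched as follows. This is an illustration written with the Python standard library only and is not the study's code, which used scikit-learn; the function names, the default seed, and the interleaved assignment of records to subsets are assumptions. The first function partitions record indices into 10 mutually exclusive subsets and divides each into training (60%), test (20%), and validation (20%) parts. The second computes the AUC as the probability that a randomly chosen death receives a higher predicted risk than a randomly chosen survivor.

```python
import random

def split_subsets(n_records, n_subsets=10, fractions=(0.6, 0.2, 0.2), seed=0):
    """Partition indices into mutually exclusive subsets, each split three ways."""
    rng = random.Random(seed)
    indices = list(range(n_records))
    rng.shuffle(indices)
    splits = []
    for i in range(n_subsets):
        subset = indices[i::n_subsets]  # every n_subsets-th shuffled index
        n_train = round(fractions[0] * len(subset))
        n_test = round(fractions[1] * len(subset))
        splits.append({
            "train": subset[:n_train],
            "test": subset[n_train:n_train + n_test],
            "validation": subset[n_train + n_test:],
        })
    return splits

def roc_auc(labels, scores):
    """Area under the ROC curve by pairwise comparison; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

The pairwise AUC is quadratic in the number of records and is meant only to make the definition concrete; for data sets of the size reported here, a rank-based routine such as scikit-learn's `roc_auc_score` would be the practical choice.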
After the removal of missing data and deaths before grouping time points, the prenatal data set contained 487 642 neonates, the predelivery data set contained 487 537 neonates, the delivery/day 1 data set contained 469 516 neonates, and the postdelivery/day 2 data set contained 468 356 neonates (Figure 1). The sex distribution of the neonates was 51.6% male and 48.4% female in the prenatal data set. Baseline maternal and neonatal variables vary slightly for each of the 4 data sets owing to censoring (Table 1). The sample sizes for each subset before splitting into training, test, and validation data sets also vary slightly, which reflects variation owing to missing data (eTable 1 in the Supplement).
Models using either only prenatal or prenatal and predelivery variables had predictive accuracy for intrapartum stillbirth and neonatal mortality of AUC values 0.71 or less (Figure 2). The analysis of models for intrapartum stillbirth showed that all prenatal models had AUC values of 0.63 or less and all predelivery models had AUC values of 0.72 or less (eFigure 1 in the Supplement). Cluster perinatal mortality was the most important predictor of intrapartum stillbirth in the prenatal data set (AUC, 0.60), and antepartum hemorrhage was the most important predictor in the predelivery data set (AUC, 0.56) (eTable 2 in the Supplement). Other important predictors of intrapartum stillbirth in the prenatal data set were gestational age at enrollment, maternal age, birth order, and parity. Other important predictors of intrapartum stillbirth in the predelivery data set were cluster perinatal mortality; gestational age at enrollment; hypertension, severe pre-eclampsia, or eclampsia; and maternal age.
The predictive models based on the data sets that included delivery/day 1 and postdelivery/day 2 variables had good predictive accuracy for neonatal mortality, with 5 of the 6 models having AUCs above 0.80. Birth weight was the most important predictor in both the delivery/day 1 and postdelivery/day 2 scenarios, with independent predictive ability of AUC 0.78 and 0.76, respectively. The probability of mortality increased with decreasing birth weight in both the delivery/day 1 and postdelivery/day 2 scenarios (Figure 3; eFigure 2 in the Supplement). Bag and mask resuscitation, gestational age, cluster perinatal mortality rate, and maternal age were the other top predictors for the delivery/day 1 scenario. Conditions requiring hospitalization, antibiotics, gestational age, and bag and mask resuscitation were the other top predictors for the postdelivery/day 2 scenario. The addition of these other top predictors resulted in increases in the AUCs (0.83 and 0.87 for the delivery/day 1 and postdelivery/day 2 scenarios, respectively) relative to birth weight alone.
For models assessing the outcomes of stillbirth and neonatal mortality, the pairwise paired t tests showed that the differences between AUC values were not statistically significant, as illustrated by the considerable overlap of confidence intervals (Figure 2). However, gradient boosted ensemble and random forest models were consistently among the best-performing models. Even though the logistic regression model was not the best-performing model in any scenario, the AUC of the logistic regression model was not significantly different from that of the top-performing models.
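The model comparison rests on a paired t test over the 10 per-subset AUC estimates of two models. A minimal sketch of the test statistic, using only the Python standard library, is given below; the function name is hypothetical, and converting the statistic to a P value requires the t distribution with the returned degrees of freedom (for example through `scipy.stats.ttest_rel`, which is assumed here as an alternative and is not confirmed as the routine the authors used).

```python
import math
import statistics

def paired_t_statistic(auc_model_a, auc_model_b):
    """Return (t, degrees of freedom) for paired AUC estimates of two models."""
    if len(auc_model_a) != len(auc_model_b) or len(auc_model_a) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    # Work on the per-subset differences, since both models saw the same subsets.
    diffs = [a - b for a, b in zip(auc_model_a, auc_model_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1
```

With 10 subsets per scenario, the statistic would be referred to a t distribution with 9 degrees of freedom.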
The sequential addition of variables was done to identify the individual relative contribution of each variable to AUC for combined outcomes of intrapartum stillbirth and neonatal mortality based on validation data set (Table 2) and to develop a risk scoring system for stillbirth and neonatal mortality. Table 2 lists the changes in AUC as predictors are added to the model using the ordering of predictor importance (see eAppendix in the Supplement for more details about predictor importance). A modified logistic regression model based on findings from the best machine learning model as well as results from a variable selection study using logistic regression and the least absolute shrinkage and selection operator (LASSO) method for identification of potential interactions was fit to create a risk scoring system for the delivery/day 1 and postdelivery/day 2 data (eTable 3 in the Supplement). Using risk scores calculated from this risk scoring system, a logistic regression model was fit to the total risk scores in order to derive a formula for predicting the probability of mortality given the calculated risk score. The logistic model for neonatal mortality risk scoring had an AUC value on validation data equal to 0.809 (97% of the best model) for the delivery/day 1 scenario and 0.845 (97% of the best model) for the postdelivery/day 2 scenario (eTable 4 in the Supplement).
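The last step, a logistic regression fit to the total risk score, yields a formula that maps a score to a predicted probability of mortality. The sketch below shows the general form of such a formula. The function name and the `intercept` and `slope` parameters are placeholders; the fitted values are reported in eTable 4 in the Supplement and are not reproduced or assumed here.

```python
import math

def mortality_probability(total_risk_score, intercept, slope):
    """Logistic function applied to a linear function of the total risk score.

    `intercept` and `slope` are placeholders for fitted coefficients.
    """
    linear = intercept + slope * total_risk_score
    # Branch on the sign so math.exp never receives a large positive argument.
    if linear >= 0:
        return 1.0 / (1.0 + math.exp(-linear))
    e = math.exp(linear)
    return e / (1.0 + e)
```

For a positive slope, the predicted probability rises monotonically with the total risk score and stays strictly between 0 and 1 for moderate scores, which is what makes the score usable for ranking neonates by risk.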
This cohort study found that predictive models using only prenatal or prenatal and predelivery variables had predictive accuracy for intrapartum stillbirth and neonatal mortality of AUC values of 0.71 or less. We identified that a better neonatal mortality risk prediction could be made when variables obtained immediately postdelivery and up to 2 days after birth were included in the models, with AUCs increasing up to 0.87. Birth weight was identified as the most important predictor for neonatal mortality among all variables considered. The contribution of other predictors to the AUC increase was relatively minor.
Many studies have analyzed variables associated with increased risk for stillbirths or neonatal mortality, but only 1 study16 used a relatively large group of neonates to develop predictive models. As found in the current study, low predictive accuracy (AUC = 0.58, 95% CI = 0.56–0.59) using prenatal variables was reported in a study based on 10 sites from 3 countries in South Asia (N = 49 632).16 Similar to the findings of the current study, the predictive accuracy of the model improved when postdelivery variables were included (AUC = 0.83, 95% CI = 0.79–0.86), but only logistic regression modeling was applied to develop the model, and the results were not validated with an independent sample. The current results are consistent with the data from a large prospective neonatal database of participants living in a high-income country, which showed that inclusion of delivery variables (especially birth weight) results in better accuracy for predicting neonatal mortality than models with only prenatal variables.15
Some of the important predictors in the current study have been reported to be associated with an increased risk for neonatal mortality in simple association analyses. In a pooled analysis of data from low- and middle-income countries, small for gestational age and prematurity were found to be associated with increased risk of neonatal mortality in association (bivariate) analysis, although birth weight was not specifically analyzed.17 Inferences from pooled analyses can raise questions of quality, accuracy, and generalizability.18 Additionally, vital registries from low- and middle-income countries are of questionable accuracy.19 The novel finding of the current study that birth weight is the most important variable for predicting the risk of neonatal mortality provides the strongest evidence based on a high-quality large prospective population-based database from resource-limited settings. To our knowledge, the current study is the largest study that uses a quantitative approach for risk prediction of stillbirth and neonatal mortality of data from low- and middle-income countries. This study identifies the contribution of birth weight as a continuous variable to arrive at its independent predictive ability for risk of neonatal mortality. In studies of association with increased risk of all-cause stillbirth or neonatal mortality, many variables have been identified, but these analyses were limited to descriptive analyses. Intrapartum stillbirth and early neonatal mortality have overlapping causes (in contrast to early stillbirth20), so we have included only intrapartum stillbirth in the present study. 
Adjusted analysis based on an earlier cohort of the GN-MNHR database indicated associations between maternal age younger than 20 or older than 35 years, lower maternal education, parity of 0 or greater than 3, and no prenatal care with all-cause stillbirth.21,22 Using other databases, other factors associated with all-cause stillbirth were poverty, parity of 5 or more, prematurity, low birth weight, and previous stillbirth.20 Factors associated with neonatal mortality also included maternal age, education, parity, multiple gestations, birth order, suspected maternal sepsis, antepartum hemorrhage, eclampsia, and obstructed labor.23-25 Also, in a large pooled analysis of cross-sectional data from 57 low- and middle-income countries (N = 464 728), antenatal care was associated with a decrease in neonatal mortality in univariate analyses.26
In large modeling studies from high-income countries of extremely preterm15,27 or less than 2000 g neonates28 admitted to neonatal intensive care units, birth weight was among the most important common factors associated with hospital mortality. With the model derived from a high-income country, birth weight was also found to be a factor associated with hospital mortality when it was applied to a sample of 550 neonates from a single center in The Gambia.28 In another study from a developed country database of extreme preterm neonates, birth weight and gestational age were found to be equally associated with 2-year mortality.29 Gestational age was also among the top factors associated with mortality in the earlier developed country database studies.15,27 However, in the current study, birth weight was found to be a better predictor of neonatal mortality than gestational age. This could be related to estimation errors of gestational age based on the last menstrual period and limited availability of first-trimester ultrasonography-based dating confirmation.30 Small for gestational age (not specifically birth weight) has been associated with higher risk of perinatal and neonatal mortality and morbidity in data from high-income countries in bivariate analysis.31 Similar to the current study, neonatal sex, resuscitation, and antenatal corticosteroids have been factors associated with neonatal mortality in data from high-income countries.32-34
This study used an established database with data that are population-based and prospectively collected with multiple quality assurance checks. The study was hypothesis-driven with only prespecified analyses performed. This large sample size study was adequately powered for prediction of the risk for intrapartum stillbirth and neonatal mortality in the represented resource-limited settings in low- and middle-income countries. Because of the large sample size, we were able to evaluate rigorously the machine learning predictive models. Predictive accuracy was assessed on test and validation data sets to check for consistency in predictive accuracy estimates, and the model-building process was completed 10 times to quantify uncertainty in predicted model accuracy. Although generalizability of the results could be questioned, the results are likely to be pertinent to many communities in resource-limited regions similar to those of the study settings. There is a possibility of potential confounders like health status, availability of care, interventions, and health policy that were not captured. Nonetheless, as the study results are based on a large number of participants over 9 years, the potential impact of confounders on the study result should be low. Additionally, the database did not include high-definition data, including stratification of individual risk factors as per illness severity, extensive laboratory test results, or details of treatments received and clinical response. As the risk score is intended to be useful for all health care professionals, adding more complexity to the scoring method by including variables that need a higher level of training and resources might have reduced the ease of application and usability of the score. However, incorporating additional variables could have improved the predictive accuracy, especially for the prenatal models.
The gestational age variable would be prone to estimation errors depending on the mother’s accounting of the last menstrual period and the availability of more accurate assessments such as first-trimester ultrasonography. Intrapartum and antepartum stillbirth differentiation can be associated with identification errors, but training of the health care professionals and several quality checks were made to minimize this error.
In the current study, prediction of the risk of intrapartum stillbirth alone or in combination with neonatal mortality based on prenatal or predelivery data had predictive accuracy of AUC values of only 0.72 or less. The best risk prediction for neonatal death was only achieved after including delivery and early neonatal variables, which can be used to identify neonates at the highest risk for mortality who may need specialized care. Birth weight was by far the most important predictor for neonatal mortality, while the contribution of other variables was relatively minor. Mortality risk–based triage and referral could be tested as a strategy to reduce the burden of neonatal deaths in resource-limited settings. Given these findings, prenatal and predelivery data are not sufficient to develop strategies to identify those who are at a high risk of perinatal mortality and require advanced care at birth and referral. Birth weight could be prioritized in the identification of neonates at risk for dying. Predelivery estimation of birth weight could be evaluated as a strategy for predelivery triage and referral.
Accepted for Publication: September 27, 2020.
Published: November 18, 2020. doi:10.1001/jamanetworkopen.2020.26750
Open Access: This is an open access article distributed under the terms of the CC-BY License. © 2020 Shukla VV et al. JAMA Network Open.
Corresponding Author: Waldemar A. Carlo, MD, Division of Neonatology, University of Alabama at Birmingham, 1700 6th Ave S, Ste 9380 WIC, Birmingham, AL 35249 (firstname.lastname@example.org).
Author Contributions: Mr Eggleston and Dr McClure had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: Shukla, Ambalavanan, McClure, Mwenechanya, Chomba, Bose, Bauserman, Goudar, Derman, Garces, Saleem, Goldenberg, Patel, Esamai, Carlo.
Acquisition, analysis, or interpretation of data: Shukla, Eggleston, Ambalavanan, McClure, Tshefu, Goudar, Derman, Garces, Krebs, Goldenberg, Patel, Hibberd, Bucher, Liechty, Koso-Thomas.
Drafting of the manuscript: Shukla, Eggleston, Mwenechanya, Esamai.
Critical revision of the manuscript for important intellectual content: Shukla, Eggleston, Ambalavanan, McClure, Chomba, Bose, Bauserman, Tshefu, Goudar, Derman, Garces, Krebs, Saleem, Goldenberg, Patel, Hibberd, Esamai, Bucher, Liechty, Koso-Thomas, Carlo.
Statistical analysis: Shukla, Eggleston, McClure, Goldenberg, Carlo.
Obtained funding: McClure, Derman, Krebs, Goldenberg, Patel, Hibberd, Liechty.
Administrative, technical, or material support: McClure, Mwenechanya, Bauserman, Tshefu, Goudar, Derman, Krebs, Saleem, Goldenberg, Patel, Esamai, Bucher, Liechty, Koso-Thomas, Carlo.
Supervision: McClure, Chomba, Tshefu, Goudar, Garces, Krebs, Saleem, Goldenberg, Patel, Hibberd, Esamai, Liechty.
Conflict of Interest Disclosures: Dr Eggleston reported grants from NICHD during the conduct of the study. Dr Hibberd reported grants from NIH during the conduct of the study. Dr Carlo reported personal fees from Mednax and serving on the company’s board of directors outside the submitted work. No other disclosures were reported.
Funding/Support: The Global Network is funded through grants from the Eunice Kennedy Shriver National Institute of Child Health and Human Development (U01 HD040477; U10 HD076465; U10 HD078437; U10 HD076474; U10 HD076457; U10 HD078438; U10 HD078439; U10 HD076461; U01 HD040636).
Role of the Funder/Sponsor: A physician (Dr Koso-Thomas) from the funder (NICHD) and member of the research team had input into the design and conduct of the study, review and approval of the manuscript. Otherwise, the funder had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Disclaimer: The authors’ views do not necessarily represent those of the NICHD. The article was approved for publication by NICHD through its clearance mechanism.
Additional Information: Data from the study are available at the NICHD data repository (NDASH): https://dash.nichd.nih.gov/.