This study evaluates the performance of models that predict infant mortality from electronic health records collected at the point of care during delivery in a large women’s hospital. The dataset comprised 507 infant deaths (cases) and 75,842 controls. We developed six machine learning models: naïve Bayes (NB), K2 Bayesian network (K2), gradient boosting (GB), random forest (RF), ridge regression (RR), and elastic net (EN). The average AUC ranged from 0.88 to 0.94. GB performed best (AUC 0.94), using 176 variables. We believe the models can be of value to the country by accurately identifying infants at high mortality risk, so that resources can be directed to this high-risk population.
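For readers less familiar with the metric, the sketch below shows one common way to compute AUC: as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control, with ties counted as one half. It is an illustration only, not the evaluation code used in the study. The function name `auc` and the toy labels and scores are hypothetical. The only figures taken from the abstract are the case and control counts, used to show how rare cases are in the dataset.

```python
def auc(labels, scores):
    """AUC as the fraction of (case, control) pairs in which the case
    gets the higher score; tied pairs count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration (1 = case, 0 = control).
labels = [1, 1, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.2, 0.1]
toy_auc = auc(labels, scores)

# Class balance implied by the counts reported in the abstract.
cases, controls = 507, 75842
prevalence = cases / (cases + controls)

print(f"toy AUC = {toy_auc:.3f}")
print(f"case prevalence = {prevalence:.2%}")
```

With cases making up well under 1% of records, plain accuracy would be uninformative (predicting "no death" for everyone would score above 99%), which is one reason a ranking metric such as AUC is a natural choice here. The pairwise loop is quadratic and suited only to small examples; a rank-based implementation would be needed at the scale of the study's dataset.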

Describe the new knowledge and additional skills the participant will gain after attending your presentation: The study covers knowledge and methods in 1) large-population predictive modeling, 2) data linkage, 3) machine learning algorithms, 4) feature selection strategy, and 5) identifying which models worked best for the infant mortality data.


Rich Tsui (Presenter)
Children's Hospital of Philadelphia
