Comprehensive Summary
This retrospective, single-country study asked whether modern machine-learning methods could accurately identify infants at risk of low birth weight (LBW) in Bangladesh using nationally representative survey data. The researchers applied several supervised learning models (logistic regression, artificial neural networks, decision trees, random forests, LightGBM, and XGBoost) to classify births as LBW or normal birth weight. The analysis used n = 3,192 mother–child pairs from the 2022 Bangladesh Demographic and Health Survey (BDHS), drawn from nationwide household interviews completed between 2017 and 2022. Preprocessing comprised removal of implausible values, imputation of missing data, IQR-based outlier handling, and min–max scaling, followed by multi-method feature selection (Boruta, LASSO, Elastic Net, and random forest). The team then addressed class imbalance with ADASYN before splitting the dataset into stratified training and test sets.
The models were benchmarked against one another, and XGBoost emerged as the top performer, with an accuracy of 0.80, a recall of 0.80, a precision of 0.79, an F1-score of 0.77, and an AUC of 0.761 on the held-out test set.
Overall, 27.8% of infants (n = 886/3,192) met the criteria for LBW. Proportions were higher among mothers with no formal education (34.7%), households in the poorest wealth category (29.6%), rural settings (28%), and home deliveries (35.2%). LBW was especially concentrated among twin births (78.9%).
Secondary analyses combined the multi-method feature selection with SHAP-based model interpretation, which identified gestational age, administrative division, antenatal care (ANC) visit frequency, mode and place of delivery, birth spacing, and twin birth status as the leading contributors to predictions. SHAP plots also clarified directionality: shorter gestation, low maternal education, rural residence, and limited ANC increased predicted LBW risk, whereas C-section delivery and higher maternal education reduced it.
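Two of the preprocessing steps named above, IQR-based outlier handling and min–max scaling, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's code: the summary does not say whether outliers were clipped or dropped, so clipping to the Tukey fences (1.5 × IQR) is assumed here, and the function names and sample values are hypothetical.

```python
from statistics import quantiles


def iqr_clip(values, k=1.5):
    """Clip values to the Tukey fences [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: outliers are clipped rather than removed; the source
    only says "IQR-based outlier handling".
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, lower), upper) for v in values]


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical gestational ages in weeks (not study data); 12 is an extreme low value.
gestation_weeks = [36, 37, 38, 38, 39, 39, 40, 40, 41, 12]

clipped = iqr_clip(gestation_weeks)   # the extreme value is pulled up to the lower fence
scaled = min_max_scale(clipped)       # every value now lies in [0, 1]
```

A common practice is to compute the fences and the scaling bounds on the training split only and reuse them on the test split; the summary does not say which approach the study took.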
Limitations include the cross-sectional design, reliance on maternal recall for some birth weights, potential residual confounding, and the absence of external validation or demographic fairness testing. As such, the findings reflect the model’s predictive performance within the BDHS dataset and do not establish causal effects or clinical efficacy.
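The test-set metrics reported for XGBoost can be related to a confusion matrix with a short sketch. The reported accuracy and recall coincide (both 0.80), which is what support-weighted averaging across the two classes produces by construction. Whether the authors used weighted averaging is an assumption here, and the counts below are invented for illustration and are not taken from the study.

```python
def per_class_metrics(tp, fp, fn):
    """Precision, recall, and F1 for one class from its own counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def weighted_report(tn, fp, fn, tp):
    """Support-weighted metrics for a binary confusion matrix (LBW = positive)."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total

    # Positive class (LBW) uses the counts as given.
    pos = per_class_metrics(tp, fp, fn)
    # Negative class (normal weight): its hits are tn, and fp/fn swap roles.
    neg = per_class_metrics(tn, fn, fp)

    # Each class is weighted by its share of true instances (its support).
    w_pos = (tp + fn) / total
    w_neg = (tn + fp) / total
    precision, recall, f1 = (w_pos * p + w_neg * n for p, n in zip(pos, neg))

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical test-set counts (not from the study).
report = weighted_report(tn=430, fp=30, fn=98, tp=80)
```

With weighted averaging, recall always equals accuracy, and the averaged F1 can fall below both the averaged precision and the averaged recall, as it does for these hypothetical counts. That pattern matches the reported numbers, but it does not confirm how the study computed them.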
Outcomes and Implications
This study suggests that ensemble ML models, particularly XGBoost, can provide reasonably accurate (test AUC 0.761) and interpretable risk stratification for LBW within large-scale population health datasets. The integration of SHAP interpretability is especially valuable for policymakers and clinicians, as it translates complex model outputs into actionable insights on maternal, socioeconomic, and structural determinants. In practice, such models could be incorporated into national maternal health dashboards, enabling community health workers to flag high-risk pregnancies earlier, allocate ANC resources more effectively, and target interventions (nutrition programs, birth-spacing counseling, facility-based delivery referrals) toward vulnerable groups. However, clinical adoption requires prospective validation, assessment across diverse regions, and careful implementation to avoid reinforcing structural inequalities.