Comprehensive Summary
This retrospective study examined whether supervised machine learning (ML) models can predict childhood stunting and identify its major risk factors in Egypt. The researchers analyzed nationally representative Demographic and Health Survey (DHS) data from 2005, 2008, and 2014 (n=37,051 children under 59 months). After data cleaning, imputation, and class balancing, supervised classifiers, including Random Forest, Gradient Boosting, Logistic Regression, XGBoost, and k-Nearest Neighbors (k-NN), were trained to categorize children as normal, stunted, or severely stunted. Model performance was evaluated using 10-fold stratified cross-validation, with metrics including accuracy, F1 score, AUROC, and Cohen’s Kappa. Random Forest had the strongest results (accuracy=90.5%, F1=0.90, AUROC=0.97), followed closely by Gradient Boosting (accuracy=90.2%, AUROC=0.97). Logistic Regression also performed well (accuracy=88.5%, AUROC=0.94), while XGBoost showed moderate performance and k-NN the weakest. Feature importance analysis from the top-performing models highlighted key predictors of stunting, including child nutritional status, size at birth, maternal education, maternal height, household wealth index, and rural residence. No external validation was performed, and fairness analyses by demographic subgroup (e.g., gender, region) were not systematically reported. Because the most recent DHS wave analyzed dates from 2014, temporal generalizability to current Egyptian children is uncertain.
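To make the reported metrics concrete, the sketch below computes accuracy, macro-averaged F1, and Cohen's Kappa for a three-class problem in plain Python. It is illustrative only and not the study's code: the label names, the toy predictions, and the choice of macro averaging for F1 are all assumptions, since the summary does not state how F1 was averaged across classes.

```python
from collections import Counter

# Hypothetical label names for the three outcome classes.
LABELS = ["normal", "stunted", "severely_stunted"]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cohens_kappa(y_true, y_pred):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    classes = set(true_counts) | set(pred_counts)
    # Expected agreement if labels and predictions were independent.
    p_e = sum(true_counts[c] * pred_counts[c] for c in classes) / (n * n)
    if p_e == 1.0:
        return float("nan")  # undefined when only one class appears
    return (p_o - p_e) / (1 - p_e)


# Toy data, invented for illustration (not from the DHS dataset).
y_true = ["normal"] * 4 + ["stunted"] * 3 + ["severely_stunted"] * 3
y_pred = [
    "normal", "normal", "normal", "stunted",
    "stunted", "stunted", "normal",
    "severely_stunted", "severely_stunted", "stunted",
]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred, LABELS):.3f}")
print(f"kappa    = {cohens_kappa(y_true, y_pred):.3f}")
```

Kappa is worth reporting alongside accuracy because it discounts the agreement expected by chance from the class frequencies alone, which matters when classes have been rebalanced or are unevenly distributed.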
Outcomes and Implications
These findings suggest that ML models such as Random Forest and Gradient Boosting hold promise as tools for refining early risk assessment of childhood stunting in low- and middle-income settings. By using routinely collected maternal and child health indicators, these models could be applied in surveillance or clinical contexts to identify vulnerable children earlier, guide targeted nutritional or social interventions, and support resource allocation. Although the models demonstrate strong predictive performance on retrospective data, their direct clinical utility remains unproven. External validation in more recent cohorts, longitudinal testing, and careful evaluation of fairness across populations will be essential before such models can be integrated into healthcare systems. This work is an important proof of concept but should be considered a preliminary step toward, rather than a replacement for, clinically validated approaches to child growth monitoring and intervention.