Paper ID: 2412.05813

Risk factor identification and classification of malnutrition among under-five children in Bangladesh: Machine learning and statistical approach

Tasfin Mahmud, Tayab Uddin Wara, Chironjeet Das Joy

This study aims to understand the factors that resulted in under-five children's malnutrition from the Multiple Indicator Cluster (MICS-2019) nationwide surveys and classify different malnutrition stages based on the four well-established machine learning algorithms, namely - Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), and Multi-layer Perceptron (MLP) neural network. Accuracy, precision, recall, and F1 scores are obtained to evaluate the performance of each model. The statistical Pearson correlation coefficient analysis is also done to understand the significant factors related to a child's malnutrition. The eligible data sample for analysis was 21,858 among 24,686 samples from the dataset. Satisfactory and insightful results were obtained in each case and, the RF and MLP performed extraordinarily well. For RF, the accuracy was 98.55%, average precision 98.3%, recall value 95.68%, and F1 score 97.13%. For MLP, the accuracy was 98.69%, average precision 97.62%, recall 90.96%, and F1 score of 97.39%. From the Pearson co-efficient, all negative correlation results are enlisted, and the most significant impacts are found for the WAZ2 (Weight for age Z score WHO) (-0.828"), WHZ2 (Weight for height Z score WHO) (-0.706"), ZBMI (BMI Z score WHO) (-0.656"), BD3 (whether child is still being breastfed) (-0.59"), HAZ2 (Height for age Z score WHO) (-0.452"), CA1 (whether child had diarrhea in last 2 weeks) (-0.34"), Windex5 (Wealth index quantile) (-0.161"), melevel (Mother's education) (-0.132"), and CA14/CA16/CA17 (whether child had illness with fever, cough, and breathing) (-0.04) in successive order.

Submitted: Dec 8, 2024