Paper ID: 2411.08934
Predicting household socioeconomic position in Mozambique using satellite and household imagery
Carles Milà, Teodimiro Matsena, Edgar Jamisse, Jovito Nunes, Quique Bassat, Paula Petrone, Elisa Sicuri, Charfudin Sacoor, Cathryn Tonne
Many studies have predicted socioeconomic position (SEP) for aggregated spatial units such as villages using satellite data, but SEP prediction at the household level and the use of other sources of imagery have not yet been explored. We assembled a dataset of 975 households in a semi-rural district in southern Mozambique, consisting of self-reported asset, expenditure, and income SEP data, as well as multimodal imagery including satellite images and a ground-based photograph survey of 11 household elements. We fine-tuned a convolutional neural network to extract feature vectors from the images, which we then used in regression analyses to model household SEP using different sets of image types. The best prediction performance was found when modeling asset-based SEP using random forest models with all image types, while the performance for expenditure- and income-based SEP was lower. Using SHAP, we observed clear differences between the images with the largest positive and negative effects, and identified the household elements most relevant to the predictions. Finally, we fitted an additional reduced model using only the identified relevant household elements, which performed only slightly worse than the models using all images. Our results show how ground-based household photographs make it possible to zoom in from area-level to individual household prediction, while explainable machine learning minimizes the data collection effort. The developed workflow could potentially be integrated into routine household surveys, where the collected household imagery could be used for other purposes, such as refined asset characterization and environmental exposure assessment.
Submitted: Nov 13, 2024
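
The comparison between a model using all image types and a reduced model using only the relevant household elements can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic random vectors rather than CNN features, the image-type names are hypothetical, and closed-form ridge regression is swapped in for the random forest models named in the abstract so that the example depends only on NumPy. It is not the authors' implementation.

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return coef, y_mean - x_mean @ coef

def cv_r2(X, y, k=5, alpha=1.0, seed=0):
    """Out-of-fold R^2 from k-fold cross-validation."""
    idx = np.random.default_rng(seed).permutation(len(y))
    pred = np.empty(len(y))
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)  # all rows not in the held-out fold
        coef, intercept = ridge_fit(X[train], y[train], alpha)
        pred[fold] = X[fold] @ coef + intercept
    return 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)

# Synthetic stand-in data: 975 households, 8 features per image type.
rng = np.random.default_rng(42)
n, d = 975, 8
image_types = ["satellite", "roof", "kitchen", "latrine"]  # hypothetical names
features = {name: rng.normal(size=(n, d)) for name in image_types}

# By construction, only "roof" and "kitchen" carry signal about the outcome.
sep = (features["roof"][:, 0] + 0.5 * features["kitchen"][:, 1]
       + rng.normal(scale=0.5, size=n))

def stack(names):
    """Concatenate the feature vectors of the selected image types."""
    return np.hstack([features[name] for name in names])

r2_all = cv_r2(stack(image_types), sep)
r2_reduced = cv_r2(stack(["roof", "kitchen"]), sep)
r2_satellite = cv_r2(stack(["satellite"]), sep)
print(f"all: {r2_all:.2f}  reduced: {r2_reduced:.2f}  satellite only: {r2_satellite:.2f}")
```

In this toy setup the reduced feature set performs about as well as the full one because the uninformative image types add only noise. In practice the relevant elements would have to be identified from the fitted model, for example with SHAP as the abstract describes, rather than known in advance.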