Paper ID: 2303.00517

Analyzing Impact of Socio-Economic Factors on COVID-19 Mortality Prediction Using SHAP Value

Redoan Rahman, Jooyeong Kang, Justin F Rousseau, Ying Ding

This paper applies multiple machine learning (ML) algorithms to a dataset of de-identified COVID-19 patients provided by the COVID-19 Research Database. The dataset consists of 20,878 COVID-positive patients, of whom 9,177 died in 2020. Rather than maximizing prediction accuracy, this paper aims to understand and interpret the association between patients' socio-economic characteristics and their mortality. According to our analysis, a patient's household annual and disposable income, age, education, and employment status significantly impact a machine learning model's prediction. We also examine several individual patient records, which give us insight into how the feature values affect the prediction for each data point. This paper analyzes the global and local interpretation of machine learning models on socio-economic data of COVID-19 patients.
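
As an illustration of the difference between local and global interpretation, the following is a minimal sketch of exact Shapley value computation in plain Python. It is not the paper's pipeline: the scoring function `toy_risk`, the feature names, the baseline, and the two patient records are all hypothetical, and a single baseline record stands in for the background dataset that SHAP implementations typically average over.

```python
from itertools import combinations
from math import factorial


def toy_risk(p):
    """Hypothetical risk score (NOT a model from the paper)."""
    interaction = 0.1 if (p["unemployed"] == 1 and p["age"] > 60) else 0.0
    return 0.01 * p["age"] - 0.002 * p["income_k"] + interaction


def shapley_values(predict, x, baseline):
    """Exact Shapley values of each feature of x, relative to one baseline record."""
    names = list(x)
    n = len(names)
    phi = {}
    for i in names:
        others = [f for f in names if f != i]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                # Features in the coalition take the patient's value,
                # all remaining features take the baseline value.
                without_i = {f: (x[f] if f in subset else baseline[f]) for f in names}
                with_i = dict(without_i)
                with_i[i] = x[i]
                total += weight * (predict(with_i) - predict(without_i))
        phi[i] = total
    return phi


# Hypothetical records, for illustration only.
baseline = {"age": 50, "income_k": 60, "unemployed": 0}
patients = [
    {"age": 72, "income_k": 30, "unemployed": 1},
    {"age": 35, "income_k": 90, "unemployed": 0},
]

# Local interpretation: one attribution per feature, per patient.
local = [shapley_values(toy_risk, p, baseline) for p in patients]

# Global interpretation: mean absolute attribution per feature across patients.
global_importance = {
    f: sum(abs(phi[f]) for phi in local) / len(local) for f in baseline
}

for p, phi in zip(patients, local):
    print(p, phi)
print(global_importance)
```

For each record, the attributions sum to the difference between the record's score and the baseline's score, which is what makes them readable as per-feature contributions. Exact enumeration grows exponentially with the number of features, so it is only practical for small illustrations like this one.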

Submitted: Feb 27, 2023