Paper ID: 2411.08048

Equitable Length of Stay Prediction for Patients with Learning Disabilities and Multiple Long-term Conditions Using Machine Learning

Emeka Abakasanga, Rania Kousovista, Georgina Cosma, Ashley Akbari, Francesco Zaccardi, Navjot Kaur, Danielle Fitt, Gyuchan Thomas Jun, Reza Kiani, Satheesh Gangadharan

People with learning disabilities have a higher mortality rate and premature deaths compared to the general public, as reported in published research in the UK and other countries. This study analyses hospitalisations of 9,618 patients identified with learning disabilities and long-term conditions for the population of Wales using electronic health record (EHR) data sources from the SAIL Databank. We describe the demographic characteristics, prevalence of long-term conditions, medication history, hospital visits, and lifestyle history for our study cohort, and apply machine learning models to predict the length of hospital stays for this cohort. The random forest (RF) model achieved an Area Under the Curve (AUC) of 0.759 (males) and 0.756 (females), a false negative rate of 0.224 (males) and 0.229 (females), and a balanced accuracy of 0.690 (males) and 0.689 (females). After examining model performance across ethnic groups, two bias mitigation algorithms (threshold optimization and the reductions algorithm using an exponentiated gradient) were applied to minimise performance discrepancies. The threshold optimizer algorithm outperformed the reductions algorithm, achieving lower ranges in false positive rate and balanced accuracy for the male cohort across the ethnic groups. This study demonstrates the potential of applying machine learning models with effective bias mitigation approaches on EHR data sources to enable equitable prediction of hospital stays by addressing data imbalances across groups.

Submitted: Nov 3, 2024