Paper ID: 2208.06648

Imputation Strategies Under Clinical Presence: Impact on Algorithmic Fairness

Vincent Jeanselme, Maria De-Arteaga, Zhe Zhang, Jessica Barrett, Brian Tom

Machine learning risks reinforcing biases present in data, and, as we argue in this work, in what is absent from data. In healthcare, biases have marked medical history, leading to unequal care affecting marginalised groups. Patterns in missing data often reflect these group discrepancies, but the algorithmic fairness implications of group-specific missingness are not well understood. Despite its potential impact, imputation is often an overlooked preprocessing step, with attention placed on the reduction of reconstruction error and overall performance, ignoring how imputation can affect groups differently. Our work studies how imputation choices affect reconstruction errors across groups and algorithmic fairness properties of downstream predictions.

Submitted: Aug 13, 2022