Counterfactual Explanation
Counterfactual explanations (CFEs) aim to enhance the interpretability of machine learning models by identifying the smallest change to an input that would alter the model's prediction. Current research focuses on developing robust and efficient CFE generation methods across model types, including deep learning architectures such as variational autoencoders and diffusion models, and across data modalities such as images, time series, and text. This work matters because CFEs improve model transparency and trustworthiness, helping users understand model behavior and supporting the responsible deployment of AI in high-stakes applications like healthcare and finance.
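The core idea above can be sketched with a simple gradient-based search in the style of the Wachter et al. objective: minimize the gap between the model's output and a desired prediction, plus a penalty on the distance from the original input. This is a minimal illustration, not any specific paper's method; the logistic model, its weights, and all hyperparameters below are hypothetical.

```python
import numpy as np

# Hypothetical trained logistic classifier (weights chosen for illustration).
w = np.array([1.5, -2.0])
b = 0.3

def predict(x):
    """Probability of the positive class under the logistic model."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def counterfactual(x, target=0.8, lam=0.05, lr=0.05, steps=2000):
    """Search for x' near x whose prediction moves toward `target`.

    Minimizes (f(x') - target)^2 + lam * ||x' - x||^2 by gradient descent.
    """
    xp = x.copy()
    for _ in range(steps):
        p = predict(xp)
        # Gradient of the squared prediction gap, via the logistic derivative.
        grad_pred = 2.0 * (p - target) * p * (1.0 - p) * w
        # Gradient of the proximity penalty, keeping x' close to x.
        grad_dist = 2.0 * lam * (xp - x)
        xp -= lr * (grad_pred + grad_dist)
    return xp

x = np.array([-1.0, 1.0])    # original input, scored well below 0.5
x_cf = counterfactual(x)     # nearby input scored above 0.5
```

The trade-off parameter `lam` controls the tension the paragraph describes: larger values keep the counterfactual closer to the original input, while smaller values let the prediction move further toward the target class.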
Papers
Do Users Benefit From Interpretable Vision? A User Study, Baseline, And Dataset
Leon Sixt, Martin Schuessler, Oana-Iuliana Popescu, Philipp Weiß, Tim Landgraf
Integrating Prior Knowledge in Post-hoc Explanations
Adulam Jeyasothy, Thibault Laugel, Marie-Jeanne Lesot, Christophe Marsala, Marcin Detyniecki