High Explainability
High explainability in artificial intelligence (AI) refers to making the decision-making processes of complex models, such as large language models and deep neural networks, transparent and understandable. Current research develops both intrinsic (built into the model) and post-hoc (applied after training) explanation methods, often using attention mechanisms, feature attribution, and counterfactual examples to interpret model outputs across modalities (text, images, audio). This work is crucial for building trust in AI systems, particularly in high-stakes domains such as medicine and finance, and for ensuring fairness, accountability, and responsible AI development.
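As a concrete illustration of the post-hoc feature attribution mentioned above, the sketch below computes a vanilla gradient saliency map: the absolute gradient of a class score with respect to the input pixels. This is a minimal example assuming PyTorch; the model and input are hypothetical placeholders and do not reproduce any of the listed papers' methods.

```python
# Minimal sketch of post-hoc feature attribution via vanilla gradient saliency.
# Assumes PyTorch; the model and input below are hypothetical stand-ins,
# not taken from any of the papers listed in this section.
import torch
import torch.nn as nn

# Hypothetical small classifier standing in for a trained model.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
model.eval()

# A single 32x32 RGB input; requires_grad lets us attribute to pixels.
x = torch.rand(1, 3, 32, 32, requires_grad=True)

# Forward pass and choice of the class score to explain.
logits = model(x)
target_class = logits.argmax(dim=1).item()

# Backward pass: gradient of the target logit w.r.t. the input pixels.
logits[0, target_class].backward()

# Saliency map: absolute gradient magnitude, max over color channels.
saliency = x.grad.abs().max(dim=1).values  # shape: (1, 32, 32)
print(saliency.shape)
```

Raw gradient maps of this kind are often noisy, which is the type of issue that work such as "Saliency strikes back" (listed below) addresses by filtering out high-frequency components of white-box explanations.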
Papers
Saliency strikes back: How filtering out high frequencies improves white-box explanations
Sabine Muzellec, Thomas Fel, Victor Boutin, Léo Andéol, Rufin VanRullen, Thomas Serre
R-Cut: Enhancing Explainability in Vision Transformers with Relationship Weighted Out and Cut
Yingjie Niu, Ming Ding, Maoning Ge, Robin Karlsson, Yuxiao Zhang, Kazuya Takeda
On Logic-Based Explainability with Partially Specified Inputs
Ramón Béjar, António Morgado, Jordi Planes, Joao Marques-Silva
Requirements for Explainability and Acceptance of Artificial Intelligence in Collaborative Work
Sabine Theis, Sophie Jentzsch, Fotini Deligiannaki, Charles Berro, Arne Peter Raulf, Carmen Bruder
Explainability is NOT a Game
Joao Marques-Silva, Xuanxiang Huang
Delivering Inflated Explanations
Yacine Izza, Alexey Ignatiev, Peter Stuckey, Joao Marques-Silva