High Explainability
High explainability in artificial intelligence (AI) aims to make the decision-making processes of complex models, such as large language models and deep neural networks, more transparent and understandable. Current research focuses on developing both intrinsic (built-in) and post-hoc (added after training) explainability methods, often employing techniques like attention mechanisms, feature attribution, and counterfactual examples to interpret model outputs across various modalities (text, images, audio). This pursuit is crucial for building trust in AI systems, particularly in high-stakes domains like medicine and finance, and for ensuring fairness, accountability, and responsible AI development.
Papers
Rethinking Explainability as a Dialogue: A Practitioner's Perspective
Himabindu Lakkaraju, Dylan Slack, Yuxin Chen, Chenhao Tan, Sameer Singh
Separating Rule Discovery and Global Solution Composition in a Learning Classifier System
Michael Heider, Helena Stegherr, Jonathan Wurth, Roman Sraj, Jörg Hähner