Local Explanation
Local explanation methods aim to make the decisions of complex "black-box" machine learning models more transparent and understandable. Current research focuses on improving the faithfulness and reliability of these explanations, particularly for large language models and for other complex models such as convolutional neural networks and gradient boosted trees, often employing techniques such as local surrogate models, counterfactual analysis, and topological data analysis to compare and visualize explanations. This work is crucial for building trust in AI systems across domains from healthcare to finance, giving users insight into model behavior and helping to identify potential biases or vulnerabilities.
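To make the idea of a local surrogate model concrete, the sketch below illustrates the general LIME-style recipe: perturb a single instance, query the black box on the perturbations, and fit a proximity-weighted linear model whose coefficients serve as local feature importances. This is a minimal illustrative sketch, not the method of any paper listed below; the function name `explain_locally`, the Gaussian perturbation scale, and the exponential proximity kernel are assumptions chosen for brevity.

```python
# Minimal sketch of a local surrogate explanation (LIME-style).
# Assumes a scikit-learn-like black box exposing predict_proba;
# all names and hyperparameters here are illustrative, not canonical.
import numpy as np
from sklearn.linear_model import Ridge

def explain_locally(black_box_predict_proba, instance, n_samples=1000,
                    kernel_width=0.75, random_state=0):
    """Fit a weighted linear surrogate around `instance` and return its
    coefficients as local feature importances for the predicted class."""
    rng = np.random.default_rng(random_state)
    n_features = instance.shape[0]

    # Sample the neighbourhood of the instance with Gaussian perturbations.
    perturbations = instance + rng.normal(scale=0.1, size=(n_samples, n_features))

    # Query the black box on the perturbed points.
    probs = black_box_predict_proba(perturbations)
    target_class = int(np.argmax(black_box_predict_proba(instance[None, :])[0]))
    y = probs[:, target_class]

    # Weight samples by proximity to the original instance (exponential kernel).
    distances = np.linalg.norm(perturbations - instance, axis=1)
    weights = np.exp(-(distances ** 2) / (kernel_width ** 2))

    # The surrogate: a weighted ridge regression, interpreted via its coefficients.
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(perturbations, y, sample_weight=weights)
    return surrogate.coef_  # local importance of each feature

# Example usage with a hypothetical toy black box:
if __name__ == "__main__":
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=500, n_features=5, random_state=0)
    model = RandomForestClassifier(random_state=0).fit(X, y)
    print(explain_locally(model.predict_proba, X[0]))
```

The key design choice in any such surrogate is the neighbourhood definition (perturbation distribution and proximity kernel); the faithfulness metrics discussed in the papers below probe how sensitive the resulting explanations are to exactly these choices.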
Papers
The Disagreement Problem in Faithfulness Metrics
Brian Barr, Noah Fatsi, Leif Hancox-Li, Peter Richter, Daniel Proano, Caleb Mok
Explaining black boxes with a SMILE: Statistical Model-agnostic Interpretability with Local Explanations
Koorosh Aslansefat, Mojgan Hashemian, Martin Walker, Mohammed Naveed Akram, Ioannis Sorokos, Yiannis Papadopoulos