Robust Explanation
Robust explanation in machine learning aims to produce explanations of model predictions that remain reliable and consistent under adversarial attacks or shifts in input data. Current research focuses on improving the robustness of explanation methods such as counterfactual explanations, saliency maps, and prototype-based approaches, often applied to deep neural networks and to ensemble methods such as random forests. This work is crucial for building trust in AI systems, particularly in high-stakes applications where understanding and verifying model decisions is paramount, and for mitigating the risks posed by unreliable or easily manipulated explanations.
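As a concrete illustration of what "robustness" means here, the minimal sketch below shows one common stability check for a gradient-based saliency map: perturb the input slightly and measure how much the explanation changes. It is not drawn from any specific paper; the model, perturbation size, and similarity metric are illustrative assumptions.

```python
# Sketch: probing the local stability of a gradient-based saliency map.
# The toy model, epsilon, and cosine-similarity metric are assumptions
# for illustration, not a method from the listed papers.
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
model.eval()

def saliency(x: torch.Tensor) -> torch.Tensor:
    """Gradient of the top predicted logit with respect to the input."""
    x = x.clone().detach().requires_grad_(True)
    logits = model(x)
    logits[0, logits.argmax()].backward()
    return x.grad.detach().abs()

x = torch.randn(1, 10)
base = saliency(x)

# Perturb the input within a small L-infinity ball and recompute the map.
eps = 0.01
perturbed = saliency(x + eps * torch.randn_like(x).sign())

# Cosine similarity near 1.0 suggests a locally stable explanation;
# a sharp drop indicates the explanation is easily manipulated.
stability = torch.cosine_similarity(base.flatten(), perturbed.flatten(), dim=0)
print(f"explanation stability (cosine): {stability.item():.3f}")
```

The same perturb-and-compare pattern generalizes to other explanation methods, for instance checking whether a counterfactual remains valid after small input changes.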