Surrogate Explainers
Surrogate explainers make the predictions of complex, "black-box" machine learning models more understandable by approximating their behavior with simpler, interpretable models. Current research focuses on improving the accuracy and reliability of these surrogates: resolving incompatibilities with certain architectures (e.g., transformers) and developing methods to quantify the uncertainty inherent in the explanations they produce. This work is crucial for building trust in AI systems, particularly in high-stakes applications, because it reveals how models arrive at their decisions and lets practitioners assess whether a given explanation can be trusted.
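As a concrete illustration, the sketch below fits a LIME-style local surrogate: it perturbs one instance, queries the black box, and fits a proximity-weighted linear model whose coefficients approximate local feature influence. This is a minimal sketch assuming a scikit-learn setup; the function name, noise scale, and kernel width are illustrative choices, not drawn from the papers summarized here. Re-fitting across random seeds gives a crude proxy for the explanation uncertainty the summary mentions.

```python
# Minimal LIME-style local surrogate with a naive uncertainty estimate.
# All names and parameters here are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

# 1. Train an opaque "black-box" model on synthetic data.
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)


def local_surrogate(x0, seed, n_samples=1000, noise=0.5, width=0.75):
    """Fit a proximity-weighted linear surrogate to the black box around x0."""
    rng = np.random.default_rng(seed)
    # Perturb the instance and query the black box on the perturbations.
    Z = x0 + rng.normal(scale=noise, size=(n_samples, x0.shape[0]))
    preds = black_box.predict_proba(Z)[:, 1]
    # Weight perturbations by proximity to x0 (exponential kernel).
    dists = np.linalg.norm(Z - x0, axis=1)
    weights = np.exp(-(dists ** 2) / (2 * width ** 2))
    # The linear coefficients approximate local feature influence.
    return Ridge(alpha=1.0).fit(Z, preds, sample_weight=weights).coef_


# 2. Explain one instance; repeating over seeds exposes the sampling
#    variance that makes surrogate explanations uncertain.
x0 = X[0]
coefs = np.array([local_surrogate(x0, seed) for seed in range(20)])
for i, (m, s) in enumerate(zip(coefs.mean(axis=0), coefs.std(axis=0))):
    print(f"feature {i}: {m:+.3f} ± {s:.3f}")
```

The coefficient spread across seeds is the simplest possible uncertainty signal; the research summarized above pursues more principled estimates, but even this refit-and-compare loop shows why a single surrogate fit should not be read as a definitive explanation.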
Papers
Four papers, dated August 9, 2024; May 22, 2024; January 18, 2024; and August 8, 2022.