Surrogate Explainers

Surrogate explainers make the predictions of complex, "black-box" machine learning models more understandable by approximating their behavior with simpler, interpretable models. Current research focuses on improving the accuracy and reliability of these surrogates, on resolving incompatibilities with certain architectures (e.g., transformers), and on quantifying the uncertainty inherent in the resulting explanations. This work is crucial for building trust in AI systems, particularly in high-stakes applications: it offers insight into how models arrive at their decisions and makes it possible to assess how trustworthy a given explanation is.
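
As a concrete illustration of the core idea, the sketch below fits a locally weighted ridge regression (in the spirit of LIME-style local surrogates) to a black box's outputs around a single instance. It is a minimal sketch under stated assumptions, not a reference implementation: the function name `explain_locally`, the Gaussian perturbation scheme, and the kernel parameter `scale` are all illustrative choices, and the black box is assumed to expose a scikit-learn-style `predict_proba`.

```python
import numpy as np
from sklearn.linear_model import Ridge


def explain_locally(black_box_predict_proba, x, n_samples=1000, scale=0.5, seed=0):
    """Fit an interpretable linear surrogate to a black box around the point x.

    black_box_predict_proba: callable mapping an (n, d) array to class probabilities.
    x: 1-D feature vector (the instance to explain).
    Returns the per-feature coefficients of the local linear surrogate.
    """
    rng = np.random.default_rng(seed)

    # Perturb the instance with Gaussian noise to probe the black box's
    # behavior in a neighborhood of x (one of many possible sampling schemes).
    Z = x + rng.normal(scale=scale, size=(n_samples, x.shape[0]))

    # Query the black box for the probability of the class it predicts for x.
    pred_class = int(np.argmax(black_box_predict_proba(x.reshape(1, -1))[0]))
    target = black_box_predict_proba(Z)[:, pred_class]

    # Weight each perturbed sample by its proximity to x (exponential kernel),
    # so the surrogate is faithful locally rather than globally.
    distances = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(distances ** 2) / (2 * scale ** 2))

    # The ridge model is the surrogate: its coefficients approximate the
    # black box's local feature attributions.
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(Z, target, sample_weight=weights)
    return surrogate.coef_
```

The returned coefficients can be read as local feature attributions: larger magnitudes indicate features that more strongly drive the black box's output near `x`. Note that the surrogate is only faithful within the neighborhood defined by the kernel width, so the choice of `scale` trades locality against stability; sensitivity to such choices is one source of the explanation uncertainty that current research aims to quantify.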

Papers