Surrogate Explainers
Surrogate explainers make the predictions of complex, "black-box" machine learning models more understandable by approximating their behavior with simpler, interpretable models. Current research focuses on improving the accuracy and reliability of these surrogates: resolving incompatibilities with certain architectures (e.g., transformers) and developing methods to quantify the uncertainty inherent in the explanations they produce. This work is crucial for building trust in AI systems, particularly in high-stakes applications, because it reveals how models arrive at their decisions and lets practitioners assess whether a given explanation can be trusted.
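As a concrete illustration, the sketch below fits a LIME-style local surrogate: it perturbs one instance, queries the black box, and fits a proximity-weighted linear model whose coefficients approximate local feature influence. This is a minimal sketch assuming a scikit-learn setup; the function name, noise scale, and kernel width are illustrative choices, not drawn from the papers summarized here. Re-fitting across random seeds gives a crude proxy for the explanation uncertainty the summary mentions.

```python
# Minimal LIME-style local surrogate with a naive uncertainty estimate.
# All names and parameters here are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

# 1. Train an opaque "black-box" model on synthetic data.
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)


def local_surrogate(x0, seed, n_samples=1000, noise=0.5, width=0.75):
    """Fit a proximity-weighted linear surrogate to the black box around x0."""
    rng = np.random.default_rng(seed)
    # Perturb the instance and query the black box on the perturbations.
    Z = x0 + rng.normal(scale=noise, size=(n_samples, x0.shape[0]))
    preds = black_box.predict_proba(Z)[:, 1]
    # Weight perturbations by proximity to x0 (exponential kernel).
    dists = np.linalg.norm(Z - x0, axis=1)
    weights = np.exp(-(dists ** 2) / (2 * width ** 2))
    # The linear coefficients approximate local feature influence.
    return Ridge(alpha=1.0).fit(Z, preds, sample_weight=weights).coef_


# 2. Explain one instance; repeating over seeds exposes the sampling
#    variance that makes surrogate explanations uncertain.
x0 = X[0]
coefs = np.array([local_surrogate(x0, seed) for seed in range(20)])
for i, (m, s) in enumerate(zip(coefs.mean(axis=0), coefs.std(axis=0))):
    print(f"feature {i}: {m:+.3f} ± {s:.3f}")
```

The coefficient spread across seeds is the simplest possible uncertainty signal; the research summarized above pursues more principled estimates, but even this refit-and-compare loop shows why a single surrogate fit should not be read as a definitive explanation.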
Papers
Four papers, dated August 9, 2024; May 22, 2024; January 18, 2024; and August 8, 2022.