Paper ID: 2402.05566
Succinct Interaction-Aware Explanations
Sascha Xu, Joscha Cüppers, Jilles Vreeken
SHAP is a popular approach to explain black-box models by revealing the importance of individual features. As it ignores feature interactions, SHAP explanations can be confusing up to misleading. NSHAP, on the other hand, reports the additive importance for all subsets of features. While this does include all interacting sets of features, it also leads to an exponentially sized, difficult to interpret explanation. In this paper, we propose to combine the best of these two worlds, by partitioning the features into parts that significantly interact, and use these parts to compose a succinct, interpretable, additive explanation. We derive a criterion by which to measure the representativeness of such a partition for a models behavior, traded off against the complexity of the resulting explanation. To efficiently find the best partition out of super-exponentially many, we show how to prune sub-optimal solutions using a statistical test, which not only improves runtime but also helps to detect spurious interactions. Experiments on synthetic and real world data show that our explanations are both more accurate resp. more easily interpretable than those of SHAP and NSHAP.
Submitted: Feb 8, 2024