Paper ID: 2205.11109

Illuminating Salient Contributions in Neuron Activation with Attribution Equilibrium

Woo-Jeoung Nam, Seong-Whan Lee

With the remarkable success of deep neural networks, there is growing interest in research that provides clear interpretations of their decision-making processes. In this paper, we introduce Attribution Equilibrium, a novel method that decomposes output predictions into fine-grained attributions, balancing positive and negative relevance to visualize the evidence behind a network decision more clearly. We carefully analyze conventional approaches to decision explanation and present a different perspective on the conservation of evidence. We define the evidence as the gap between positive and negative influences among gradient-derived initial contribution maps. We then incorporate antagonistic elements and a user-defined criterion for the degree of positive attribution during propagation. Additionally, we account for the role of inactivated neurons in the propagation rule, which makes less relevant elements such as the background easier to discern. We evaluate the method in a verified experimental environment on the PASCAL VOC 2007, MS COCO 2014, and ImageNet datasets. The results demonstrate that our method outperforms existing attribution methods, both qualitatively and quantitatively, in identifying the key input features that influence model decisions.

Submitted: May 23, 2022