Paper ID: 2204.02241
A Set Membership Approach to Discovering Feature Relevance and Explaining Neural Classifier Decisions
Stavros P. Adam, Aristidis C. Likas
Neural classifiers are nonlinear systems that provide decisions on the classes of patterns for a given problem they have learned. The output computed by a classifier for each pattern is an approximation of the output of some unknown function that maps pattern data to their respective classes. The lack of knowledge of this function, together with the complexity of neural classifiers, especially deep learning architectures, makes it difficult to obtain information on how specific predictions are made. Hence, these powerful learning systems are regarded as black boxes, and in critical applications their use tends to be considered inappropriate. Gaining insight into the operation of such a black box is one way of interpreting neural classifiers and assessing the validity of their decisions. In this paper we tackle this problem by introducing a novel methodology for discovering which features a trained neural classifier considers relevant and how they affect its output, thereby obtaining an explanation of its decision. Although feature relevance has received much attention in the machine learning literature, here we reconsider it in terms of nonlinear parameter estimation addressed with a set membership approach based on interval analysis. Hence, the proposed methodology builds on sound mathematical foundations, and the results obtained constitute a reliable estimation of the classifier's decision premises.
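To make the general idea concrete, the sketch below shows how interval analysis can be used to probe feature relevance in a small ReLU classifier: a single input feature is widened to an interval, the resulting box is propagated through the network with standard interval arithmetic, and the output bounds indicate whether the perturbation could overturn the predicted class. This is only an illustrative sketch of interval propagation, not the paper's set membership algorithm; the network weights, the perturbation radius, and the function names are hypothetical placeholders.

```python
# Illustrative sketch (not the authors' method): interval propagation through a
# small ReLU network to check whether varying one feature over an interval,
# with the other features fixed, could change the predicted class.
import numpy as np

def forward(x, layers):
    """Ordinary forward pass; ReLU on all but the last layer."""
    h = x
    for i, (W, b) in enumerate(layers):
        h = W @ h + b
        if i < len(layers) - 1:
            h = np.maximum(h, 0.0)
    return h

def affine_interval(lo, hi, W, b):
    """Bound W @ x + b for x in the axis-aligned box [lo, hi]."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b

def output_bounds(x_lo, x_hi, layers):
    """Propagate the input box through every layer (ReLU is monotone)."""
    lo, hi = x_lo, x_hi
    for i, (W, b) in enumerate(layers):
        lo, hi = affine_interval(lo, hi, W, b)
        if i < len(layers) - 1:
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi

def feature_may_flip_decision(x, feature, radius, layers):
    """True if perturbing `feature` within +/- radius could, according to the
    (conservative) interval bounds, overturn the predicted class."""
    pred = int(np.argmax(forward(x, layers)))
    x_lo, x_hi = x.copy(), x.copy()
    x_lo[feature] -= radius
    x_hi[feature] += radius
    lo, hi = output_bounds(x_lo, x_hi, layers)
    # The decision is guaranteed stable only if the predicted logit's lower
    # bound dominates every other logit's upper bound.
    return not all(lo[pred] > hi[k] for k in range(len(lo)) if k != pred)

if __name__ == "__main__":
    # Hypothetical weights standing in for a trained classifier: 4 inputs,
    # one hidden layer of 8 units, 3 classes.
    rng = np.random.default_rng(0)
    layers = [(rng.normal(size=(8, 4)), rng.normal(size=8)),
              (rng.normal(size=(3, 8)), rng.normal(size=3))]
    x = rng.normal(size=4)
    for j in range(4):
        print(f"feature {j}: decision may change within +/-0.5 ->",
              feature_may_flip_decision(x, j, 0.5, layers))
```

Because interval arithmetic over-approximates the reachable outputs, a "stable" verdict is a guarantee while a "may change" verdict is only a conservative flag; features whose intervals can be widened substantially without affecting the decision are, in this informal sense, less relevant to it.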
Submitted: Apr 5, 2022