Paper ID: 2202.00345
Exploring layerwise decision making in DNNs
Coenraad Mouton, Marelie H. Davel
While deep neural networks (DNNs) have become a standard architecture for many machine learning tasks, their internal decision-making process and general interpretability is still poorly understood. Conversely, common decision trees are easily interpretable and theoretically well understood. We show that by encoding the discrete sample activation values of nodes as a binary representation, we are able to extract a decision tree explaining the classification procedure of each layer in a ReLU-activated multilayer perceptron (MLP). We then combine these decision trees with existing feature attribution techniques in order to produce an interpretation of each layer of a model. Finally, we provide an analysis of the generated interpretations, the behaviour of the binary encodings and how these relate to sample groupings created during the training process of the neural network.
Submitted: Feb 1, 2022