Paper ID: 2211.07774
Interpreting Bias in the Neural Networks: A Peek Into Representational Similarity
Gnyanesh Bangaru, Lalith Bharadwaj Baru, Kiran Chakravarthula
Neural networks trained on standard image classification datasets are shown to be less resistant to dataset bias. It is necessary to understand the behavior of the objective function that might correspond to superior performance on biased data. However, there is little research on the choice of objective function and the representational structure it induces when training on biased datasets. In this paper, we investigate the performance and internal representational structure of convolution-based neural networks (e.g., ResNets) trained on biased data using various objective functions. Specifically, we study similarities in representations, using Centered Kernel Alignment (CKA), across different objective functions (probabilistic and margin-based) and offer a comprehensive analysis of the chosen ones. According to our findings, ResNet representations obtained with Negative Log Likelihood ($\mathcal{L}_{NLL}$) and Softmax Cross-Entropy ($\mathcal{L}_{SCE}$) loss functions are equally capable of producing superior performance and well-structured representations on biased data. We also note that, without progressive representational similarity among the layers of a neural network, performance is less likely to be robust.
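For reference, a minimal sketch of the similarity measure named in the abstract: in its linear form (Kornblith et al., 2019), CKA compares two column-centered activation matrices $X \in \mathbb{R}^{n \times p_1}$ and $Y \in \mathbb{R}^{n \times p_2}$, holding the activations of $n$ examples at two layers. Whether the paper uses this linear variant or a kernel (e.g., RBF) variant is not stated in the abstract and is an assumption here:
$$\mathrm{CKA}(X, Y) = \frac{\lVert Y^{\top} X \rVert_F^{2}}{\lVert X^{\top} X \rVert_F \,\lVert Y^{\top} Y \rVert_F}$$
The score lies in $[0, 1]$, with higher values indicating more similar representations; it is invariant to orthogonal transformations and isotropic scaling of the activations, which makes it suitable for comparing layers of different widths.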
Submitted: Nov 14, 2022