Attention Diversification Regularization

Attention diversification regularization aims to improve the robustness and generalization of machine learning models by encouraging them to attend to a wider range of features rather than over-emphasizing a small subset. Current research applies the technique across architectures, including Vision Transformers and convolutional neural networks, typically by maximizing the entropy of attention distributions or by penalizing similarity between attention maps from different model components (e.g., different heads or layers). By mitigating overfitting and shortcut learning, the approach has improved performance in applications such as object re-identification, whole slide image classification, and open-set recognition, making attention-based models more reliable in real-world scenarios.
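As an illustrative sketch of the two mechanisms mentioned above, the snippet below implements an entropy-maximizing term and an inter-head similarity penalty for attention maps in PyTorch. The function names, tensor shapes, and loss weighting are assumptions for the example, not the formulation of any specific paper:

```python
import torch
import torch.nn.functional as F

def attention_entropy_loss(attn, eps=1e-12):
    """Negative mean entropy of attention distributions.

    attn: (batch, heads, queries, keys), each row summing to 1.
    Minimizing this term maximizes entropy, spreading attention
    mass over more keys instead of a narrow subset.
    """
    entropy = -(attn * (attn + eps).log()).sum(dim=-1)  # (B, H, Q)
    return -entropy.mean()

def head_similarity_loss(attn):
    """Mean pairwise cosine similarity between per-head attention maps.

    Penalizing this term pushes different heads to attend to
    different regions, diversifying the learned attention.
    """
    b, h, q, k = attn.shape
    flat = F.normalize(attn.reshape(b, h, q * k), dim=-1)  # unit vector per head
    sim = flat @ flat.transpose(1, 2)                      # (B, H, H) cosine sims
    off_diag = sim - torch.eye(h, device=attn.device)      # zero out self-similarity
    return off_diag.sum() / (b * h * (h - 1))

# Hypothetical usage: add the regularizers to a task loss with small weights.
attn = torch.softmax(torch.randn(2, 4, 8, 8), dim=-1)
task_loss = torch.tensor(0.0)  # placeholder for the model's main objective
loss = task_loss + 0.1 * attention_entropy_loss(attn) + 0.1 * head_similarity_loss(attn)
```

Uniform attention minimizes the entropy term, while heads with non-overlapping attention maps minimize the similarity term; in practice the regularization weights are tuned per task.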

Papers