Attention Map
Attention maps visualize how attention weights are distributed over an input within neural networks, making them a key tool for understanding model decision-making and for improving model performance. Current research focuses on refining attention mechanisms within transformer architectures such as Vision Transformers (ViTs) and BERT, addressing issues including artifact reduction, efficient computation (e.g., through token selection), and improved alignment with the input (e.g., text prompts or object features). This work has implications for fields such as image processing, natural language processing, and medical diagnostics, where it improves model interpretability, accuracy, and efficiency. Developing robust and interpretable attention maps remains an active area of investigation.
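To make the idea concrete, the sketch below shows one common way an attention map is obtained in a ViT-style model: compute the softmax-normalized query-key scores over the token sequence, take the CLS token's attention over the patch tokens, and reshape that row into a 2D spatial map. This is a minimal, self-contained illustration, not the method of any paper listed below; the function name, weight matrices, and shapes are assumptions chosen for the example.

# Minimal sketch: single-head attention map for a ViT-style token sequence.
# All names and shapes here are illustrative assumptions.
import torch
import torch.nn.functional as F

def attention_map(tokens: torch.Tensor, w_q: torch.Tensor, w_k: torch.Tensor,
                  grid_size: int) -> torch.Tensor:
    """tokens: (batch, 1 + grid_size**2, dim), with a CLS token at index 0."""
    q = tokens @ w_q                                   # (B, N, d) queries
    k = tokens @ w_k                                   # (B, N, d) keys
    scale = q.shape[-1] ** -0.5
    attn = F.softmax(q @ k.transpose(-2, -1) * scale, dim=-1)  # (B, N, N)
    cls_to_patches = attn[:, 0, 1:]                    # CLS attention over patches
    return cls_to_patches.reshape(-1, grid_size, grid_size)

# Example: a 14x14 patch grid (e.g., a 224x224 image with 16x16 patches).
B, grid, dim = 1, 14, 64
tokens = torch.randn(B, 1 + grid * grid, dim)
w_q, w_k = torch.randn(dim, dim), torch.randn(dim, dim)
amap = attention_map(tokens, w_q, w_k, grid)           # (1, 14, 14)
# Upsample the map to image resolution for overlay visualization.
amap_img = F.interpolate(amap[None], size=(224, 224),
                         mode="bilinear", align_corners=False)

In practice the tokens and projection weights would come from a trained transformer layer (often averaged or selected across heads); the papers below study how to denoise, test, and interpret such maps rather than how to compute them.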
Papers
Background Noise Reduction of Attention Map for Weakly Supervised Semantic Segmentation
Izumi Fujimori, Masaki Oono, Masami Shishibori
LeGrad: An Explainability Method for Vision Transformers via Feature Formation Sensitivity
Walid Bousselham, Angie Boggust, Sofian Chaybouti, Hendrik Strobelt, Hilde Kuehne
B-Cos Aligned Transformers Learn Human-Interpretable Features
Manuel Tran, Amal Lahiani, Yashin Dicente Cid, Melanie Boxberg, Peter Lienemann, Christian Matek, Sophia J. Wagner, Fabian J. Theis, Eldad Klaiman, Tingying Peng
Statistical Test for Attention Map in Vision Transformer
Tomohiro Shiraishi, Daiki Miwa, Teruyuki Katsuoka, Vo Nguyen Le Duy, Kouichi Taji, Ichiro Takeuchi