Attention Pattern
Attention patterns in neural networks, particularly transformers, are an active research focus aimed at understanding how these models process information and make decisions. Current work examines attention mechanisms across architectures, including vision transformers and large language models, analyzing how attention weights relate to model performance, human attention, and the presence of adversarial examples or biases. Understanding, and potentially controlling, these patterns is crucial for improving model interpretability, robustness, and efficiency, and ultimately for building more reliable and trustworthy AI systems across applications such as medical image analysis and natural language processing.
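For readers unfamiliar with what an "attention pattern" concretely is: in a transformer, each query position produces a probability distribution over key positions via scaled dot-product attention, and the resulting weight matrix is the pattern that interpretability work inspects. The sketch below is a minimal, illustrative NumPy implementation (not taken from any of the papers listed here); the function and variable names are our own.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention weights = softmax(Q K^T / sqrt(d_k)).
    # Each row sums to 1 and gives one query position's distribution
    # over key positions -- the "attention pattern".
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))  # 4 query positions, head dimension 8
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out, attn = scaled_dot_product_attention(Q, K, V)
print(attn.shape)  # (4, 4): one weight distribution per query position
```

Visualizing `attn` as a heatmap is the usual first step in the kind of attention analysis the papers below perform, e.g. checking which input positions a head consistently attends to.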
Papers
Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers
Cong Wei, Brendan Duke, Ruowei Jiang, Parham Aarabi, Graham W. Taylor, Florian Shkurti
How Does Attention Work in Vision Transformers? A Visual Analytics Attempt
Yiran Li, Junpeng Wang, Xin Dai, Liang Wang, Chin-Chia Michael Yeh, Yan Zheng, Wei Zhang, Kwan-Liu Ma