Attention Mechanism
Attention mechanisms are computational components that learn to focus selectively on the most relevant parts of their input, improving efficiency and accuracy across a wide range of machine learning models. Current research emphasizes reducing attention's computational cost (e.g., replacing the quadratic cost in sequence length with linear-complexity variants), enhancing its expressiveness (e.g., through convolutional operations on attention scores), and improving its robustness (e.g., mitigating hallucination in vision-language models and addressing overfitting). These advances are significantly impacting fields such as natural language processing, computer vision, and time series analysis, leading to more efficient and accurate models for diverse applications.
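To make the complexity point above concrete, here is a minimal NumPy sketch contrasting standard scaled dot-product attention, which materializes an n x n score matrix and is therefore quadratic in sequence length, with a kernelized linear-attention variant in the style of Katharopoulos et al. (2020). The dimensions and the positive feature map (ReLU plus one, standing in for the common elu+1) are illustrative assumptions, not details drawn from any of the papers listed below.

```python
import numpy as np

def softmax_attention(Q, K, V):
    # Standard scaled dot-product attention: O(n^2) in sequence length n,
    # because the full n x n score matrix Q K^T is materialized.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                       # (n, n)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # row-wise softmax
    return weights @ V                                  # (n, d_v)

def linear_attention(Q, K, V, feature_map=lambda x: np.maximum(x, 0) + 1.0):
    # Kernelized "linear" attention: replacing softmax with a positive
    # feature map phi lets the computation be reordered as
    # phi(Q) (phi(K)^T V), so the n x n matrix is never formed -> O(n).
    Qp, Kp = feature_map(Q), feature_map(K)             # (n, d) each
    KV = Kp.T @ V                                       # (d, d_v)
    Z = Qp @ Kp.sum(axis=0)                             # (n,) normalizer
    return (Qp @ KV) / Z[:, None]                       # (n, d_v)

rng = np.random.default_rng(0)
n, d = 8, 4
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
print(softmax_attention(Q, K, V).shape)   # (8, 4)
print(linear_attention(Q, K, V).shape)    # (8, 4)
```

The two functions are not numerically equivalent: linear attention trades the exact softmax weighting for the ability to reorder the matrix products, which is exactly the efficiency/expressiveness trade-off that much of the work surveyed here explores.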
Papers
When to Use Efficient Self Attention? Profiling Text, Speech and Image Transformer Variants
Anuj Diwan, Eunsol Choi, David Harwath
Investigating the dynamics of hand and lips in French Cued Speech using attention mechanisms and CTC-based decoding
Sanjana Sankar, Denis Beautemps, Frédéric Elisei, Olivier Perrotin, Thomas Hueber
VISIT: Visualizing and Interpreting the Semantic Information Flow of Transformers
Shahar Katz, Yonatan Belinkov
Causal-Based Supervision of Attention in Graph Neural Network: A Better and Simpler Choice towards Powerful Attention
Hongjun Wang, Jiyuan Chen, Lun Du, Qiang Fu, Shi Han, Xuan Song
Enhancing Next Active Object-based Egocentric Action Anticipation with Guided Attention
Sanket Thakur, Cigdem Beyan, Pietro Morerio, Vittorio Murino, Alessio Del Bue