Attention Mechanism
Attention mechanisms are computational processes that selectively focus on the most relevant parts of the input, improving both efficiency and accuracy in machine learning models. Current research emphasizes reducing attention's computational cost (e.g., replacing quadratic complexity in sequence length with linear alternatives), enhancing its expressiveness (e.g., through convolutional operations on attention scores), and improving its robustness (e.g., mitigating hallucination in vision-language models and addressing overfitting). These advances are shaping natural language processing, computer vision, and time series analysis, yielding more efficient and accurate models across diverse applications.
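To make the quadratic-cost point concrete, here is a minimal NumPy sketch of standard scaled dot-product attention, the baseline formulation the papers below extend or optimize. The function name and toy shapes are illustrative, not drawn from any specific paper; note the (n_queries x n_keys) score matrix, which is the quadratic term that linear-attention work aims to eliminate.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Weight each value by the similarity of its key to the query.

    Materializing the (n_q, n_k) score matrix costs O(n^2 * d) time and
    O(n^2) memory in sequence length n -- the quadratic bottleneck that
    linear-attention research targets.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # (n_q, n_k) similarity matrix
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V, weights

# Toy example: 3 queries attend over 4 key/value pairs of dimension 2.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 2))
K = rng.normal(size=(4, 2))
V = rng.normal(size=(4, 2))
out, w = scaled_dot_product_attention(Q, K, V)
```

Each output row is a convex combination of the value rows, so the attention weights can be read directly as "how much each key/value pair contributed" to a given query's output.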
Papers
A Primal-Dual Framework for Transformers and Neural Networks
Tan M. Nguyen, Tam Nguyen, Nhat Ho, Andrea L. Bertozzi, Richard G. Baraniuk, Stanley J. Osher
Elliptical Attention
Stefan K. Nielsen, Laziz U. Abdullaev, Rachel S.Y. Teo, Tan M. Nguyen
Guided Context Gating: Learning to leverage salient lesions in retinal fundus images
Teja Krishna Cherukuri, Nagur Shareef Shaik, Dong Hye Ye
Memory-Efficient Sparse Pyramid Attention Networks for Whole Slide Image Analysis
Weiyi Wu, Chongyang Gao, Xinwen Xu, Siting Li, Jiang Gui
Optimizing Visual Question Answering Models for Driving: Bridging the Gap Between Human and Machine Attention Patterns
Kaavya Rekanar, Martin Hayes, Ganesh Sistu, Ciaran Eising
Progressive Confident Masking Attention Network for Audio-Visual Segmentation
Yuxuan Wang, Feng Dong, Jinchao Zhu
Iteration Head: A Mechanistic Study of Chain-of-Thought
Vivien Cabannes, Charles Arnal, Wassim Bouaziz, Alice Yang, Francois Charton, Julia Kempe
EchoMamba4Rec: Harmonizing Bidirectional State Space Models with Spectral Filtering for Advanced Sequential Recommendation
Yuda Wang, Xuxin He, Shengxin Zhu