Attention Mechanism
Attention mechanisms are computational processes that selectively weight the most relevant parts of the input, improving efficiency and performance across machine learning models. Current research emphasizes reducing attention's computational cost (e.g., from quadratic to linear complexity in sequence length), enhancing its expressiveness (e.g., through convolutional operations on attention scores), and improving its robustness (e.g., mitigating hallucination in vision-language models and addressing overfitting). These advances are shaping natural language processing, computer vision, and time series analysis, yielding more efficient and accurate models for diverse applications.
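To make the cost discussion above concrete, here is a minimal sketch of standard scaled dot-product attention (the mechanism most of these papers build on); the quadratic complexity mentioned above comes from the n_q × n_k score matrix. Function and variable names are illustrative, not from any of the listed papers.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (n_q, n_k): this matrix is the quadratic cost
    weights = softmax(scores)        # each row is a distribution over the keys
    return weights @ V, weights

# Toy example: 3 queries attend over 4 key/value pairs of dimension 8.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape)       # (3, 8): one attended vector per query
print(w.sum(axis=-1))  # each row of the weights sums to 1
```

Linear-attention variants, like those surveyed above, avoid materializing the full `scores` matrix, e.g. by factorizing or kernelizing the softmax.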
Papers
SS-BRPE: Self-Supervised Blind Room Parameter Estimation Using Attention Mechanisms
Chunxi Wang, Maoshen Jia, Meiran Li, Changchun Bao, Wenyu Jin
An Analog and Digital Hybrid Attention Accelerator for Transformers with Charge-based In-memory Computing
Ashkan Moradifirouzabadi, Divya Sri Dodla, Mingu Kang
NeuroPapyri: A Deep Attention Embedding Network for Handwritten Papyri Retrieval
Giuseppe De Gregorio, Simon Perrin, Rodrigo C. G. Pena, Isabelle Marthot-Santaniello, Harold Mouchère
Nonlocal Attention Operator: Materializing Hidden Knowledge Towards Interpretable Physics Discovery
Yue Yu, Ning Liu, Fei Lu, Tian Gao, Siavash Jafarzadeh, Stewart Silling