Attention Mechanism
Attention mechanisms are computational processes that selectively focus on the most relevant information within data, improving efficiency and performance across many machine learning models. Current research emphasizes reducing attention's computational cost (e.g., lowering the quadratic cost in sequence length to linear), enhancing its expressiveness (e.g., through convolutional operations on attention scores), and improving its robustness (e.g., mitigating hallucination in vision-language models and addressing overfitting). These advances are significantly impacting natural language processing, computer vision, and time series analysis, yielding more efficient and accurate models for diverse applications.
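The quadratic cost mentioned above comes from the core operation shared by most of these models: scaled dot-product attention, which compares every query against every key. As a point of reference (not the method of any specific paper listed below), a minimal NumPy sketch:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Standard scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.

    The (n_queries x n_keys) score matrix is what makes the cost
    quadratic in sequence length when Q and K come from the same sequence.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                  # pairwise similarity scores
    scores -= scores.max(axis=-1, keepdims=True)   # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax: each row sums to 1
    return weights @ V                             # attention-weighted sum of values

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))  # 4 queries of dimension 8
K = rng.standard_normal((6, 8))  # 6 keys
V = rng.standard_normal((6, 8))  # 6 values
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8): one output vector per query
```

Linear-attention variants avoid materializing the full score matrix, e.g., by replacing the softmax with a kernel feature map so key-value statistics can be accumulated once and reused for every query.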
Papers
Follow-up Attention: An Empirical Study of Developer and Neural Model Code Exploration
Matteo Paltenghi, Rahul Pandita, Austin Z. Henley, Albert Ziegler
DPANET: Dual Pooling Attention Network for Semantic Segmentation
Dongwei Sun, Zhuolin Gao
Memory transformers for full context and high-resolution 3D Medical Segmentation
Loic Themyr, Clément Rambour, Nicolas Thome, Toby Collins, Alexandre Hostettler
LARF: Two-level Attention-based Random Forests with a Mixture of Contamination Models
Andrei V. Konstantinov, Lev V. Utkin