Self Attention
Self-attention is a mechanism in neural networks that lets a model weigh the relevance of different parts of its input against one another, enabling it to capture long-range dependencies and contextual information. Because standard self-attention scales quadratically with sequence length, current research focuses on improving its efficiency, particularly in vision transformers and other large models, through techniques such as low-rank approximations, selective attention, and grouped query attention, with the goal of reducing computational cost while maintaining accuracy. These advances are having a significant impact across computer vision, natural language processing, and time series analysis, enabling more efficient and powerful models for tasks such as image restoration, text-to-image generation, and medical image segmentation.
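To make the basic mechanism concrete, the sketch below implements plain scaled dot-product self-attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, in NumPy. The function name `self_attention`, the toy dimensions, and the random projection matrices are illustrative assumptions rather than the method of any paper listed below; efficient variants such as low-rank or grouped-query attention change how Q, K, and V are formed or shared, but the weighting step is the same.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence x of shape (seq_len, d_model)."""
    q = x @ w_q  # queries, (seq_len, d_k)
    k = x @ w_k  # keys,    (seq_len, d_k)
    v = x @ w_v  # values,  (seq_len, d_v)
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)  # pairwise similarities, (seq_len, seq_len)
    # Softmax over keys: each row becomes a distribution of attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v  # each position is a weighted mix of all value vectors

# Toy usage with assumed sizes: 4 tokens, d_model = d_k = d_v = 8.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_q = rng.normal(size=(8, 8))
w_k = rng.normal(size=(8, 8))
w_v = rng.normal(size=(8, 8))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (4, 8)
```

Note that the (seq_len, seq_len) score matrix is the source of the quadratic cost mentioned above, which is what the efficiency-oriented techniques aim to reduce.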
Papers
Cached Adaptive Token Merging: Dynamic Token Reduction and Redundant Computation Elimination in Diffusion Model
Omid Saghatchian, Atiyeh Gh. Moghadam, Ahmad Nickabadi
HCMA-UNet: A Hybrid CNN-Mamba UNet with Inter-Slice Self-Attention for Efficient Breast Cancer Segmentation
Haoxuan Li, Wei Song, Peiwu Qin, Xi Yuan, Zhenglin Chen
SAFERec: Self-Attention and Frequency Enriched Model for Next Basket Recommendation
Oleg Lashinin, Denis Krasilnikov, Aleksandr Milogradskii, Marina Ananyeva
Self-attentive Transformer for Fast and Accurate Postprocessing of Temperature and Wind Speed Forecasts
Aaron Van Poecke, Tobias Sebastian Finn, Ruoke Meng, Joris Van den Bergh, Geert Smet, Jonathan Demaeyer, Piet Termonia, Hossein Tabari, Peter Hellinckx