Self-Attention
Self-attention is a neural-network mechanism that lets a model weigh the relevance of different parts of its input as it processes them, capturing long-range dependencies and contextual information. Current research focuses on making self-attention more efficient, particularly in vision transformers and other large models, through techniques such as low-rank approximations, selective attention, and grouped-query attention that reduce computational cost while preserving accuracy. These advances are shaping computer vision, natural language processing, and time-series analysis by enabling more efficient and powerful models for tasks such as image restoration, text-to-image generation, and medical image segmentation.
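To make the core computation concrete, below is a minimal sketch of standard scaled dot-product self-attention in plain NumPy. It is not specific to any of the papers listed below, and the matrix names (W_q, W_k, W_v) and sizes are illustrative assumptions: each position's output is a weighted average of all positions' values, with weights derived from query-key similarity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention over a sequence X of shape (n, d_model)."""
    Q = X @ W_q                            # queries, shape (n, d_k)
    K = X @ W_k                            # keys,    shape (n, d_k)
    V = X @ W_v                            # values,  shape (n, d_v)
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)        # pairwise similarity between positions
    weights = softmax(scores, axis=-1)     # each row sums to 1: attention over all positions
    return weights @ V                     # context-aware representation of each position

# Toy usage: 4 tokens, model width 8, head width 4 (sizes chosen only for illustration).
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))
W_q, W_k, W_v = (rng.standard_normal((8, 4)) for _ in range(3))
out = self_attention(X, W_q, W_k, W_v)
print(out.shape)  # (4, 4)
```

The quadratic cost comes from the (n, n) score matrix: every position attends to every other position, which is exactly what the efficiency-oriented work above (low-rank, selective, and grouped-query variants) tries to cheapen.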
Papers
Systematic Architectural Design of Scale Transformed Attention Condenser DNNs via Multi-Scale Class Representational Response Similarity Analysis
Andre Hryniowski, Alexander Wong
Label-noise-tolerant medical image classification via self-attention and self-supervised learning
Hongyang Jiang, Mengdi Gao, Yan Hu, Qiushi Ren, Zhaoheng Xie, Jiang Liu
Primal-Attention: Self-attention through Asymmetric Kernel SVD in Primal Representation
Yingyi Chen, Qinghua Tao, Francesco Tonin, Johan A. K. Suykens
LAIT: Efficient Multi-Segment Encoding in Transformers with Layer-Adjustable Interaction
Jeremiah Milbauer, Annie Louis, Mohammad Javad Hosseini, Alex Fabrikant, Donald Metzler, Tal Schuster
Recasting Self-Attention with Holographic Reduced Representations
Mohammad Mahmudul Alam, Edward Raff, Stella Biderman, Tim Oates, James Holt