Self-Attention Module
Self-attention modules are a core component of transformer-based models, designed to capture long-range dependencies within data sequences. Current research focuses on improving their efficiency, particularly the quadratic cost in sequence length, through techniques such as FlashAttention and various forms of sparse attention, and on integrating self-attention with other modules, as in grouped residual self-attention or cascade attention blocks. These advances matter because they allow transformer architectures to scale to larger datasets and more complex tasks across computer vision, natural language processing, and signal processing, while reducing computational cost.
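To make the quadratic cost concrete, below is a minimal sketch of a standard single-head scaled dot-product self-attention module in PyTorch. It is illustrative only and not drawn from any of the listed papers; the class name, projection layers, and dimensions are assumptions. The n-by-n attention matrix it materializes is exactly the object that FlashAttention and sparse-attention methods aim to avoid forming in full.

```python
import math
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """Minimal single-head scaled dot-product self-attention.

    Illustrative sketch only; names and hyperparameters are assumptions,
    not taken from any of the papers listed below.
    """

    def __init__(self, embed_dim: int):
        super().__init__()
        # Learned projections for queries, keys, and values.
        self.q_proj = nn.Linear(embed_dim, embed_dim)
        self.k_proj = nn.Linear(embed_dim, embed_dim)
        self.v_proj = nn.Linear(embed_dim, embed_dim)
        self.scale = 1.0 / math.sqrt(embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, embed_dim)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        # Attention scores: (batch, seq_len, seq_len). This n x n matrix
        # is the source of the quadratic cost in sequence length.
        scores = torch.matmul(q, k.transpose(-2, -1)) * self.scale
        weights = scores.softmax(dim=-1)
        # Weighted sum of values: (batch, seq_len, embed_dim).
        return torch.matmul(weights, v)

# Example: 8 sequences of length 128 with 64-dimensional embeddings.
attn = SelfAttention(embed_dim=64)
out = attn(torch.randn(8, 128, 64))
print(out.shape)  # torch.Size([8, 128, 64])
```

Efficient variants keep the same query-key-value interface but change how the score matrix is computed: FlashAttention tiles the computation to avoid storing it, while sparse attention restricts which positions attend to which.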
Papers
ProContEXT: Exploring Progressive Context Transformer for Tracking
Jin-Peng Lan, Zhi-Qi Cheng, Jun-Yan He, Chenyang Li, Bin Luo, Xu Bao, Wangmeng Xiang, Yifeng Geng, Xuansong Xie
A Generic Shared Attention Mechanism for Various Backbone Neural Networks
Zhongzhan Huang, Senwei Liang, Mingfu Liang, Liang Lin