Softmax Attention
Softmax attention, a core component of transformer networks, computes weighted sums of value vectors, with weights derived from pairwise query-key similarities; its quadratic time and memory complexity in sequence length limits scalability. Current research focuses on alternative attention mechanisms, such as linear attention, cosine attention, and sigmoid attention, that reduce computational cost while maintaining accuracy, often employing techniques like kernel methods, vector quantization, or novel normalization strategies. These efforts aim to improve the efficiency and applicability of transformer models for long sequences and large-scale applications in natural language processing, computer vision, and beyond.
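To make the cost difference concrete, here is a minimal NumPy sketch contrasting standard softmax attention with a kernelized (linear) variant. It assumes a single head with no masking, and the feature map `phi` is purely illustrative; it is not any specific paper's method.

```python
import numpy as np

def softmax_attention(Q, K, V):
    """Standard softmax attention: O(n^2) time and memory in sequence length n.

    Q, K, V: arrays of shape (n, d). Single-head, unmasked sketch.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # (n, n) pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)     # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted sum of value vectors

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    """Kernelized attention: O(n) in sequence length.

    Replaces exp(q . k) with phi(q) . phi(k), so the (d, d) summary K^T V is
    aggregated once and reused for every query. phi is an illustrative
    positive feature map, not a specific published choice.
    """
    Qf, Kf = phi(Q), phi(K)                          # (n, d) feature-mapped queries/keys
    KV = Kf.T @ V                                    # (d, d) summary, independent of any query
    Z = Qf @ Kf.sum(axis=0, keepdims=True).T         # (n, 1) per-query normalizer
    return (Qf @ KV) / Z
```

The key design point is that the softmax version must materialize an n-by-n weight matrix, while the kernelized version factors the computation through a d-by-d summary, which is what gives linear-attention methods their scaling advantage on long sequences.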