Self-Attention Networks
Self-attention networks are a core component of transformer architectures; they process sequential data by weighting the importance of each element in a sequence relative to every other element. Current research focuses on improving their performance and interpretability through modifications such as alternative activation functions (e.g., sigmoid in place of softmax), analyses of the geometric properties of these networks, and efficient ensemble methods for uncertainty quantification. These advances are enabling more accurate and robust models across natural language processing, computer vision, and time series analysis.
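To make the weighting mechanism concrete, below is a minimal NumPy sketch of single-head scaled dot-product self-attention, with an optional sigmoid gate standing in for the softmax variant mentioned above. All names, shapes, and the `use_sigmoid` flag are illustrative assumptions, not taken from any particular paper.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv, use_sigmoid=False):
    """X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_k) projection matrices."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])      # scaled dot-product scores
    if use_sigmoid:
        # Sigmoid variant: each attention weight is gated independently
        # into (0, 1) rather than normalized across the sequence.
        weights = 1.0 / (1.0 + np.exp(-scores))
    else:
        weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ V                           # weighted sum of value vectors

# Example: 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) * 0.1 for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8): one context-weighted vector per input token
```

The key difference between the two weightings is that softmax forces each token's attention to be a distribution over the sequence, while the sigmoid gate lets a token attend strongly to many positions (or to none) independently.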