Attention Layer
Attention layers are fundamental components of neural networks, particularly transformers, designed to selectively focus on relevant information within input data. Current research emphasizes improving attention's efficiency and theoretical understanding, exploring variations like sparse, hyperbolic, and grouped query attention within models such as transformers, and investigating the interplay between attention and other layers (e.g., convolutional, MLP). This work is crucial for advancing the capabilities of large language models and other deep learning architectures, impacting diverse applications from image generation and compression to natural language processing and even seismic analysis.
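As a concrete sketch of the core mechanism described above, here is a minimal pure-Python implementation of standard scaled dot-product attention, softmax(Q Kᵀ / √d_k) V. The function names and toy inputs are illustrative, not from any particular library:

```python
import math

def softmax(row):
    # Numerically stable softmax over one row of attention scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.

    Q: queries (n_q x d_k), K: keys (n_kv x d_k), V: values (n_kv x d_v).
    Returns the attended outputs and the attention weight matrix.
    """
    d_k = len(Q[0])
    # scores[i][j] = <Q[i], K[j]> / sqrt(d_k)
    scores = [[sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
              for q in Q]
    # Each query's weights over the keys sum to 1.
    weights = [softmax(row) for row in scores]
    # Output is the weight-averaged values.
    out = [[sum(w * v[j] for w, v in zip(w_row, V)) for j in range(len(V[0]))]
           for w_row in weights]
    return out, weights

# Toy example: 2 query positions attending over 3 key/value positions.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
out, weights = attention(Q, K, V)
```

Variants such as sparse or grouped-query attention modify this core computation — restricting which key positions each query may attend to, or sharing one key/value head across several query heads — rather than replacing it.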