Attention Layer
Attention layers are fundamental components of neural networks, particularly transformers: they let a model selectively focus on the most relevant parts of its input. Current research emphasizes improving the efficiency and theoretical understanding of attention, exploring variants such as sparse, hyperbolic, and grouped-query attention, and investigating how attention interacts with other layer types (e.g., convolutional and MLP layers). This work is central to advancing large language models and other deep learning architectures, with applications ranging from image generation and compression to natural language processing and seismic analysis.
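Since the summary names grouped-query attention alongside standard attention, a minimal NumPy sketch may help fix ideas. Shapes, head counts, and function names below are illustrative assumptions, not drawn from any of the listed papers.

```python
import numpy as np

def attention(Q, K, V):
    """Standard scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d)   # query-key similarities
    w = np.exp(scores - scores.max(-1, keepdims=True))  # numerically stable softmax
    w /= w.sum(-1, keepdims=True)
    return w @ V                                    # weighted sum of values

def grouped_query_attention(Q, K, V, n_groups):
    """Grouped-query attention sketch: query heads in the same group
    share a single key/value head.
    Q: (n_q_heads, seq, d); K, V: (n_groups, seq, d)."""
    reps = Q.shape[0] // n_groups
    # Broadcast each shared K/V head to all query heads in its group.
    K_full = np.repeat(K, reps, axis=0)
    V_full = np.repeat(V, reps, axis=0)
    return attention(Q, K_full, V_full)

# Toy usage: 8 query heads share 2 key/value heads (4 queries per group).
rng = np.random.default_rng(0)
Q = rng.standard_normal((8, 16, 32))
K = rng.standard_normal((2, 16, 32))
V = rng.standard_normal((2, 16, 32))
print(grouped_query_attention(Q, K, V, n_groups=2).shape)  # (8, 16, 32)
```

Sharing key/value heads across groups of query heads shrinks the key/value cache, which is the main efficiency motivation behind grouped-query attention.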
Papers
Nineteen papers, dated May 27, 2023 through October 25, 2023.