Attention Layer
Attention layers are fundamental components of neural networks, particularly transformers, designed to selectively focus on relevant information within input data. Current research emphasizes improving attention's efficiency and theoretical understanding, exploring variations like sparse, hyperbolic, and grouped query attention within models such as transformers, and investigating the interplay between attention and other layers (e.g., convolutional, MLP). This work is crucial for advancing the capabilities of large language models and other deep learning architectures, impacting diverse applications from image generation and compression to natural language processing and even seismic analysis.
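The core mechanism these variants build on is scaled dot-product attention, where each query forms a softmax-weighted average over the values. A minimal NumPy sketch (illustrative only; the shapes and variable names here are assumptions, not taken from any specific paper above):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Standard scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # (n_q, n_k) query-key similarities
    # Row-wise softmax (max-subtracted for numerical stability)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output row is a weighted sum of value rows

# Toy example: 3 queries attend over 4 key/value pairs of dimension 8.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 8)
```

Efficiency-oriented variants mentioned above change pieces of this computation: sparse attention restricts which query-key pairs are scored, and grouped query attention shares key/value projections across groups of query heads.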