Sparse Transformer
Sparse transformers aim to make standard transformers more efficient and scalable by reducing the cost of self-attention, which grows quadratically with sequence length in the dense case. They do so primarily by having each token attend to only a subset of the input tokens. Current research focuses on new sparse attention mechanisms, including windowing strategies, hierarchical structures, and adaptive pruning techniques, often integrated into architectures such as Swin Transformers and Universal Transformers. This work matters because it allows transformer models to be applied to larger datasets and more complex tasks, particularly in resource-constrained environments, with applications spanning image processing, natural language processing, and autonomous driving.
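As a rough illustration of the windowing idea mentioned above, the sketch below builds a sliding-window mask and applies it inside ordinary scaled dot-product attention. It is a minimal NumPy example, not the method of any particular paper: the function names `sliding_window_mask` and `sparse_attention`, the `window` parameter, and the toy shapes are all assumptions made for illustration. Note that this version still computes the full score matrix and only masks it, so it shows the sparsity pattern rather than the memory or speed savings a dedicated sparse kernel would give.

```python
import numpy as np


def sliding_window_mask(seq_len, window):
    """Boolean mask: True where query i may attend to key j, i.e. |i - j| <= window."""
    idx = np.arange(seq_len)
    return np.abs(idx[:, None] - idx[None, :]) <= window


def sparse_attention(q, k, v, mask):
    """Scaled dot-product attention restricted to the positions allowed by `mask`.

    q, k, v: arrays of shape (seq_len, d). mask: boolean (seq_len, seq_len).
    Illustrative only: the dense score matrix is still materialised here.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    # Disallowed positions get -inf so they receive zero weight after softmax.
    scores = np.where(mask, scores, -np.inf)
    # Numerically stable softmax; every row keeps its diagonal, so max is finite.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


# Toy usage with hypothetical sizes: 6 tokens, 4-dimensional heads, window of 1.
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((6, 4)) for _ in range(3))
mask = sliding_window_mask(seq_len=6, window=1)
out, weights = sparse_attention(q, k, v, mask)
```

With 6 tokens and a window of 1, the mask keeps 16 of the 36 query-key pairs; for a fixed window the number of kept pairs grows linearly with sequence length, which is the source of the efficiency gain when the computation itself is made sparse.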