Sparse Transformer
Sparse transformers aim to improve the efficiency and scalability of standard transformers by reducing the cost of self-attention, which grows quadratically with sequence length when every token attends to every other token. They do so primarily by having each token attend to only a subset of the input tokens. Current research focuses on new sparse attention mechanisms, including windowing strategies, hierarchical structures, and adaptive pruning techniques, often integrated into architectures such as Swin Transformers and Universal Transformers. This work matters because it makes transformer models applicable to larger datasets and more complex tasks, particularly in resource-constrained environments, with applications spanning image processing, natural language processing, and autonomous driving.
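As a rough illustration of the windowing idea mentioned above, the following sketch restricts each token to keys within a fixed distance of its own position. It is a minimal, pure-Python example under stated assumptions, not the method of any particular paper: the function names `local_window_attention` and `attended_pairs` and the `window` parameter are hypothetical, and a single unbatched attention head with no learned projections is assumed.

```python
import math


def local_window_attention(queries, keys, values, window):
    """Sliding-window attention: position i attends only to j with |i - j| <= window.

    queries, keys, values are lists of equal length holding float vectors.
    `window` is an illustrative parameter, not taken from any specific model.
    """
    n = len(queries)
    scale = 1.0 / math.sqrt(len(queries[0]))
    outputs = []
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        # Scaled dot-product scores, computed only inside the window.
        scores = [
            scale * sum(q * k for q, k in zip(queries[i], keys[j]))
            for j in range(lo, hi)
        ]
        # Numerically stable softmax over the window.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the value vectors inside the window.
        outputs.append([
            sum(w * values[j][c] for w, j in zip(weights, range(lo, hi)))
            for c in range(len(values[0]))
        ])
    return outputs


def attended_pairs(n, window):
    """Number of (query, key) pairs scored by the windowed pattern."""
    return sum(min(n, i + window + 1) - max(0, i - window) for i in range(n))
```

For a sequence of 8 tokens and `window=1`, `attended_pairs(8, 1)` gives 22 scored pairs, compared with 64 for full attention. For a fixed window the count grows linearly with sequence length rather than quadratically, which is the saving that windowed sparse attention targets.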