Efficient Vision Transformer
Efficient Vision Transformers (ViTs) aim to overcome the computational cost of standard ViTs while preserving their strong performance on computer vision tasks. Current research focuses on novel attention mechanisms (e.g., polynomial attention), token reduction strategies (e.g., learnable token merging and dynamic token idling), and adaptive computation techniques that scale the number of tokens processed to the complexity of each image. These advances are significant because they enable the deployment of ViTs in resource-constrained environments such as mobile devices and embedded systems, broadening their applicability across many fields.
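To make the token reduction idea concrete, here is a minimal sketch of similarity-based token merging: at each step, the two most similar tokens (by cosine similarity) are averaged into one, shrinking the sequence the attention layers must process. This is an illustrative toy in NumPy, not the algorithm of any specific paper; the function name `merge_tokens` and the greedy pairwise strategy are assumptions for demonstration.

```python
import numpy as np

def merge_tokens(tokens: np.ndarray, r: int) -> np.ndarray:
    """Greedily merge the r most similar token pairs.

    tokens: (N, D) array of token embeddings; returns an (N - r, D) array.
    Illustrative sketch of token merging, not a specific published method.
    """
    for _ in range(r):
        # Cosine similarity between all token pairs.
        normed = tokens / np.linalg.norm(tokens, axis=1, keepdims=True)
        sim = normed @ normed.T
        np.fill_diagonal(sim, -np.inf)  # ignore self-similarity
        # Find the most similar pair and replace it with its mean.
        i, j = np.unravel_index(np.argmax(sim), sim.shape)
        merged = (tokens[i] + tokens[j]) / 2.0
        keep = [k for k in range(len(tokens)) if k not in (i, j)]
        tokens = np.vstack([tokens[keep], merged[None, :]])
    return tokens

# Example: a ViT-style sequence of 197 tokens reduced by 50.
seq = np.random.default_rng(0).normal(size=(197, 64))
reduced = merge_tokens(seq, r=50)
print(reduced.shape)  # (147, 64)
```

In real efficient-ViT designs, the merge schedule (how many tokens to drop per layer) is typically learned or adapted per image, rather than fixed as in this sketch.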