Fast Vision Transformer
Fast Vision Transformers (FViTs) aim to overcome the computational limitations of standard Vision Transformers (ViTs) while maintaining high accuracy on computer vision tasks. Research focuses on efficient architectures built from techniques such as hierarchical attention, optimized attention mechanisms (e.g., incorporating Gabor filters or cascaded group attention), and generative architecture search, yielding models such as FasterViT, TurboViT, and EfficientViT. These advances deliver faster inference and lower memory consumption, making ViTs better suited to real-time applications in robotics, on mobile devices, and in other resource-constrained environments.
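To make the cascaded-group-attention idea concrete, here is a minimal NumPy sketch. It is an illustration of the general mechanism (channels split across heads, with each head's output fed into the next head's input so later heads refine earlier ones), not EfficientViT's actual implementation; the per-head projection matrices `wq`, `wk`, `wv` and the single-example shapes are assumptions for the demo, and real models add learned scaling, depthwise convolutions, and a final output projection.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cascaded_group_attention(x, wq, wk, wv, num_heads):
    """Toy cascaded group attention over tokens x of shape (n, d).

    The d channels are split into num_heads groups; each head attends
    over its own channel slice, and its output is added to the next
    head's input slice (the "cascade"), so heads act sequentially
    rather than fully in parallel.
    """
    n, d = x.shape
    dh = d // num_heads                      # channels per head
    carry = np.zeros((n, dh))                # output of the previous head
    outputs = []
    for h in range(num_heads):
        xi = x[:, h * dh:(h + 1) * dh] + carry   # cascade step
        q, k, v = xi @ wq[h], xi @ wk[h], xi @ wv[h]
        attn = softmax(q @ k.T / np.sqrt(dh))    # (n, n) attention map
        carry = attn @ v
        outputs.append(carry)
    return np.concatenate(outputs, axis=1)       # back to (n, d)

rng = np.random.default_rng(0)
n, d, heads = 8, 16, 4
x = rng.standard_normal((n, d))
dh = d // heads
wq = rng.standard_normal((heads, dh, dh)) * 0.1
wk = rng.standard_normal((heads, dh, dh)) * 0.1
wv = rng.standard_normal((heads, dh, dh)) * 0.1
y = cascaded_group_attention(x, wq, wk, wv, heads)
print(y.shape)  # output keeps the input's (tokens, channels) shape
```

Because each head sees the accumulated output of the one before it, the heads learn complementary features instead of redundant ones, which is the efficiency argument behind the design.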