Multi-Scale Vision Transformer
Multi-scale vision transformers aim to improve on single-scale vision transformers by combining features computed at several image resolutions. Current research focuses on efficient architectures that exploit multi-scale features through hierarchical backbones, multi-scale attention mechanisms, and wavelet transforms, typically applied to object detection, segmentation, and classification. These designs target better accuracy and efficiency across computer vision applications, particularly when images and videos contain objects of widely varying size and complexity.
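To make the idea of a hierarchical backbone concrete, here is a minimal sketch of how the token grid and channel width might change from stage to stage. The defaults (224-pixel input, 4-pixel patches, 96 base channels, four stages, 2x downsampling with channel doubling) are assumptions chosen for illustration, and the names `StageShape`, `stage_shapes`, and `pool_2x2` are hypothetical; they do not come from any specific paper or library.

```python
from typing import List, NamedTuple


class StageShape(NamedTuple):
    height: int    # token grid height at this stage
    width: int     # token grid width at this stage
    channels: int  # embedding dimension at this stage

    @property
    def tokens(self) -> int:
        # Number of tokens that attention operates on at this stage.
        return self.height * self.width


def stage_shapes(image_size: int = 224, patch_size: int = 4,
                 base_dim: int = 96, num_stages: int = 4) -> List[StageShape]:
    """Illustrative schedule: halve the grid side and double the channels
    between consecutive stages (an assumed, commonly used pattern)."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    side = image_size // patch_size
    dim = base_dim
    shapes = []
    for stage in range(num_stages):
        shapes.append(StageShape(side, side, dim))
        if stage < num_stages - 1:
            if side % 2 != 0:
                raise ValueError("grid side must be even to downsample by 2")
            side //= 2
            dim *= 2
    return shapes


def pool_2x2(grid: List[List[float]]) -> List[List[float]]:
    """Average-pool a 2D grid of scalar token values with a 2x2 window and
    stride 2. This is a toy stand-in for the downsampling between stages."""
    h, w = len(grid), len(grid[0])
    return [[(grid[r][c] + grid[r][c + 1]
              + grid[r + 1][c] + grid[r + 1][c + 1]) / 4.0
             for c in range(0, w - 1, 2)]
            for r in range(0, h - 1, 2)]


if __name__ == "__main__":
    for i, s in enumerate(stage_shapes(), start=1):
        print(f"stage {i}: {s.height}x{s.width} grid, "
              f"{s.channels} channels, {s.tokens} tokens")
```

Under these assumed defaults, early stages hold many fine-grained tokens with few channels, while later stages hold few coarse tokens with many channels. A detection or segmentation head can then draw on all stages to handle objects of different sizes.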