Vision Transformer
Vision Transformers (ViTs) adapt the transformer architecture, initially designed for natural language processing, to image analysis by treating images as sequences of patches. Current research focuses on improving ViT efficiency and robustness through techniques like token pruning, attention engineering, and hybrid models combining ViTs with convolutional neural networks or other architectures (e.g., Mamba). These advancements are driving progress in various applications, including medical image analysis, object detection, and spatiotemporal prediction, by offering improved accuracy and efficiency compared to traditional convolutional neural networks in specific tasks.
Papers
Image Classification of Stroke Blood Clot Origin using Deep Convolutional Neural Networks and Visual Transformers
David Azatyan
Making Vision Transformers Truly Shift-Equivariant
Renan A. Rojas-Gomez, Teck-Yian Lim, Minh N. Do, Raymond A. Yeh
Imitating Task and Motion Planning with Visuomotor Transformers
Murtaza Dalal, Ajay Mandlekar, Caelan Garrett, Ankur Handa, Ruslan Salakhutdinov, Dieter Fox
Efficient Large-Scale Visual Representation Learning And Evaluation
Eden Dolev, Alaa Awad, Denisa Roberts, Zahra Ebrahimzadeh, Marcin Mejran, Vaibhav Malpani, Mahir Yavuz
U-DiT TTS: U-Diffusion Vision Transformer for Text-to-Speech
Xin Jing, Yi Chang, Zijiang Yang, Jiangjian Xie, Andreas Triantafyllopoulos, Bjoern W. Schuller
Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design
Ibrahim Alabdulmohsin, Xiaohua Zhai, Alexander Kolesnikov, Lucas Beyer
VanillaNet: the Power of Minimalism in Deep Learning
Hanting Chen, Yunhe Wang, Jianyuan Guo, Dacheng Tao
Transfer Learning for Fine-grained Classification Using Semi-supervised Learning and Visual Transformers
Manuel Lagunas, Brayan Impata, Victor Martinez, Virginia Fernandez, Christos Georgakis, Sofia Braun, Felipe Bertrand
A survey of the Vision Transformers and its CNN-Transformer based Variants
Asifullah Khan, Zunaira Rauf, Anabia Sohail, Abdul Rehman, Hifsa Asif, Aqsa Asif, Umair Farooq