Image Transformer
Image transformers apply self-attention, a mechanism originally developed for natural language processing, to analyze and manipulate images and videos. Current research focuses on improving efficiency (for example, through group-shifted window attention and wavelet transforms), expanding applications (image restoration, inpainting, generation, and video understanding), and addressing challenges such as memory consumption and bias in model outputs. The field is evolving quickly and has enabled advances in areas such as medical image analysis, robotic interaction, and creative content generation.
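To make the core idea concrete, here is a minimal sketch of self-attention over image patches, written in plain Python. It is an illustration under stated assumptions, not the method of any particular paper: the helper names (`image_to_patches`, `self_attention`) are hypothetical, the image is a toy 4x4 grayscale grid, and the query, key, and value projections are taken to be the identity rather than learned weights.

```python
import math

def image_to_patches(image, patch):
    """Split a 2D grayscale image (list of rows) into flattened patch vectors.

    Assumes the image height and width are divisible by `patch`.
    """
    h, w = len(image), len(image[0])
    patches = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            patches.append([image[top + i][left + j]
                            for i in range(patch) for j in range(patch)])
    return patches

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention.

    Assumption: Q, K and V are the tokens themselves (identity projections);
    a real model would apply learned linear maps first.
    """
    d = len(tokens[0])
    scale = math.sqrt(d)
    weights = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in tokens]
        weights.append(softmax(scores))
    # Each output token is a weighted average of all value vectors.
    outputs = [[sum(w * v[c] for w, v in zip(row, tokens)) for c in range(d)]
               for row in weights]
    return outputs, weights

# Toy 4x4 image split into 2x2 patches: 4 tokens of dimension 4.
image = [
    [0.0, 0.1, 0.9, 1.0],
    [0.1, 0.0, 1.0, 0.9],
    [0.9, 1.0, 0.0, 0.1],
    [1.0, 0.9, 0.1, 0.0],
]
tokens = image_to_patches(image, 2)
outputs, weights = self_attention(tokens)
```

Every patch attends to every other patch, so the weight matrix has one row and one column per patch. That quadratic growth in the number of patches is the memory cost that window-based attention variants aim to reduce.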