Image Modeling
Image modeling aims to learn efficient representations of images that support tasks such as image generation, recognition, and manipulation. Current research focuses on self-supervised learning, particularly masked image modeling (MIM), which trains a model to reconstruct hidden parts of an image from the parts that remain visible, and on making these models more interpretable and robust through methods such as generalized integrated gradients. This work matters because it makes computer vision systems more efficient and effective across applications and gives a clearer picture of how the models function.
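The masked image modeling objective mentioned above can be illustrated with a minimal NumPy sketch. It is not taken from any of the papers listed here: the helper names (`patchify`, `random_mask`, `masked_mse`), the 8-pixel patch size, the 75% mask ratio, and the mean-patch stand-in for a real model are all assumptions chosen for illustration.

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    gh, gw = h // patch, w // patch
    x = image.reshape(gh, patch, gw, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)  # (gh, gw, patch, patch, c)
    return x.reshape(gh * gw, patch * patch * c)

def random_mask(num_patches, mask_ratio, rng):
    """Boolean mask with True marking the patches hidden from the model."""
    num_masked = int(round(num_patches * mask_ratio))
    mask = np.zeros(num_patches, dtype=bool)
    mask[rng.permutation(num_patches)[:num_masked]] = True
    return mask

def masked_mse(pred, target, mask):
    """Mean squared error computed only over the masked patches."""
    per_patch = ((pred - target) ** 2).mean(axis=1)
    return float(per_patch[mask].mean())

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))              # toy image (assumed size)
patches = patchify(image, patch=8)           # 16 patches of 8*8*3 values
mask = random_mask(len(patches), 0.75, rng)  # hide 12 of the 16 patches

# Stand-in for a trained model (assumption): predict the mean visible patch.
pred = np.tile(patches[~mask].mean(axis=0), (len(patches), 1))
loss = masked_mse(pred, patches, mask)
```

In an actual MIM setup, a neural network would replace the mean-patch predictor and be trained to minimize this loss, so that it learns to infer hidden content from visible context.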
Papers
An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training
Jin Gao, Shubo Lin, Shaoru Wang, Yutong Kou, Zeming Li, Liang Li, Congxuan Zhang, Xiaoqin Zhang, Yizheng Wang, Weiming Hu
How to Benchmark Vision Foundation Models for Semantic Segmentation?
Tommie Kerssies, Daan de Geus, Gijs Dubbelman