Image Modeling
Image modeling aims to learn efficient representations of images that support tasks such as image generation, recognition, and manipulation. Current research focuses on self-supervised learning, particularly masked image modeling (MIM), which trains a model to reconstruct the parts of an image that have been hidden from it, and on making these models more interpretable and robust through methods such as generalized integrated gradients. These advances matter because they make computer vision systems more efficient and effective across applications and clarify how the models work.
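To make the masked image modeling objective concrete, the following is a minimal NumPy sketch, not the method of any paper listed here. It splits an image into patches, hides a random subset, and scores a prediction only on the hidden patches. The function names, the 75% mask ratio, the raw-pixel reconstruction target, and the stand-in predictor are all assumptions chosen for illustration; actual MIM methods use a trained encoder and may regress other targets, such as features from a pretrained model.

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image size must be divisible by patch size"
    gh, gw = h // patch, w // patch
    x = image.reshape(gh, patch, gw, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)            # (gh, gw, patch, patch, c)
    return x.reshape(gh * gw, patch * patch * c)

def random_mask(num_patches, mask_ratio, rng):
    """Return a boolean vector where True marks a patch hidden from the model."""
    num_masked = int(round(num_patches * mask_ratio))
    mask = np.zeros(num_patches, dtype=bool)
    mask[rng.choice(num_patches, size=num_masked, replace=False)] = True
    return mask

def masked_reconstruction_loss(pred, target, mask):
    """Mean squared error averaged over the masked patches only."""
    per_patch = ((pred - target) ** 2).mean(axis=1)
    return per_patch[mask].mean()

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))               # hypothetical input image
patches = patchify(image, patch=8)            # 16 patches of 8*8*3 values
mask = random_mask(len(patches), mask_ratio=0.75, rng=rng)

# Stand-in for a trained model (assumption): predict the mean visible patch
# for every position. A real MIM model would encode the visible patches and
# decode a prediction for each masked one.
pred = np.broadcast_to(patches[~mask].mean(axis=0), patches.shape)
loss = masked_reconstruction_loss(pred, patches, mask)
```

Computing the loss only over masked patches is what forces the model to infer hidden content from visible context rather than copy its input.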
Papers
CAE v2: Context Autoencoder with CLIP Target
Xinyu Zhang, Jiahui Chen, Junkun Yuan, Qiang Chen, Jian Wang, Xiaodi Wang, Shumin Han, Xiaokang Chen, Jimin Pi, Kun Yao, Junyu Han, Errui Ding, Jingdong Wang
GLAMI-1M: A Multilingual Image-Text Fashion Dataset
Vaclav Kosar, Antonín Hoskovec, Milan Šulc, Radek Bartyzal