Natural Image
Natural images, meaning photographs and other visual data captured from the real world, are a central focus of computer vision research, which aims to enable machines to understand and interact with visual information as humans do. Current research emphasizes robust models, often built on architectures such as Vision Transformers and diffusion models, for tasks such as object detection, segmentation, and scene understanding in complex, diverse imagery. This work supports applications ranging from medical image analysis and autonomous navigation to image generation and quality assessment, with the broader goal of narrowing the gap between human and machine perception.
Papers
Facing the Elephant in the Room: Visual Prompt Tuning or Full Finetuning?
Cheng Han, Qifan Wang, Yiming Cui, Wenguan Wang, Lifu Huang, Siyuan Qi, Dongfang Liu
Small Language Model Meets with Reinforced Vision Vocabulary
Haoran Wei, Lingyu Kong, Jinyue Chen, Liang Zhao, Zheng Ge, En Yu, Jianjian Sun, Chunrui Han, Xiangyu Zhang