Large Vision Model
Large vision models (LVMs) are deep learning systems that process and interpret visual information across diverse computer vision tasks, with human-level performance as the long-term goal. Current research focuses on making LVMs more efficient through techniques such as progressive learning and parameter-efficient fine-tuning, and on applying them in domains including autonomous driving, medical image analysis, and agriculture, often building on architectures such as Vision Transformers and diffusion models. LVMs enable progress on tasks that require complex visual reasoning, and their in-context learning and zero-shot capabilities reduce the need for extensive labeled data.
Papers
HyperDet: Generalizable Detection of Synthesized Images by Generating and Merging A Mixture of Hyper LoRAs
Huangsen Cao, Yongwei Wang, Yinfeng Liu, Sixian Zheng, Kangtao Lv, Zhimeng Zhang, Bo Zhang, Xin Ding, Fei Wu
Hyper Adversarial Tuning for Boosting Adversarial Robustness of Pretrained Large Vision Models
Kangtao Lv, Huangsen Cao, Kainan Tu, Yihuai Xu, Zhimeng Zhang, Xin Ding, Yongwei Wang