Visual In-Context Learning
Visual in-context learning (VICL) aims to enable computer vision models to perform diverse tasks from only a few example images and associated textual descriptions, without extensive retraining. Current research focuses on improving efficiency and accuracy through prompt selection algorithms, multimodal model architectures (e.g., transformers and vision-language models), and new methods for fusing visual and textual information. Because the task is specified by a handful of examples rather than learned from a large training set, the approach could reduce the need for large labeled datasets in computer vision and speed progress in applications such as image restoration, segmentation, and captioning.
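As a rough illustration of what a prompt selection algorithm can look like, the sketch below ranks a pool of candidate in-context examples by cosine similarity to a query image's feature vector and keeps the top k. This is one common similarity-based heuristic, not the method of any particular paper listed here. The names `cosine_similarity` and `select_prompts`, the example identifiers, and the toy three-dimensional embeddings are all hypothetical; in practice the vectors would come from some image encoder, which is assumed and not shown.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_prompts(
    query: Sequence[float],
    pool: Sequence[Tuple[str, Sequence[float]]],
    k: int = 1,
) -> List[Tuple[str, float]]:
    """Return the k pool entries most similar to the query, best first.

    `pool` holds (example_id, embedding) pairs. The embeddings are assumed
    to be precomputed by an image encoder (hypothetical, not shown here).
    """
    scored = [(cosine_similarity(query, emb), name) for name, emb in pool]
    # Sort by score only, highest first; ties keep their original pool order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(name, score) for score, name in scored[:k]]


if __name__ == "__main__":
    # Toy embeddings standing in for real image features.
    pool = [
        ("example_a", [1.0, 0.0, 0.0]),
        ("example_b", [0.0, 1.0, 0.0]),
        ("example_c", [0.9, 0.1, 0.0]),
    ]
    query = [1.0, 0.0, 0.0]
    for name, score in select_prompts(query, pool, k=2):
        print(name, round(score, 3))
```

The selected examples would then be placed alongside the query as the visual prompt. Whether similarity in feature space is the right selection criterion depends on the task and model, so treat this as a baseline to compare against rather than a recommendation.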
Papers
January 12, 2024
December 11, 2023
November 7, 2023
October 23, 2023
September 28, 2023
May 26, 2023
May 24, 2023
April 10, 2023
January 31, 2023
January 20, 2023
December 5, 2022
April 14, 2022