Visual In-Context Learning
Visual in-context learning (VICL) aims to let computer vision models perform diverse tasks from only a few example images and associated textual descriptions, without extensive retraining. Current research focuses on improving efficiency and accuracy through prompt selection algorithms, multimodal architectures (e.g., transformers and vision-language models), and new methods for fusing visual and textual information. By reducing the need for large labeled datasets, the approach could accelerate progress in applications such as image restoration, segmentation, and captioning.
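As a rough illustration of what a prompt selection step can look like, the sketch below ranks candidate example images by cosine similarity to the query image and keeps the top k as in-context prompts. It is only one simple strategy, not the method of any particular paper: the names `cosine_similarity` and `select_prompts` are hypothetical, and the feature vectors are assumed to have been computed beforehand by some image encoder.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_prompts(
    query: Sequence[float],
    candidates: List[Sequence[float]],
    k: int = 1,
) -> List[int]:
    """Return indices of the k candidates most similar to the query.

    Assumption: `query` and each entry of `candidates` are precomputed
    image feature vectors; the encoder that produces them is out of scope.
    """
    ranked = sorted(
        range(len(candidates)),
        key=lambda i: cosine_similarity(query, candidates[i]),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    # Toy 2-D features standing in for real image embeddings.
    query_feature = [1.0, 0.0]
    candidate_features = [[0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]]
    print(select_prompts(query_feature, candidate_features, k=2))
```

The selected indices would then pick the example image pairs placed alongside the query in the model's input. Learned or task-aware retrieval could replace the similarity function without changing the surrounding logic.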