Text Supervision
Text supervision leverages textual information, such as descriptions or reports, to guide the training of computer vision models, particularly in scenarios where labeled image data is limited or expensive to obtain. Current research focuses on integrating text supervision into vision-language models (VLMs) like CLIP, employing techniques such as prompt learning, knowledge distillation, and contrastive learning to improve model performance on tasks like image classification, segmentation, and object detection. This approach offers a cost-effective way to improve model accuracy and generalization, and is especially valuable in domains like medical imaging and open-vocabulary tasks where labeled data is scarce.
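To make the contrastive-learning idea concrete, here is a minimal sketch of a CLIP-style symmetric contrastive (InfoNCE) loss, written in plain NumPy. The function name, temperature value, and embedding shapes are illustrative assumptions, not the exact formulation of any particular paper: given a batch of N matched image/text embedding pairs, the matched pairs on the diagonal of the similarity matrix are treated as positives and all other pairings as negatives.

```python
import numpy as np

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """CLIP-style symmetric contrastive loss (illustrative sketch).

    image_emb, text_emb: (N, D) arrays; row i of each is a matched pair.
    temperature: scales the logits; 0.07 is a commonly used starting value.
    """
    # L2-normalize so the dot product becomes cosine similarity
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    logits = img @ txt.T / temperature   # (N, N) similarity matrix
    labels = np.arange(len(logits))      # matched pairs lie on the diagonal

    def cross_entropy(l, y):
        l = l - l.max(axis=1, keepdims=True)  # subtract row max for stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(y)), y].mean()

    # average the image->text and text->image directions
    return 0.5 * (cross_entropy(logits, labels) + cross_entropy(logits.T, labels))
```

In practice the embeddings would come from trainable image and text encoders and the loss would be minimized by gradient descent; the sketch only shows the objective that pulls matched image/text pairs together while pushing mismatched pairs apart.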