Visual Embeddings
Visual embeddings represent images and videos as numerical vectors that capture their semantic content for downstream tasks such as image classification, video retrieval, and visual question answering. Current research focuses on improving the quality and robustness of these embeddings, often leveraging large language models (LLMs) together with techniques such as prompt learning, contrastive learning, and multi-modal fusion to better align visual and textual information. Effective visual embeddings are a prerequisite for AI systems that must understand and reason about visual data, with impact across computer vision and natural language processing.
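To make the contrastive-alignment idea concrete, below is a minimal sketch of a CLIP-style symmetric InfoNCE objective that pulls paired image and text embeddings together while pushing mismatched pairs apart. The embedding dimension, batch size, and temperature value are illustrative assumptions, not taken from any specific paper listed here; a real pipeline would produce the embeddings with trained vision and text encoders.

```python
# Minimal sketch of CLIP-style contrastive alignment between image and text
# embeddings. Shapes and the temperature are illustrative assumptions.
import torch
import torch.nn.functional as F


def contrastive_loss(image_emb: torch.Tensor,
                     text_emb: torch.Tensor,
                     temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings.

    image_emb, text_emb: (batch, dim) outputs of a visual and a text encoder.
    """
    # L2-normalize so the dot product becomes a cosine similarity.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # (batch, batch) similarity matrix; diagonal entries are the matched pairs.
    logits = image_emb @ text_emb.t() / temperature
    targets = torch.arange(logits.size(0), device=logits.device)

    # Cross-entropy in both directions: image-to-text and text-to-image.
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2


if __name__ == "__main__":
    # Random stand-in embeddings; in practice these come from encoders.
    imgs = torch.randn(8, 512)
    txts = torch.randn(8, 512)
    print(contrastive_loss(imgs, txts).item())
```

At inference time the same normalized embeddings can be reused directly: ranking a gallery of image embeddings by cosine similarity to a text embedding gives a simple retrieval baseline.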