Video Text Retrieval
Video text retrieval (VTR) aims to find the videos that best match a given text query, bridging the semantic gap between visual and textual data. Current research builds heavily on pre-trained vision-language models such as CLIP, improving efficiency through techniques like prompt tuning and adapter modules, and improving accuracy via multi-scale feature learning, refined cross-modal alignment strategies (e.g., one-to-many alignment), and data-centric approaches such as query expansion. VTR underpins applications like video search and recommendation, and ongoing research continues to improve both the speed and accuracy of these systems.
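The common CLIP-based baseline encodes sampled video frames with the image tower, pools them into a single video embedding, encodes the query with the text tower, and ranks videos by cosine similarity. Below is a minimal sketch of that pipeline, assuming frames are already decoded into PIL images; the model checkpoint and mean-pooling choice are illustrative, not a specific paper's method.

```python
# Minimal CLIP-based video-text retrieval sketch (assumptions: frames are
# pre-decoded PIL images; mean pooling over frames; checkpoint is illustrative).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
model.eval()


def encode_video(frames: list[Image.Image]) -> torch.Tensor:
    """Encode sampled frames with the CLIP image tower and mean-pool them."""
    inputs = processor(images=frames, return_tensors="pt")
    with torch.no_grad():
        frame_feats = model.get_image_features(**inputs)   # (num_frames, dim)
    video_feat = frame_feats.mean(dim=0, keepdim=True)     # (1, dim)
    return video_feat / video_feat.norm(dim=-1, keepdim=True)


def encode_text(queries: list[str]) -> torch.Tensor:
    """Encode text queries with the CLIP text tower."""
    inputs = processor(text=queries, return_tensors="pt",
                       padding=True, truncation=True)
    with torch.no_grad():
        text_feats = model.get_text_features(**inputs)     # (num_queries, dim)
    return text_feats / text_feats.norm(dim=-1, keepdim=True)


def rank_videos(query_feats: torch.Tensor, video_feats: torch.Tensor) -> torch.Tensor:
    """Return video indices sorted by cosine similarity for each query."""
    sims = query_feats @ video_feats.T                     # (num_queries, num_videos)
    return sims.argsort(dim=-1, descending=True)
```

Much of the work surveyed above keeps this two-tower structure and varies where the extra capacity goes: prompt tuning and adapters add a small number of trainable parameters around the frozen encoders, while multi-scale and one-to-many alignment methods replace the simple mean pooling with richer cross-modal matching.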