Zero-Shot Video
Zero-shot video recognition aims to classify videos into categories never seen during training by leveraging pre-trained vision-language models (VLMs) and multimodal data. Current research focuses on improving accuracy by incorporating temporal information effectively, developing novel CLIP-based architectures, and employing techniques such as interpolated weight optimization and cross-modal attention to better align visual and textual representations. These advances hold promise for applications that require robust video understanding with limited labeled data, such as environmental monitoring and automated content analysis.
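To make the pipeline above concrete, here is a minimal sketch of zero-shot video classification with CLIP, assuming Hugging Face `transformers` and pre-extracted PIL frames. The prompt template, the frame sampling, and the mean-pooling temporal aggregation are illustrative assumptions; published methods typically replace the pooling step with learned temporal modules or cross-modal attention.

```python
# Zero-shot video classification sketch: encode sampled frames with CLIP,
# pool them into one video embedding, and pick the closest class prompt.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def classify_video(frames: list[Image.Image], class_names: list[str]) -> str:
    # Hypothetical prompt template; wording matters in practice.
    prompts = [f"a video of {c}" for c in class_names]
    inputs = processor(text=prompts, images=frames,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        frame_emb = model.get_image_features(pixel_values=inputs["pixel_values"])
        text_emb = model.get_text_features(input_ids=inputs["input_ids"],
                                           attention_mask=inputs["attention_mask"])
    # Normalize per-frame embeddings, then mean-pool over time. This is the
    # simplest temporal aggregation; it ignores frame order entirely.
    frame_emb = frame_emb / frame_emb.norm(dim=-1, keepdim=True)
    video_emb = frame_emb.mean(dim=0)
    video_emb = video_emb / video_emb.norm()
    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)
    scores = text_emb @ video_emb          # cosine similarity per class
    return class_names[scores.argmax().item()]

# Usage with hypothetical frame files and unseen class names:
# frames = [Image.open(f"frame_{i}.jpg") for i in range(8)]
# print(classify_video(frames, ["archery", "juggling", "surfing"]))
```

Interpolated weight optimization, used by methods such as Open-VCLIP (and, for images, WiSE-FT), can be illustrated even more compactly: after fine-tuning CLIP on video data, the fine-tuned weights are linearly blended with the original zero-shot weights so the model retains its open-vocabulary generalization. The mixing coefficient `alpha` below is a hypothetical hyperparameter chosen on held-out data.

```python
def interpolate_weights(zero_shot_state: dict, fine_tuned_state: dict,
                        alpha: float = 0.5) -> dict:
    # Linear blend of two state dicts with identical keys and shapes:
    # alpha = 0 recovers the zero-shot model, alpha = 1 the fine-tuned one.
    return {k: (1 - alpha) * zero_shot_state[k] + alpha * fine_tuned_state[k]
            for k in zero_shot_state}
```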