Cross Video
Cross-video research uses information from multiple videos, rather than from a single clip, to improve video understanding tasks. Current work concentrates on models that capture temporal and semantic information both within and between videos, often using transformer-based architectures and self-supervised learning to strengthen representation learning and cross-modal alignment. By addressing the limitations of single-video analysis, these methods improve performance in applications such as video retrieval, question answering, and action localization.