Temporal Video Grounding
Temporal video grounding (TVG) is the task of locating the time segment in an untrimmed video that corresponds to a given textual description. Current research aims to improve both accuracy and efficiency, using techniques such as multi-modal learning (integrating vision and language), spiking neural networks for efficient saliency detection, and pre-trained language models for better query understanding. These advances matter for applications that require fine-grained video understanding, such as video summarization, content retrieval, and human-computer interaction.
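To make the task concrete, the sketch below represents a grounding result as a (start, end) pair in seconds and scores it against a reference segment with temporal intersection-over-union, a measure commonly used to evaluate TVG predictions. The names `Segment` and `temporal_iou` and the example timestamps are illustrative assumptions, not taken from any particular paper or library.

```python
from typing import Tuple

# A segment is (start_sec, end_sec) within the untrimmed video.
# This type alias is a hypothetical convention for the example.
Segment = Tuple[float, float]


def temporal_iou(pred: Segment, gt: Segment) -> float:
    """Overlap between a predicted and a reference segment, in [0, 1]."""
    # Length of the overlapping interval, clamped at zero when disjoint.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    # Union length = sum of both lengths minus the overlap.
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


# Made-up example: the query is annotated at 15-25 s, the model predicts 10-20 s.
score = temporal_iou((10.0, 20.0), (15.0, 25.0))  # 5 s overlap / 15 s union
```

A prediction is often counted as correct when this score exceeds a chosen threshold such as 0.5; the threshold is a property of the evaluation protocol, not of the function above.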