Temporal Sentence Grounding
Temporal sentence grounding (TSG) is the task of locating the moment in an untrimmed video that corresponds to a given sentence. Current research aims to improve accuracy and efficiency, particularly in weakly supervised settings that rely on glance annotations or limited training data. It explores model architectures such as graph memory networks, diffusion models, and attention mechanisms to better integrate visual and semantic information for more precise localization. Advances in TSG matter for video understanding more broadly, supporting applications such as video retrieval, summarization, and question answering.
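To make "locating the moment" concrete, the minimal sketch below treats a grounding result as a (start, end) interval in seconds and scores it against a reference interval with temporal intersection-over-union, an overlap measure commonly used for this task. The summary above does not specify any metric, so the choice of temporal IoU, the `Moment` type, and the `temporal_iou` function are illustrative assumptions, not part of any particular TSG method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Moment:
    """A video segment in seconds (hypothetical helper type for illustration)."""
    start: float
    end: float


def temporal_iou(pred: Moment, gt: Moment) -> float:
    """Overlap between a predicted and a reference moment, in [0, 1]."""
    # Length of the overlapping part; zero when the segments are disjoint.
    inter = max(0.0, min(pred.end, gt.end) - max(pred.start, gt.start))
    # Union of the two segments = sum of lengths minus the overlap.
    union = (pred.end - pred.start) + (gt.end - gt.start) - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # Assumed example: a query sentence is grounded to 10s-20s,
    # while the annotated moment is 15s-25s.
    predicted = Moment(start=10.0, end=20.0)
    reference = Moment(start=15.0, end=25.0)
    print(f"temporal IoU = {temporal_iou(predicted, reference):.3f}")
```

A prediction is typically counted as correct when this overlap exceeds a chosen threshold; the threshold itself is a design choice that varies between evaluations.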