Dense Video Captioning
Dense video captioning aims to automatically generate detailed, temporally localized descriptions of the events in untrimmed videos. Current research focuses on improving the accuracy and efficiency of caption generation, with particular attention to online (real-time) captioning and to adapting large language models (LLMs) and pre-trained vision-language models to video data. The task advances video understanding at the intersection of computer vision and natural language processing, with applications in accessibility, video summarization, and automated content analysis.
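To make "temporally localized descriptions" concrete, here is a minimal sketch of how the output of a dense captioning system might be represented: a list of events, each with a start time, an end time, and a caption. The names `CaptionedEvent` and `temporal_iou` are hypothetical and chosen for illustration only, and the sample captions and timestamps are made up. Temporal intersection-over-union is shown because it is a common way to compare a predicted segment against a reference segment, not because any specific paper on this page is known to use it.

```python
from dataclasses import dataclass


@dataclass
class CaptionedEvent:
    """One temporally localized description (hypothetical structure)."""
    start: float   # segment start, in seconds
    end: float     # segment end, in seconds
    caption: str   # natural-language description of the event


def temporal_iou(a: CaptionedEvent, b: CaptionedEvent) -> float:
    """Intersection-over-union of two time segments, in [0, 1]."""
    intersection = max(0.0, min(a.end, b.end) - max(a.start, b.start))
    union = (a.end - a.start) + (b.end - b.start) - intersection
    return intersection / union if union > 0 else 0.0


# Illustrative data only: events in one untrimmed video may overlap in time.
predicted = [
    CaptionedEvent(0.0, 10.0, "A person chops vegetables."),
    CaptionedEvent(8.0, 20.0, "The vegetables are added to a pan."),
]
reference = CaptionedEvent(5.0, 15.0, "Someone cuts vegetables on a board.")

for event in predicted:
    print(f"[{event.start:.1f}s-{event.end:.1f}s] {event.caption} "
          f"(tIoU with reference: {temporal_iou(event, reference):.2f})")
```

Unlike captioning a single trimmed clip, a representation like this has to cover both where each event happens and what is said about it, which is why localization and caption quality are usually considered together.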