Video Dataset
Video datasets are crucial for training and evaluating computer vision models that understand video content, across tasks such as action recognition, object tracking, and quality assessment. Current research emphasizes building benchmarks from varied video sources (e.g., natural scenes, AI-generated content), incorporating multimodal information (text, audio), and covering challenging scenarios such as unusual activity localization and camouflaged object segmentation. These advances are driving progress in video understanding, with applications ranging from improved surveillance systems and e-commerce experiences to more sophisticated content moderation and conservation efforts.
Papers
Slovo: Russian Sign Language Dataset
Alexander Kapitanov, Karina Kvanchiani, Alexander Nagaev, Elizaveta Petrova
BigVideo: A Large-scale Video Subtitle Translation Dataset for Multimodal Machine Translation
Liyan Kang, Luyang Huang, Ningxin Peng, Peihao Zhu, Zewei Sun, Shanbo Cheng, Mingxuan Wang, Degen Huang, Jinsong Su