Video Annotation
Video annotation is the process of labeling video data to build high-quality training datasets for video understanding tasks. Current research focuses on improving annotation efficiency through techniques such as dimensionality reduction, active learning, and large language models (LLMs) that generate or refine annotations, often within semi-supervised or weakly supervised approaches that reduce the need for extensive manual labeling. These advances enable more accurate and robust video analysis models, with applications ranging from medical diagnosis to autonomous driving.
Papers
Probabilistic Vision-Language Representation for Weakly Supervised Temporal Action Localization
Geuntaek Lim, Hyunwoo Kim, Joonsoo Kim, Yukyung Choi
Weakly Supervised Video Anomaly Detection and Localization with Spatio-Temporal Prompts
Peng Wu, Xuerong Zhou, Guansong Pang, Zhiwei Yang, Qingsen Yan, Peng Wang, Yanning Zhang