Video Level
Video-level analysis interprets video content using only video-level labels, avoiding costly and time-consuming frame-by-frame annotation. Current research emphasizes weakly-supervised learning, using transformer-based architectures, graph convolutional networks, and attention mechanisms to improve accuracy on tasks such as action localization, anomaly detection, and event parsing. Because one label per video is much cheaper to collect than per-frame annotations, the approach makes video analysis systems more scalable across applications including surveillance, content moderation, and healthcare. Advances in these weakly-supervised methods also contribute to broader progress in computer vision.
Papers
On the Importance of Sign Labeling: The Hamburg Sign Language Notation System Case Study
Maria Ferlin, Sylwia Majchrowska, Marta Plantykow, Alicja Kwaśniewska, Agnieszka Mikołajczyk-Bareła, Milena Olech, Jakub Nalepa
Human-Scene Network: A Novel Baseline with Self-rectifying Loss for Weakly supervised Video Anomaly Detection
Snehashis Majhi, Rui Dai, Quan Kong, Lorenzo Garattoni, Gianpiero Francesca, Francois Bremond