Action Localization
Action localization in videos aims to identify both the class and the temporal extent (start and end time) of each action within untrimmed video sequences. Current research emphasizes methods that remain robust when a video contains multiple actions, when data is noisy, and when annotations are limited. Common approaches include transformer-based architectures, multimodal models that combine visual and textual information, and self-supervised or weakly supervised learning, all aimed at improving accuracy and efficiency. The field supports applications ranging from video understanding and content analysis to robotics and assistive technologies, and it drives advances in both model design and dataset creation.
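To make "class and temporal extent" concrete, the following is a minimal sketch of how a localized action might be represented and compared against a ground-truth annotation. It is illustrative only and is not taken from any of the papers listed here: the `ActionSegment` type, the `pick_up` label, the helper names, and the 0.5 overlap threshold are all assumptions chosen for the example. The overlap measure used is temporal intersection-over-union, a commonly used way to score how well two time intervals agree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionSegment:
    """One localized action: a class label plus its extent in seconds (hypothetical type)."""
    label: str
    start: float
    end: float


def temporal_iou(a: ActionSegment, b: ActionSegment) -> float:
    """Intersection-over-union of two time intervals; 0.0 if they do not overlap."""
    intersection = max(0.0, min(a.end, b.end) - max(a.start, b.start))
    union = (a.end - a.start) + (b.end - b.start) - intersection
    return intersection / union if union > 0 else 0.0


def is_match(pred: ActionSegment, truth: ActionSegment, threshold: float = 0.5) -> bool:
    """A prediction counts as a match when the class agrees and the overlap is high enough.

    The default threshold of 0.5 is an assumption for this sketch.
    """
    return pred.label == truth.label and temporal_iou(pred, truth) >= threshold


# Illustrative values only: an annotated action and a slightly shifted prediction.
truth = ActionSegment(label="pick_up", start=2.0, end=6.0)
pred = ActionSegment(label="pick_up", start=3.0, end=7.0)

overlap = temporal_iou(pred, truth)  # 3 s shared out of 5 s covered in total
matched = is_match(pred, truth)
```

Under these assumptions, a prediction with the right class but a poorly aligned interval is rejected, as is a well-aligned interval with the wrong class, which mirrors the two-part goal of the task.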
Papers
IMUVIE: Pickup Timeline Action Localization via Motion Movies
John Clapham, Kenneth Koltermann, Yanfu Zhang, Yuming Sun, Evie N Burnet, Gang Zhou
Rethinking Top Probability from Multi-view for Distracted Driver Behaviour Localization
Quang Vinh Nguyen, Vo Hoang Thanh Son, Chau Truong Vinh Hoang, Duc Duy Nguyen, Nhat Huy Nguyen Minh, Soo-Hyung Kim