Action Recognition
Action recognition is the task of automatically identifying actions in video data, with the goal of building robust and efficient systems for understanding human and animal behavior. Current research focuses on improving accuracy and efficiency across diverse scenarios using model architectures such as transformers, convolutional neural networks, and recurrent neural networks, often combined with multimodal data (RGB, depth, skeleton, audio) and self-supervised learning. The field supports applications including autonomous systems, healthcare monitoring, and video surveillance, and ongoing work addresses challenges such as domain generalization, few-shot learning, and adversarial robustness.
Papers
MARINE: A Computer Vision Model for Detecting Rare Predator-Prey Interactions in Animal Videos
Zsófia Katona, Seyed Sahand Mohammadi Ziabari, Fatemeh Karimi Nejadasl
Harnessing Temporal Causality for Advanced Temporal Action Detection
Shuming Liu, Lin Sui, Chen-Lin Zhang, Fangzhou Mu, Chen Zhao, Bernard Ghanem
Can VLMs be used on videos for action recognition? LLMs are Visual Reasoning Coordinators
Harsh Lunia
Decoupled Prompt-Adapter Tuning for Continual Activity Recognition
Di Fu, Thanh Vinh Vo, Haozhe Ma, Tze-Yun Leong
A Comprehensive Review of Few-shot Action Recognition
Yuyang Wanyan, Xiaoshan Yang, Weiming Dong, Changsheng Xu
Pose-guided multi-task video transformer for driver action recognition
Ricardo Pizarro, Roberto Valle, Luis Miguel Bergasa, José M. Buenaposada, Luis Baumela
SA-DVAE: Improving Zero-Shot Skeleton-Based Action Recognition by Disentangled Variational Autoencoders
Sheng-Wei Li, Zi-Xiang Wei, Wei-Jie Chen, Yi-Hsin Yu, Chih-Yuan Yang, Jane Yung-jen Hsu
QuIIL at T3 challenge: Towards Automation in Life-Saving Intervention Procedures from First-Person View
Trinh T. L. Vuong, Doanh C. Bui, Jin Tae Kwak