Action Recognition
Action recognition, the task of automatically identifying actions within video data, aims to develop robust and efficient systems for understanding human and animal behavior. Current research focuses on improving accuracy and efficiency across diverse scenarios, employing various model architectures such as transformers, convolutional neural networks, and recurrent neural networks, often incorporating multimodal data (RGB, depth, skeleton, audio) and self-supervised learning techniques. This field is crucial for numerous applications, including autonomous systems, healthcare monitoring, and video surveillance, with ongoing efforts to address challenges like domain generalization, few-shot learning, and adversarial robustness.
Papers
CoRec: An Easy Approach for Coordination Recognition
Qing Wang, Haojie Jia, Wenfei Song, Qi Li
Action Recognition in Video Recordings from Gynecologic Laparoscopy
Sahar Nasirihaghighi, Negin Ghamsarian, Daniela Stefanics, Klaus Schoeffmann, Heinrich Husslein
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives
Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zach Chavis, Joya Chen, Feng Cheng, Fu-Jen Chu, Sean Crane, Avijit Dasgupta, Jing Dong, Maria Escobar, Cristhian Forigua, Abrham Gebreselasie, Sanjay Haresh, Jing Huang, Md Mohaiminul Islam, Suyog Jain, Rawal Khirodkar, Devansh Kukreja, Kevin J Liang, Jia-Wei Liu, Sagnik Majumder, Yongsen Mao, Miguel Martin, Effrosyni Mavroudi, Tushar Nagarajan, Francesco Ragusa, Santhosh Kumar Ramakrishnan, Luigi Seminara, Arjun Somayazulu, Yale Song, Shan Su, Zihui Xue, Edward Zhang, Jinxu Zhang, Angela Castillo, Changan Chen, Xinzhu Fu, Ryosuke Furuta, Cristina Gonzalez, Prince Gupta, Jiabo Hu, Yifei Huang, Yiming Huang, Weslie Khoo, Anush Kumar, Robert Kuo, Sach Lakhavani, Miao Liu, Mi Luo, Zhengyi Luo, Brighid Meredith, Austin Miller, Oluwatumininu Oguntola, Xiaqing Pan, Penny Peng, Shraman Pramanick, Merey Ramazanova, Fiona Ryan, Wei Shan, Kiran Somasundaram, Chenan Song, Audrey Southerland, Masatoshi Tateno, Huiyu Wang, Yuchen Wang, Takuma Yagi, Mingfei Yan, Xitong Yang, Zecheng Yu, Shengxin Cindy Zha, Chen Zhao, Ziwei Zhao, Zhifan Zhu, Jeff Zhuo, Pablo Arbelaez, Gedas Bertasius, David Crandall, Dima Damen, Jakob Engel, Giovanni Maria Farinella, Antonino Furnari, Bernard Ghanem, Judy Hoffman, C. V. Jawahar, Richard Newcombe, Hyun Soo Park, James M. Rehg, Yoichi Sato, Manolis Savva, Jianbo Shi, Mike Zheng Shou, Michael Wray
GeoDeformer: Geometric Deformable Transformer for Action Recognition
Jinhui Ye, Jiaming Zhou, Hui Xiong, Junwei Liang
Action-slot: Visual Action-centric Representations for Multi-label Atomic Activity Recognition in Traffic Scenes
Chi-Hsi Kung, Shu-Wei Lu, Yi-Hsuan Tsai, Yi-Ting Chen
LEAP: LLM-Generation of Egocentric Action Programs
Eadom Dessalene, Michael Maynord, Cornelia Fermüller, Yiannis Aloimonos
Object-based (yet Class-agnostic) Video Domain Adaptation
Dantong Niu, Amir Bar, Roei Herzig, Trevor Darrell, Anna Rohrbach
GLAD: Global-Local View Alignment and Background Debiasing for Unsupervised Video Domain Adaptation with Large Domain Gap
Hyogun Lee, Kyungho Bae, Seong Jong Ha, Yumin Ko, Gyeong-Moon Park, Jinwoo Choi
Modality Mixer Exploiting Complementary Information for Multi-modal Action Recognition
Sumin Lee, Sangmin Woo, Muhammad Adi Nugroho, Changick Kim
Challenges in Video-Based Infant Action Recognition: A Critical Examination of the State of the Art
Elaheh Hatamimajoumerd, Pooria Daneshvar Kakhaki, Xiaofei Huang, Lingfei Luan, Somaieh Amraee, Sarah Ostadabbas
Learning Human Action Recognition Representations Without Real Humans
Howard Zhong, Samarth Mishra, Donghyun Kim, SouYoung Jin, Rameswar Panda, Hilde Kuehne, Leonid Karlinsky, Venkatesh Saligrama, Aude Oliva, Rogerio Feris
Semantic-aware Video Representation for Few-shot Action Recognition
Yutao Tang, Benjamin Bejar, Rene Vidal