Group Activity Recognition
Group activity recognition (GAR) aims to automatically classify the collective activity of multiple people in a video by modeling their interactions and spatiotemporal relationships. Current research relies heavily on deep learning, particularly transformer-based architectures and graph convolutional networks (GCNs), and often combines several input modalities, such as RGB video, skeletal data, and textual descriptions of activities, to improve accuracy. Applications include sports analytics, video surveillance, and human-computer interaction. Ongoing work targets robustness, efficiency, and learning from weakly supervised or unreliable data.
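To make the relational modeling idea concrete, the following is a minimal sketch in plain Python, not any specific published method. It assumes each person has already been reduced to a small feature vector, applies one round of attention-style message passing between people (a stand-in for a GCN or transformer layer), max-pools over people, and scores the group with a linear classifier. All names, feature values, and class weights (`relation_layer`, `pool_group`, `classify`, `actors`, `class_weights`) are hypothetical and chosen only for illustration.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def relation_layer(actors):
    """One round of message passing: each actor's new feature is an
    attention-weighted average of all actors' features (itself included)."""
    updated = []
    for a in actors:
        weights = softmax([dot(a, b) for b in actors])
        updated.append(
            [sum(w * b[d] for w, b in zip(weights, actors)) for d in range(len(a))]
        )
    return updated

def pool_group(actors):
    """Max-pool over actors to get one group descriptor that does not
    depend on the order in which people are listed."""
    return [max(column) for column in zip(*actors)]

def classify(group_feature, class_weights):
    """Score each activity with a linear layer; return (label, probabilities)."""
    labels = list(class_weights)
    probs = softmax([dot(group_feature, class_weights[l]) for l in labels])
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], dict(zip(labels, probs))

# Hypothetical 2-D per-person features and hand-set class weights (assumptions).
actors = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2]]
class_weights = {"walking": [1.0, -1.0], "talking": [-1.0, 1.0]}
label, probs = classify(pool_group(relation_layer(actors)), class_weights)
```

In a real system the per-person features would come from a learned backbone over RGB frames or skeletons, and the relation layer and classifier weights would be trained rather than set by hand.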