Paper ID: 2301.06082

A Survey on Human Action Recognition

Zhou Shuchang

Human Action Recognition (HAR), one of the most important tasks in computer vision, has developed rapidly over the past decade and has a wide range of applications in health monitoring, intelligent surveillance, virtual reality, human-computer interaction, and so on. Human actions can be captured by a wide variety of sensing modalities, such as RGB-D cameras, audio, and inertial sensors. Consequently, in addition to the mainstream single-modality HAR approaches, a growing body of research is devoted to multimodal methods, which exploit the complementary properties of different modalities. In this paper, we survey recent HAR methods, organized by input modality. Because most recent HAR surveys focus on the third-person perspective, and this survey aims to give a more comprehensive introduction for both HAR novices and researchers, we also review recent action recognition methods from the first-person perspective. Finally, we briefly introduce the benchmark HAR datasets and compare the performance of different methods on them.

Submitted: Dec 20, 2022