Paper ID: 2111.03106

Skeleton-Split Framework using Spatial Temporal Graph Convolutional Networks for Action Recognition

Motasem Alsawadi, Miguel Rio

The volume of videos and related content uploaded to the internet has increased dramatically. Accordingly, the need for efficient algorithms to analyse this vast amount of data has attracted significant research interest. Action recognition systems based on human body motion have been shown to interpret video content accurately. This work aims to recognize activities of daily living using the ST-GCN model and compares four partitioning strategies: spatial configuration partitioning, full distance split, connection split, and index split. To this end, we present the first implementation of the ST-GCN framework on the HMDB-51 dataset. We achieve 48.88% top-1 accuracy using the connection split partitioning approach. Through experimental simulation, we show that our proposals achieve higher accuracy on the UCF-101 dataset with the ST-GCN framework than the state-of-the-art approach. Finally, a top-1 accuracy of 73.25% is achieved using the index split partitioning strategy.
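To make the idea of a partitioning strategy concrete, the following is a minimal sketch of spatial configuration partitioning, the baseline strategy from the original ST-GCN model. It labels each joint in a root joint's one-hop neighbourhood as the root itself, centripetal (closer to the body center), or centrifugal (farther from it). The five-joint skeleton, the choice of center joint, and the use of hop distance as a proxy for distance to the center of gravity are all assumptions made for illustration. The three split strategies proposed in this work are not defined in the abstract and are not sketched here.

```python
from collections import deque

# Hypothetical toy skeleton (not from the paper):
# 0 = torso (assumed center), 1 = neck, 2 = head, 3 = shoulder, 4 = hand
EDGES = [(0, 1), (1, 2), (1, 3), (3, 4)]
NUM_JOINTS = 5
CENTER = 0


def build_adjacency(num_joints, edges):
    """Return an undirected adjacency list for the skeleton graph."""
    adj = {j: [] for j in range(num_joints)}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    return adj


def hop_distances(adj, source):
    """Breadth-first search: number of bones between `source` and each joint."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def spatial_configuration_partition(num_joints, edges, center):
    """Label every (root, neighbour) pair in each one-hop neighbourhood.

    0 = same distance to the center as the root (includes the root itself)
    1 = centripetal: neighbour is closer to the center than the root
    2 = centrifugal: neighbour is farther from the center than the root
    """
    adj = build_adjacency(num_joints, edges)
    dist = hop_distances(adj, center)
    labels = {}
    for root in range(num_joints):
        for nb in [root] + adj[root]:
            if dist[nb] == dist[root]:
                labels[(root, nb)] = 0
            elif dist[nb] < dist[root]:
                labels[(root, nb)] = 1
            else:
                labels[(root, nb)] = 2
    return labels


labels = spatial_configuration_partition(NUM_JOINTS, EDGES, CENTER)
```

In a graph convolution, each label would select a separate weight matrix, so joints in different subsets contribute to the root joint's feature through different learned transforms.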

Submitted: Nov 4, 2021