Skeleton-Based Action Segmentation

Skeleton-based action segmentation aims to identify and temporally locate the actions in a video sequence using only skeletal data (body-joint positions) extracted from the video. Current research focuses on improving accuracy and efficiency with model architectures such as graph convolutional networks (GCNs) and temporal convolutional networks (TCNs), often combined with multi-stage processing and with techniques such as latent action composition or multi-modality fusion with RGB video. These advances matter for human activity recognition and analysis, where they offer improved robustness to noisy data and a more nuanced understanding of complex human movements.
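To make "identify and locate" concrete, the sketch below shows one common way to represent a segmentation result: a model assigns a label to every frame, and consecutive frames with the same label are collapsed into segments. This is only an illustration and is not taken from any particular paper; the function name `frames_to_segments`, the action labels, and the 10-frame sequence are all hypothetical.

```python
from itertools import groupby


def frames_to_segments(frame_labels):
    """Collapse per-frame labels into (label, start, end) segments.

    `start` is inclusive and `end` is exclusive, both in frame indices.
    """
    segments = []
    start = 0
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)  # number of consecutive frames with this label
        segments.append((label, start, start + length))
        start += length
    return segments


# Hypothetical per-frame predictions for a 10-frame skeleton sequence.
predictions = ["walk"] * 4 + ["sit"] * 3 + ["wave"] * 3
segments = frames_to_segments(predictions)
for label, start, end in segments:
    print(f"{label}: frames {start} to {end - 1}")
```

Each tuple names an action and the frame range it covers, which is the kind of output a GCN- or TCN-based segmentation model would ultimately be evaluated on.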

Papers