Action Segmentation
Action segmentation aims to automatically divide videos into temporally contiguous segments, each corresponding to a distinct action. Current research relies heavily on transformer-based architectures, often combined with efficient attention and feature encoding to improve accuracy and reduce computational cost, particularly for long videos. The task matters for applications ranging from video understanding and human-robot interaction to automated analysis of animal behavior and surgical procedures, and it drives advances in both algorithmic efficiency and the development of new evaluation datasets.
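To make the task definition concrete, here is a minimal sketch of what a segmentation output looks like, assuming a model has already produced one action label per frame. The function name `frames_to_segments` and the example labels are hypothetical and not taken from any of the papers listed here; the sketch only shows how per-frame predictions collapse into temporally contiguous segments.

```python
from itertools import groupby


def frames_to_segments(frame_labels):
    """Collapse per-frame labels into (label, start, end) segments.

    `start` is inclusive and `end` is exclusive, both in frame indices.
    """
    segments = []
    start = 0
    # groupby yields one group per run of identical consecutive labels.
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)
        segments.append((label, start, start + length))
        start += length
    return segments


# Hypothetical per-frame predictions for a six-frame clip.
frame_labels = ["pour", "pour", "pour", "stir", "stir", "pour"]
for label, start, end in frames_to_segments(frame_labels):
    print(f"{label}: frames {start} to {end - 1}")
```

Note that the same action can appear in more than one segment, because segments are defined by contiguity in time rather than by label alone.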
Papers
Action Segmentation Using 2D Skeleton Heatmaps and Multi-Modality Fusion
Syed Waleed Hyder, Muhammad Usama, Anas Zafar, Muhammad Naufil, Fawad Javed Fateh, Andrey Konin, M. Zeeshan Zia, Quoc-Huy Tran
OTAS: Unsupervised Boundary Detection for Object-Centric Temporal Action Segmentation
Yuerong Li, Zhengrong Xue, Huazhe Xu