Egocentric Action
Egocentric action research focuses on understanding human actions from a first-person perspective, aiming to automatically interpret activities of daily living captured by wearable cameras. Current efforts concentrate on improving action recognition accuracy through techniques such as transformer-based architectures, 2D and 3D hand pose estimation, and the generation of synthetic egocentric action frames via visual instruction tuning. The field matters for advancing human-computer interaction, enabling applications such as assistive technologies and more intuitive interfaces, and for providing novel representations of egocentric video that facilitate long-form understanding.
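As a concrete reference point for the transformer-based direction mentioned above, the sketch below shows a minimal clip-level action classifier in PyTorch: per-frame features pass through a transformer encoder, and a learned classification token is mapped to action logits. This is an illustrative sketch under stated assumptions, not the architecture of any listed paper; the class name EgoActionTransformer, the 768-dimensional frame features, and the number of action classes are all assumptions for the example.

```python
import torch
import torch.nn as nn

class EgoActionTransformer(nn.Module):
    """Illustrative clip-level action classifier (not from the listed papers).

    Takes a sequence of per-frame features (e.g. from a frozen image
    backbone) and classifies the whole clip via a learned [CLS] token.
    """
    def __init__(self, feat_dim=768, num_actions=100, depth=4, heads=8):
        super().__init__()
        # Learned classification token, prepended to the frame sequence.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, feat_dim))
        layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(feat_dim, num_actions)

    def forward(self, frame_feats):
        # frame_feats: (batch, num_frames, feat_dim)
        b = frame_feats.size(0)
        cls = self.cls_token.expand(b, -1, -1)
        x = torch.cat([cls, frame_feats], dim=1)   # prepend [CLS]
        x = self.encoder(x)                        # temporal self-attention
        return self.head(x[:, 0])                  # classify from [CLS]

# Toy usage: 2 clips, 16 frames each, 768-d features per frame.
logits = EgoActionTransformer()(torch.randn(2, 16, 768))
```

In practice the frame features would come from a pretrained visual backbone; the temporal encoder then aggregates them into a single clip-level prediction.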
Papers
LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction Tuning
Bolin Lai, Xiaoliang Dai, Lawrence Chen, Guan Pang, James M. Rehg, Miao Liu
Action Scene Graphs for Long-Form Understanding of Egocentric Videos
Ivan Rodin, Antonino Furnari, Kyle Min, Subarna Tripathi, Giovanni Maria Farinella
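To make the action scene graph idea from the second paper more concrete, the toy data structure below sketches one plausible schema: per-timestep graphs whose nodes are the camera wearer, the current verb, and the involved objects, linked by role-labeled edges. The field and relation names here are assumptions for illustration, not the paper's exact annotation schema.

```python
from dataclasses import dataclass, field

@dataclass
class SceneGraphNode:
    label: str       # e.g. a verb ("take") or an object name ("knife")
    node_type: str   # "camera_wearer" | "verb" | "object" (assumed types)

@dataclass
class ActionSceneGraph:
    """One graph per annotated timestep of an egocentric video.

    Edges are (source index, relation, target index) triples; a long-form
    video becomes a temporal sequence of these graphs.
    """
    timestamp: float
    nodes: list[SceneGraphNode] = field(default_factory=list)
    edges: list[tuple[int, str, int]] = field(default_factory=list)

# Toy example: "camera wearer takes knife with right hand" at t = 12.4 s.
g = ActionSceneGraph(timestamp=12.4)
g.nodes += [SceneGraphNode("camera_wearer", "camera_wearer"),
            SceneGraphNode("take", "verb"),
            SceneGraphNode("knife", "object"),
            SceneGraphNode("right hand", "object")]
g.edges += [(0, "performs", 1),
            (1, "direct_object", 2),
            (1, "with", 3)]
```

Representing a video as a sequence of such graphs, rather than raw frames, is what makes the long-form reasoning discussed above tractable: downstream models can operate on compact symbolic structure instead of hours of pixels.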