Egocentric Video Understanding
Egocentric video understanding aims to computationally interpret video recorded from a first-person perspective, modeling how the camera wearer perceives actions, interactions, and environments. Current research focuses heavily on building robust multimodal models, typically transformer-based, that fuse signals from several modalities (e.g., RGB, depth, audio, IMU) to improve accuracy and efficiency on tasks such as action recognition, question answering, and scene understanding. These advances matter for assistive robotics, human-computer interaction, and artificial intelligence more broadly, enabling more natural and intuitive interaction between humans and machines.
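To make the fusion idea concrete, the sketch below shows one common pattern for a multimodal transformer classifier in PyTorch: project each modality's feature tokens into a shared embedding space, concatenate them, and let a joint encoder attend across modalities. Everything here (the class name, dimensions, and the per-modality linear projections plus a learnable [CLS] token) is an illustrative assumption, not the design of any specific paper.

import torch
import torch.nn as nn

class MultimodalEgocentricClassifier(nn.Module):
    """Minimal sketch: fuse per-modality feature tokens with a shared
    Transformer encoder and classify the action. All names and
    dimensions are illustrative, not taken from any specific paper."""

    def __init__(self, modality_dims, d_model=256, num_classes=10,
                 num_layers=2, num_heads=4):
        super().__init__()
        # One linear projection per modality (e.g. RGB, depth, audio, IMU)
        # maps that modality's features into the shared d_model token space.
        self.proj = nn.ModuleDict({
            name: nn.Linear(dim, d_model)
            for name, dim in modality_dims.items()
        })
        # Learnable [CLS] token whose final state summarizes the clip.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, d_model))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer,
                                             num_layers=num_layers)
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, features):
        # features: dict of modality name -> (batch, seq_len, feat_dim)
        tokens = [self.proj[name](x) for name, x in features.items()]
        tokens = torch.cat(tokens, dim=1)                      # (B, total_seq, d_model)
        cls = self.cls_token.expand(tokens.size(0), -1, -1)    # (B, 1, d_model)
        fused = self.encoder(torch.cat([cls, tokens], dim=1))  # joint cross-modal attention
        return self.head(fused[:, 0])                          # action logits from [CLS]

# Toy usage: 8 RGB tokens, 4 audio tokens, 16 IMU tokens per clip.
model = MultimodalEgocentricClassifier(
    {"rgb": 512, "audio": 128, "imu": 6}, num_classes=20)
batch = {
    "rgb": torch.randn(2, 8, 512),
    "audio": torch.randn(2, 4, 128),
    "imu": torch.randn(2, 16, 6),
}
logits = model(batch)  # shape: (2, 20)

Fusing at the token level like this lets attention relate, for instance, an IMU motion spike to the video frames around it; real systems would typically replace the toy linear projections with pretrained per-modality backbones.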