Ego4D Dataset
The Ego4D dataset is a large-scale collection of egocentric (first-person) videos designed to advance research in understanding human activity and interaction. Current research built on Ego4D focuses on robust 3D scene reconstruction, multimodal fusion of video, audio, and IMU signals, and accurate pose estimation and tracking of humans and objects in complex, dynamic scenes, often using transformer-based architectures and novel training strategies. By providing a benchmark for evaluating algorithms in challenging real-world scenarios, the dataset has become a significant resource for computer vision research, with applications ranging from robotics and augmented reality to human-computer interaction.
Papers
Egocentric Video-Language Pretraining @ Ego4D Challenge 2022
Kevin Qinghong Lin, Alex Jinpeng Wang, Mattia Soldan, Michael Wray, Rui Yan, Eric Zhongcong Xu, Difei Gao, Rongcheng Tu, Wenzhe Zhao, Weijie Kong, Chengfei Cai, Hongfa Wang, Dima Damen, Bernard Ghanem, Wei Liu, Mike Zheng Shou
Egocentric Video-Language Pretraining @ EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022
Kevin Qinghong Lin, Alex Jinpeng Wang, Rui Yan, Eric Zhongcong Xu, Rongcheng Tu, Yanru Zhu, Wenzhe Zhao, Weijie Kong, Chengfei Cai, Hongfa Wang, Wei Liu, Mike Zheng Shou