Egocentric Perception

Egocentric perception focuses on understanding the world from a first-person perspective, primarily using data from wearable cameras and other sensors. Current research emphasizes developing robust models for tasks like action recognition, 3D scene reconstruction, and trajectory prediction, often employing transformer architectures and contrastive learning methods to integrate multimodal data (video, audio, IMU, etc.). This field is crucial for advancing applications in augmented reality, assistive technologies for the visually impaired, and human-robot interaction, as it enables more natural and intuitive interactions between humans and machines. The development of large-scale, richly annotated datasets is driving progress in this rapidly evolving area.
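
To make the multimodal integration idea concrete, below is a minimal sketch of CLIP-style contrastive alignment between two egocentric modalities (video clips and IMU windows). All names, dimensions, and the small MLP encoders are illustrative assumptions; production systems typically use pretrained transformer encoders and far larger batches.

```python
# Hypothetical sketch: contrastive (InfoNCE) alignment of video and IMU
# embeddings in a shared space, as commonly done for multimodal egocentric
# data. Encoders and feature sizes are placeholders, not a specific method.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModalityEncoder(nn.Module):
    """Maps per-modality features into a shared embedding space."""

    def __init__(self, in_dim: int, embed_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(), nn.Linear(512, embed_dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # L2-normalize so cosine similarity reduces to a dot product.
        return F.normalize(self.net(x), dim=-1)


def info_nce(video_emb: torch.Tensor, imu_emb: torch.Tensor,
             temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss: the (video, IMU) pair recorded at the same
    moment is the positive; every other pair in the batch is a negative."""
    logits = video_emb @ imu_emb.t() / temperature          # (B, B) similarities
    targets = torch.arange(logits.size(0), device=logits.device)
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))


if __name__ == "__main__":
    batch = 32
    video_feats = torch.randn(batch, 768)   # e.g. pooled video-clip features
    imu_feats = torch.randn(batch, 128)     # e.g. pooled accelerometer/gyro window

    video_enc = ModalityEncoder(768)
    imu_enc = ModalityEncoder(128)

    loss = info_nce(video_enc(video_feats), imu_enc(imu_feats))
    print(f"contrastive loss: {loss.item():.4f}")
```

Training with such an objective pulls temporally aligned video and sensor embeddings together, which is one common way the papers below fuse wearable-sensor streams with first-person video.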

Papers