Egocentric Vision

Egocentric vision focuses on understanding the world from a first-person perspective, primarily using data from head-mounted cameras. Current research emphasizes robust models for tasks like action recognition, object interaction prediction, and 3D pose estimation, often built on deep learning architectures such as recurrent neural networks and vision-language models. These advances are driving progress in human-robot interaction, assistive technologies (e.g., stroke rehabilitation), and augmented/virtual reality by enabling more natural and intuitive interfaces. The creation of large-scale, richly annotated egocentric datasets is also a key focus, facilitating the development and benchmarking of new algorithms.
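
As a concrete illustration of one of the tasks above, the sketch below shows a minimal egocentric action recognizer that aggregates per-frame CNN features from a head-mounted camera clip with a recurrent network. It is not taken from any of the papers in this collection; the class name, dimensions, and choice of a ResNet-18 backbone with an LSTM are illustrative assumptions.

```python
# Minimal sketch (illustrative, not from the source): per-frame features from a
# head-mounted camera clip are aggregated with an LSTM and classified into actions.
import torch
import torch.nn as nn
import torchvision.models as models


class EgocentricActionRecognizer(nn.Module):  # hypothetical example class
    def __init__(self, num_actions: int, hidden_dim: int = 512):
        super().__init__()
        # Frozen ImageNet backbone used as a per-frame feature extractor.
        backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        self.feature_dim = backbone.fc.in_features  # 512 for ResNet-18
        backbone.fc = nn.Identity()                 # drop the classification head
        self.backbone = backbone.eval()
        for p in self.backbone.parameters():
            p.requires_grad = False
        # Recurrent aggregation over the frame sequence.
        self.lstm = nn.LSTM(self.feature_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_actions)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, time, 3, H, W) frames from the head-mounted camera
        b, t, c, h, w = clip.shape
        with torch.no_grad():
            feats = self.backbone(clip.reshape(b * t, c, h, w))  # (b*t, feature_dim)
        feats = feats.reshape(b, t, self.feature_dim)
        _, (h_n, _) = self.lstm(feats)          # final hidden state per clip
        return self.classifier(h_n[-1])         # (batch, num_actions) action logits


if __name__ == "__main__":
    model = EgocentricActionRecognizer(num_actions=10)
    dummy_clip = torch.randn(2, 8, 3, 224, 224)  # 2 clips of 8 RGB frames
    print(model(dummy_clip).shape)               # torch.Size([2, 10])
```

In practice, published systems typically replace the frozen image backbone with video transformers or vision-language encoders and train end to end on large egocentric datasets; this sketch only illustrates the basic frame-feature-then-temporal-aggregation structure.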

Papers