Active Object

Active object research studies how to understand and model objects that change dynamically as a result of interaction, primarily in egocentric (first-person) vision. Current work draws on multiple data modalities (visual, audio, textual) and uses architectures such as transformers and diffusion models to improve object detection, tracking, and the anticipation of future interactions, often adding knowledge graphs or large language models to supply context. These advances support human-robot interaction, scene understanding in robotics and augmented reality, and AI agents that can carry out complex tasks.

Papers