Next Active Object

Next Active Object (NAO) prediction aims to anticipate which object a person will interact with next in egocentric video, a key step in understanding human-object interactions. Current research employs transformer-based architectures, often incorporating multi-modal data and guided attention mechanisms, to improve accuracy in predicting the object's identity, its future location, and the timing of the interaction. This research matters for computer vision and robotics, with applications ranging from human-robot interaction to context-aware assistive technologies and virtual and augmented reality experiences.

Papers