Scene Graph

Scene graphs are structured representations of images and videos, depicting objects and their relationships, aiming to improve machine understanding of visual scenes. Current research focuses on enhancing scene graph generation using various techniques, including transformer-based models, graph neural networks, and the integration of large language models to improve accuracy and handle open-vocabulary objects and relationships. This work is significant for advancing computer vision, enabling improved applications in robotics (navigation, manipulation), autonomous driving, medical image analysis, and more generally, improving the ability of machines to understand and interact with complex visual environments.

Papers