Scene Graph
Scene graphs are structured representations of images and videos that depict objects and the relationships between them, with the aim of improving machine understanding of visual scenes. Current research focuses on improving scene graph generation with transformer-based models, graph neural networks, and large language models, both to raise accuracy and to handle open-vocabulary objects and relationships. This work supports applications in robotics (navigation, manipulation), autonomous driving, and medical image analysis, and more generally improves the ability of machines to understand and interact with complex visual environments.
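To make the "objects and their relationships" structure concrete, here is a minimal sketch of a scene graph as a plain data structure. It is illustrative only and is not taken from any of the papers listed below: the names `SceneGraph`, `Relation`, and `build_example`, and the example objects and predicates, are all assumptions chosen for the illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relation:
    """A directed edge: subject --predicate--> object (hypothetical schema)."""
    subject: str    # id of the source object
    predicate: str  # relationship label, e.g. "on"
    obj: str        # id of the target object


@dataclass
class SceneGraph:
    """Objects are nodes (id -> category label); relations are labeled edges."""
    objects: dict[str, str] = field(default_factory=dict)
    relations: list[Relation] = field(default_factory=list)

    def add_object(self, obj_id: str, label: str) -> None:
        self.objects[obj_id] = label

    def add_relation(self, subject: str, predicate: str, obj: str) -> None:
        # Reject edges that mention objects not present in the graph.
        for obj_id in (subject, obj):
            if obj_id not in self.objects:
                raise KeyError(f"unknown object id: {obj_id}")
        self.relations.append(Relation(subject, predicate, obj))

    def triples(self) -> list[tuple[str, str, str]]:
        # Resolve ids to labels, giving (subject, predicate, object) triples.
        return [
            (self.objects[r.subject], r.predicate, self.objects[r.obj])
            for r in self.relations
        ]


def build_example() -> SceneGraph:
    # A made-up scene: a cup on a table, with a person next to the table.
    graph = SceneGraph()
    graph.add_object("o1", "cup")
    graph.add_object("o2", "table")
    graph.add_object("o3", "person")
    graph.add_relation("o1", "on", "o2")
    graph.add_relation("o3", "next to", "o2")
    return graph
```

Calling `build_example().triples()` yields the scene as subject-predicate-object triples, a flat form that is convenient for comparing or serializing graphs. Real systems typically attach more to each node, such as bounding boxes, attributes, or 3D poses; those are omitted here to keep the sketch small.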
Papers
An Actionable Hierarchical Scene Representation Enhancing Autonomous Inspection Missions in Unknown Environments
Vignesh Kottayam Viswanathan, Mario Alberto Valdes Saucedo, Sumeet Gajanan Satpute, Christoforos Kanellakis, George Nikolakopoulos
xFLIE: Leveraging Actionable Hierarchical Scene Representations for Autonomous Semantic-Aware Inspection Missions
Vignesh Kottayam Viswanathan, Mario A.V. Saucedo, Sumeet Gajanan Satpute, Christoforos Kanellakis, George Nikolakopoulos
Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning
Fan Lu, Wei Wu, Kecheng Zheng, Shuailei Ma, Biao Gong, Jiawei Liu, Wei Zhai, Yang Cao, Yujun Shen, Zheng-Jun Zha
LAION-SG: An Enhanced Large-Scale Dataset for Training Complex Image-Text Models with Structural Annotations
Zejian Li, Chenye Meng, Yize Li, Ling Yang, Shengyuan Zhang, Jiarui Ma, Jiayi Li, Guang Yang, Changyuan Yang, Zhiyuan Yang, Jinxiong Chang, Lingyun Sun
Generate Any Scene: Evaluating and Improving Text-to-Vision Generation with Scene Graph Programming
Ziqi Gao, Weikai Huang, Jieyu Zhang, Aniruddha Kembhavi, Ranjay Krishna