Visual Relation
Visual relation understanding in computer vision aims to enable machines to comprehend the relationships between objects in images and videos, mirroring human visual perception. Current research focuses on improving the accuracy and efficiency of visual relation detection and generation with deep learning architectures such as transformers, graph neural networks, and diffusion models, often combined with techniques like active perception and knowledge graphs. The field is important for advancing artificial intelligence, with applications ranging from scene understanding and image captioning to more complex tasks such as robotic manipulation and medical image analysis.
Papers
RelationBooth: Towards Relation-Aware Customized Object Generation
Qingyu Shi, Lu Qi, Jianzong Wu, Jinbin Bai, Jingbo Wang, Yunhai Tong, Xiangtai Li, Ming-Hsuan Yang
CLIPErase: Efficient Unlearning of Visual-Textual Associations in CLIP
Tianyu Yang, Lisen Dai, Zheyuan Liu, Xiangqi Wang, Meng Jiang, Yapeng Tian, Xiangliang Zhang