Visual Relationship
Visual relationship detection aims to understand the interactions between objects in images and videos, going beyond object recognition to capture the semantic relationships among them. Current research relies heavily on transformer-based architectures, often incorporating vision-language models such as CLIP to improve open-vocabulary capability and to handle complex spatial and temporal relationships. The field is central to robotics, scene understanding, and other applications that require nuanced interpretation of visual data, with recent work focusing on improving efficiency, accuracy, and generalization across diverse datasets.
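To make the open-vocabulary idea concrete, a common pattern is to embed a detected subject-object region and a set of candidate relationship phrases into a shared space, then rank the phrases by cosine similarity. The sketch below illustrates this with small toy vectors standing in for CLIP image and text embeddings; the vector values, predicate phrases, and the `rank_predicates` helper are illustrative assumptions, not from any specific paper.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_predicates(region_emb, predicate_embs):
    """Score each candidate relationship phrase against a region embedding
    and return (phrase, score) pairs sorted by descending similarity."""
    scores = {p: cosine(region_emb, e) for p, e in predicate_embs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy embeddings standing in for CLIP outputs (hypothetical values).
region = np.array([1.0, 0.0, 0.2])
predicates = {
    "person riding horse":  np.array([0.9, 0.1, 0.3]),
    "person next to horse": np.array([0.2, 0.9, 0.1]),
    "person feeding horse": np.array([0.1, 0.2, 0.9]),
}
ranking = rank_predicates(region, predicates)
print(ranking[0][0])  # highest-scoring relationship phrase
```

Because the predicate set is just a list of text strings, new relationship types can be added at inference time without retraining, which is what gives CLIP-style approaches their open-vocabulary flexibility.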
Papers
(Listed papers dated from May 2, 2022 through September 19, 2024; titles and links were not preserved in this extract.)