Object-Level Representation
Object-level representation in computer vision describes a scene not as a pixel grid but as a collection of individual objects, each with its own features, with the aim of making AI systems more robust and interpretable. Current research focuses on models that learn these representations effectively, often using transformer architectures, variational autoencoders, and contrastive learning, with particular emphasis on handling objects at varying scales and on combining visual and textual information. This work supports applications such as multi-object tracking, scene synthesis, and robotic manipulation, where more accurate and generalizable perception and reasoning are needed.
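To make the contrast between a pixel grid and a collection of objects concrete, here is a minimal sketch in plain Python. It assumes a segmentation mask is already available (in practice this would come from a learned model), and the `ObjectRepr` class, the `to_object_level` function, and the mean-intensity "feature" are hypothetical names and stand-ins chosen for illustration, not the method of any particular paper.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ObjectRepr:
    """One entry in an object-level scene description (hypothetical structure)."""
    object_id: int
    bbox: Tuple[int, int, int, int]  # (row_min, col_min, row_max, col_max), inclusive
    area: int                        # number of pixels the object covers
    mean_intensity: float            # toy stand-in for a learned feature vector


def to_object_level(image, mask):
    """Group pixels by mask id (0 = background) and summarize each group."""
    stats = {}
    for r, (img_row, mask_row) in enumerate(zip(image, mask)):
        for c, (value, obj_id) in enumerate(zip(img_row, mask_row)):
            if obj_id == 0:
                continue  # background pixels belong to no object
            s = stats.setdefault(obj_id, {"rows": [], "cols": [], "vals": []})
            s["rows"].append(r)
            s["cols"].append(c)
            s["vals"].append(value)
    return [
        ObjectRepr(
            object_id=obj_id,
            bbox=(min(s["rows"]), min(s["cols"]), max(s["rows"]), max(s["cols"])),
            area=len(s["vals"]),
            mean_intensity=sum(s["vals"]) / len(s["vals"]),
        )
        for obj_id, s in sorted(stats.items())
    ]


# A tiny grayscale "image" and an assumed instance mask with two objects.
image = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 9, 9, 0, 4],
    [0, 0, 0, 0, 4],
]
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 2],
    [0, 0, 0, 0, 2],
]

scene = to_object_level(image, mask)
for obj in scene:
    print(obj)
```

The 20-pixel grid is reduced to two records, one per object, each carrying its own location, size, and feature. Learned approaches replace the hand-written summary with feature vectors produced by a model, but downstream tasks such as tracking or manipulation consume the same kind of per-object structure.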