Object-Centric Representation
Object-centric representation learning models scenes as compositions of individual objects and their relationships, with the goal of building more robust and generalizable AI systems. Current research focuses on unsupervised methods, often combining transformer networks, slot attention mechanisms, and generative models (such as NeRF-based decoders) to learn these representations from images, videos, and point clouds. By moving beyond pixel-level processing toward a compositional, more human-like understanding of the world, this approach promises gains in tasks such as robotics, visual question answering, and scene prediction. The resulting disentangled representations also improve interpretability and support zero-shot generalization across diverse domains.
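Since slot attention underpins much of this line of work, below is a minimal PyTorch sketch of its iterative update (after Locatello et al., 2020). The module name, dimensions, and hyperparameters are illustrative assumptions for exposition, not code from any of the papers listed here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SlotAttention(nn.Module):
    """Minimal slot-attention module (sketch after Locatello et al., 2020).

    Maps a set of N input features to K object slots via iterative,
    competitive attention: the softmax is taken over the slot axis, so
    slots compete to explain each input feature.
    """
    def __init__(self, num_slots: int, dim: int, iters: int = 3):
        super().__init__()
        self.num_slots, self.iters, self.scale = num_slots, iters, dim ** -0.5
        # Learned Gaussian from which the initial slots are sampled.
        self.slots_mu = nn.Parameter(torch.randn(1, 1, dim))
        self.slots_log_sigma = nn.Parameter(torch.zeros(1, 1, dim))
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k = nn.Linear(dim, dim, bias=False)
        self.to_v = nn.Linear(dim, dim, bias=False)
        self.gru = nn.GRUCell(dim, dim)
        self.mlp = nn.Sequential(nn.Linear(dim, dim * 2), nn.ReLU(),
                                 nn.Linear(dim * 2, dim))
        self.norm_in = nn.LayerNorm(dim)
        self.norm_slots = nn.LayerNorm(dim)
        self.norm_mlp = nn.LayerNorm(dim)

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        # inputs: (batch, num_inputs, dim), e.g. flattened CNN feature maps.
        b, n, d = inputs.shape
        inputs = self.norm_in(inputs)
        k, v = self.to_k(inputs), self.to_v(inputs)
        slots = self.slots_mu + self.slots_log_sigma.exp() * torch.randn(
            b, self.num_slots, d, device=inputs.device)
        for _ in range(self.iters):
            slots_prev = slots
            q = self.to_q(self.norm_slots(slots))
            # Softmax over the slot axis: inputs are divided among slots.
            attn = F.softmax(
                torch.einsum('bkd,bnd->bkn', q, k) * self.scale, dim=1)
            # Normalize over inputs, giving a weighted mean per slot.
            attn = attn / attn.sum(dim=-1, keepdim=True)
            updates = torch.einsum('bkn,bnd->bkd', attn, v)
            # GRU refines each slot from its aggregated update.
            slots = self.gru(updates.reshape(-1, d),
                             slots_prev.reshape(-1, d)).view(b, -1, d)
            slots = slots + self.mlp(self.norm_mlp(slots))
        return slots  # (batch, num_slots, dim)

# Usage: bin 64 feature vectors of dim 128 into 5 object slots.
feats = torch.randn(2, 64, 128)
slots = SlotAttention(num_slots=5, dim=128)(feats)
print(slots.shape)  # torch.Size([2, 5, 128])
```

The softmax-over-slots direction is the key design choice: it induces competition between slots, which is what encourages each slot to bind to a distinct object rather than all slots averaging over the whole scene.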
Papers
Parallelized Spatiotemporal Binding
Gautam Singh, Yue Wang, Jiawei Yang, Boris Ivanovic, Sungjin Ahn, Marco Pavone, Tong Che
DreamUp3D: Object-Centric Generative Models for Single-View 3D Scene Understanding and Real-to-Sim Transfer
Yizhe Wu, Haitz Sáez de Ocáriz Borde, Jack Collins, Oiwi Parker Jones, Ingmar Posner