Scene Composition

Scene composition research focuses on generating and manipulating realistic visual scenes, often composed of multiple objects, with a primary objective of achieving greater control and realism in synthetic image and video generation. Current efforts leverage various deep learning architectures, including diffusion models, vision transformers, and neural radiance fields, often incorporating techniques like disentanglement, object-centric representations, and language guidance for improved scene control and editing. This work has significant implications for fields like computer graphics, animation, and robotics, enabling more efficient content creation, improved simulation environments, and advanced robotic manipulation capabilities.

Papers