Paper ID: 2403.14368
Enabling Visual Composition and Animation in Unsupervised Video Generation
Aram Davtyan, Sepehr Sameni, Björn Ommer, Paolo Favaro
In this work, we propose a novel method for unsupervised controllable video generation. Once trained on a dataset of unannotated videos, at inference time our model is capable of both composing scenes from predefined object parts and animating them in a plausible and controlled way. This is achieved by conditioning video generation on a randomly selected subset of local pre-trained self-supervised features during training. We call our model CAGE, for visual Composition and Animation for video GEneration. We conduct a series of experiments to demonstrate the capabilities of CAGE in various settings. Project website: https://araachie.github.io/cage.
Submitted: Mar 21, 2024