Text-to-Video Generation
Text-to-video generation aims to create videos from textual descriptions, bridging the gap between human language and visual media. Current research relies heavily on diffusion models, often built around 3D U-Nets or transformer architectures, and focuses on improving video quality, temporal consistency, controllability (including camera movement and object manipulation), and compositionality, i.e., the ability to synthesize videos with multiple interacting elements. These advances have significant implications for film production, animation, and virtual reality, because they automate video creation and enable more precise control over generated content.
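Since the overview above centers on diffusion-based video models, a toy sketch may help make the mechanics concrete. The PyTorch code below is a minimal, hypothetical illustration, not any specific paper's method: Tiny3DDenoiser stands in for a text-conditioned 3D U-Net (its Conv3d layers mix information across space and time), and sample_video runs a DDPM-style reverse loop over a latent video tensor shaped (batch, channels, frames, height, width). All class names, shapes, and hyperparameters here are invented for illustration.

```python
# Illustrative sketch only: a DDPM-style reverse loop over a video tensor,
# denoised by a toy stand-in for a text-conditioned 3D U-Net.
import torch
import torch.nn as nn

class Tiny3DDenoiser(nn.Module):
    """Placeholder for a text-conditioned 3D U-Net: Conv3d layers mix
    information across space *and* time, the ingredient behind the
    temporal consistency of video diffusion models."""
    def __init__(self, channels=4, text_dim=32):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, channels)
        self.net = nn.Sequential(
            nn.Conv3d(channels, 64, kernel_size=3, padding=1),
            nn.SiLU(),
            nn.Conv3d(64, channels, kernel_size=3, padding=1),
        )

    def forward(self, x, t, text_emb):
        # Inject the text condition as a per-channel bias; real models use
        # cross-attention, and also embed the timestep t (unused here).
        cond = self.text_proj(text_emb)[:, :, None, None, None]
        return self.net(x + cond)

@torch.no_grad()
def sample_video(model, text_emb, steps=50, shape=(1, 4, 16, 32, 32)):
    """DDPM-style ancestral sampling with a linear beta schedule."""
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    x = torch.randn(shape)  # start from pure noise over all frames at once
    for t in reversed(range(steps)):
        eps = model(x, t, text_emb)  # predicted noise
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    return x  # a (latent) video: all frames share one denoising trajectory

model = Tiny3DDenoiser()
text_emb = torch.randn(1, 32)  # stand-in for a frozen text-encoder output
video = sample_video(model, text_emb)
print(video.shape)  # torch.Size([1, 4, 16, 32, 32])
```

Denoising all frames jointly through spatiotemporal (3D) convolutions or attention, rather than generating each frame independently, is what distinguishes these models from image diffusion and is the main source of their temporal consistency.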