Text-to-Video Generation
Text-to-video generation aims to create videos from textual descriptions, bridging the gap between human language and visual media. Current research relies heavily on diffusion models, often built around 3D U-Nets or transformer architectures, and focuses on improving video quality, temporal consistency, controllability (including camera movement and object manipulation), and compositional capability, that is, the ability to synthesize videos with multiple interacting elements. These advances have significant implications for fields such as film production, animation, and virtual reality, where they automate video creation and enable more precise control over generated content.
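Most of these systems share the same skeleton: a text encoder produces a prompt embedding, and a denoising network (a 3D U-Net or spatiotemporal transformer) iteratively refines a noisy video latent conditioned on that embedding. The sketch below illustrates that loop in plain PyTorch. It is a schematic, not any published model's API: the names SimpleDenoiser and sample are placeholders, a single Conv3d stands in for the full spatiotemporal network, and the Euler-style update stands in for a proper diffusion scheduler.

```python
import torch
import torch.nn as nn


class SimpleDenoiser(nn.Module):
    """Stand-in for a 3D U-Net / spatiotemporal transformer: predicts the
    noise in a video latent given a timestep and a text embedding.
    (Real models also embed the timestep; it is ignored here for brevity.)"""

    def __init__(self, channels=4, text_dim=768):
        super().__init__()
        # One 3D convolution replaces the stacks of spatial and temporal
        # attention/convolution blocks used by real architectures.
        self.net = nn.Conv3d(channels, channels, kernel_size=3, padding=1)
        self.text_proj = nn.Linear(text_dim, channels)

    def forward(self, x, t, text_emb):
        # Broadcast the text conditioning over frames and spatial positions.
        cond = self.text_proj(text_emb)[:, :, None, None, None]
        return self.net(x + cond)


@torch.no_grad()
def sample(denoiser, text_emb, num_frames=16, size=32, steps=50):
    # Start from Gaussian noise over the whole clip, shaped
    # (batch, channels, frames, height, width), and refine it iteratively.
    # Every step sees all frames jointly, which is what lets these models
    # enforce temporal consistency.
    x = torch.randn(1, 4, num_frames, size, size)
    for t in torch.linspace(1.0, 0.0, steps):
        eps = denoiser(x, t, text_emb)  # predicted noise
        x = x - eps / steps             # toy Euler-style update, not a real scheduler step
    return x  # in practice, a separate VAE decoder maps this latent to RGB frames


video_latent = sample(SimpleDenoiser(), torch.randn(1, 768))
print(video_latent.shape)  # torch.Size([1, 4, 16, 32, 32])
```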