Video Generation
Video generation research focuses on creating realistic, controllable videos from inputs such as text, images, or other videos. Current work centers on improving model architectures, notably diffusion models and diffusion transformers, to raise video quality, temporal consistency, and controllability, often using techniques such as vector quantization for efficiency. The field underpins multimedia applications including content creation, simulation, and autonomous driving by providing tools to generate high-quality, diverse, and easily controlled video data. Ongoing research also addresses the limitations of existing evaluation metrics so that automated assessments align better with human perception.
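Most of the diffusion-based methods above share the same core loop: noise is gradually added to clean data in a forward process, and a learned network reverses it step by step. The minimal sketch below illustrates that loop on a toy "video" tensor with a standard DDPM-style update; the linear beta schedule and the use of the true noise as a stand-in for a trained noise predictor are illustrative assumptions, not any listed paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

T = 100                                 # number of diffusion steps (assumed)
betas = np.linspace(1e-4, 0.02, T)      # linear noise schedule (assumed)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)         # cumulative product \bar{alpha}_t

def add_noise(x0, t, eps):
    """Forward process: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

def denoise_step(x_t, t, eps_pred):
    """One reverse step; eps_pred plays the role of the trained noise net."""
    coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
    mean = (x_t - coef * eps_pred) / np.sqrt(alphas[t])
    if t > 0:
        # Stochastic term for intermediate steps (simple sigma_t = sqrt(beta_t)).
        mean += np.sqrt(betas[t]) * rng.standard_normal(x_t.shape)
    return mean

# Toy "video": 4 frames of 8x8 pixels, flattened per frame.
x0 = rng.standard_normal((4, 64))
eps = rng.standard_normal(x0.shape)
t = 50
x_t = add_noise(x0, t, eps)             # noised frames at step t
x_prev = denoise_step(x_t, t, eps)      # one reverse step with an oracle predictor
```

In real video diffusion models the oracle `eps` is replaced by a network conditioned on text or control signals and on neighboring frames, which is where the temporal-consistency and controllability questions studied in the papers below arise.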
Papers
Training-free Long Video Generation with Chain of Diffusion Model Experts
Wenhao Li, Yichao Cao, Xiu Su, Xi Lin, Shan You, Mingkai Zheng, Yi Chen, Chang Xu
TVG: A Training-free Transition Video Generation Method with Diffusion Models
Rui Zhang, Yaosen Chen, Yuegen Liu, Wei Wang, Xuming Wen, Hongxia Wang
CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities
Tao Wu, Yong Zhang, Xintao Wang, Xianpan Zhou, Guangcong Zheng, Zhongang Qi, Ying Shan, Xi Li
EasyControl: Transfer ControlNet to Video Diffusion for Controllable Generation and Interpolation
Cong Wang, Jiaxi Gu, Panwen Hu, Haoyu Zhao, Yuanfan Guo, Jianhua Han, Hang Xu, Xiaodan Liang
JPEG-LM: LLMs as Image Generators with Canonical Codec Representations
Xiaochuang Han, Marjan Ghazvininejad, Pang Wei Koh, Yulia Tsvetkov
FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance
Jiasong Feng, Ao Ma, Jing Wang, Bo Cheng, Xiaodan Liang, Dawei Leng, Yuhui Yin