Face Generation
Face generation research aims to create realistic and controllable facial images and videos, often driven by audio or text input. Current efforts focus on improving the realism and expressiveness of generated faces, drawing on architectures such as diffusion models, neural radiance fields (NeRFs), and transformers, and incorporating techniques such as disentangled representations and fine-grained control mechanisms (e.g., facial action units). The field is significant for its applications in entertainment, virtual reality, and communication technologies, while also raising challenges in deepfake detection and ethical questions surrounding synthetic media.
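To make the audio-driven diffusion idea concrete, below is a minimal PyTorch sketch of DDPM-style ancestral sampling over a face latent conditioned on an audio embedding. Everything here is an illustrative assumption, not the method of any paper listed below: the `AudioConditionedDenoiser` MLP, the latent and audio dimensions, and the `sample_face_latent` helper are all hypothetical stand-ins for a real denoising network and decoder.

```python
import torch
import torch.nn as nn


class AudioConditionedDenoiser(nn.Module):
    """Hypothetical denoiser: predicts the noise added to a face latent,
    conditioned on an audio embedding and the diffusion timestep."""

    def __init__(self, latent_dim=256, audio_dim=128, hidden_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + audio_dim + 1, hidden_dim),  # +1 for timestep
            nn.SiLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.SiLU(),
            nn.Linear(hidden_dim, latent_dim),
        )

    def forward(self, noisy_latent, t, audio_emb):
        # Concatenate the noisy face latent, a normalized timestep, and the audio condition.
        t_feat = t.float().unsqueeze(-1) / 1000.0
        return self.net(torch.cat([noisy_latent, t_feat, audio_emb], dim=-1))


@torch.no_grad()
def sample_face_latent(model, audio_emb, steps=50, latent_dim=256):
    """Plain DDPM ancestral sampling: start from noise, denoise step by step,
    with every denoising step conditioned on the same audio embedding."""
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    x = torch.randn(audio_emb.shape[0], latent_dim)  # start from pure noise
    for i in reversed(range(steps)):
        t = torch.full((audio_emb.shape[0],), i)
        eps = model(x, t, audio_emb)  # predicted noise, given the audio condition
        # DDPM posterior mean update: x_{t-1} = (x_t - beta_t/sqrt(1-abar_t) * eps) / sqrt(alpha_t)
        coef = betas[i] / torch.sqrt(1.0 - alpha_bars[i])
        x = (x - coef * eps) / torch.sqrt(alphas[i])
        if i > 0:
            x = x + torch.sqrt(betas[i]) * torch.randn_like(x)
    return x  # a separate face decoder (not shown) would map this latent to pixels


model = AudioConditionedDenoiser()
audio_emb = torch.randn(2, 128)  # e.g., features from a pretrained speech encoder
latents = sample_face_latent(model, audio_emb)
print(latents.shape)  # torch.Size([2, 256])
```

In practice the MLP would be replaced by a U-Net or transformer, and the audio condition would typically enter via cross-attention rather than concatenation; the sketch only shows where the conditioning signal plugs into the sampling loop.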
Papers
Multimodal-driven Talking Face Generation via a Unified Diffusion-based Generator
Chao Xu, Shaoting Zhu, Junwei Zhu, Tianxin Huang, Jiangning Zhang, Ying Tai, Yong Liu
High-fidelity Generalized Emotional Talking Face Generation with Multi-modal Emotion Space Learning
Chao Xu, Junwei Zhu, Jiangning Zhang, Yue Han, Wenqing Chu, Ying Tai, Chengjie Wang, Zhifeng Xie, Yong Liu