Face Generation
Face generation research aims to create realistic and controllable facial images and videos, often driven by audio or text input. Current efforts focus on improving the realism and expressiveness of generated faces, employing diverse model architectures such as diffusion models, neural radiance fields (NeRFs), and transformers, often combined with techniques like disentangled representations and fine-grained control mechanisms (e.g., facial action units). The field is significant for its applications in entertainment, virtual reality, and communication technologies, while also raising challenges in deepfake detection and ethical questions surrounding synthetic media.
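As a concrete illustration of the diffusion-based, audio-driven approach mentioned above, the sketch below shows a standard DDPM-style reverse sampling loop in which a denoising network is conditioned on an audio embedding. This is a minimal sketch under assumptions of our own: the network `AudioConditionedDenoiser`, the latent and audio dimensions, and the noise schedule are all illustrative placeholders, not the method of any paper listed below.

```python
# Minimal sketch of audio-conditioned diffusion sampling for face generation.
# All names and sizes (AudioConditionedDenoiser, LATENT_DIM, AUDIO_DIM, the
# beta schedule) are illustrative assumptions, not from the cited papers.
import torch
import torch.nn as nn

LATENT_DIM = 256  # assumed size of a face latent (e.g., from a pretrained VAE)
AUDIO_DIM = 128   # assumed size of a per-frame audio feature (e.g., wav2vec)
T = 1000          # number of diffusion timesteps

class AudioConditionedDenoiser(nn.Module):
    """Predicts the noise added to a face latent, conditioned on audio."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM + AUDIO_DIM + 1, 512),
            nn.SiLU(),
            nn.Linear(512, 512),
            nn.SiLU(),
            nn.Linear(512, LATENT_DIM),
        )

    def forward(self, x_t, t, audio):
        # Concatenate noisy latent, normalized timestep, and audio condition.
        t_feat = t.float().unsqueeze(-1) / T
        return self.net(torch.cat([x_t, t_feat, audio], dim=-1))

# Standard DDPM linear beta schedule.
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

@torch.no_grad()
def sample(model, audio):
    """Run the full reverse diffusion chain to produce one face latent."""
    x = torch.randn(audio.shape[0], LATENT_DIM)  # start from pure noise
    for t in reversed(range(T)):
        t_batch = torch.full((audio.shape[0],), t, dtype=torch.long)
        eps = model(x, t_batch, audio)  # predicted noise at step t
        a, ab = alphas[t], alpha_bars[t]
        # DDPM posterior mean; inject fresh noise except at the final step.
        x = (x - (1 - a) / torch.sqrt(1 - ab) * eps) / torch.sqrt(a)
        if t > 0:
            x = x + torch.sqrt(betas[t]) * torch.randn_like(x)
    return x  # a face decoder (not shown) would map this latent to pixels

model = AudioConditionedDenoiser()
audio_features = torch.randn(1, AUDIO_DIM)  # placeholder audio embedding
face_latent = sample(model, audio_features)
print(face_latent.shape)  # torch.Size([1, 256])
```

In practice, methods in this area differ mainly in what the condition carries (audio, text, action units, gesture codes) and in how it is injected into the denoiser (cross-attention, adapters, or simple concatenation as above).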
Papers
Joint Co-Speech Gesture and Expressive Talking Face Generation using Diffusion with Adapters
Steven Hogue, Chenxu Zhang, Yapeng Tian, Xiaohu Guo
GLCF: A Global-Local Multimodal Coherence Analysis Framework for Talking Face Generation Detection
Xiaocan Chen, Qilin Yin, Jiarui Liu, Wei Lu, Xiangyang Luo, Jiantao Zhou