Realistic Video Portrait

Realistic video portrait generation aims to create lifelike talking-head videos from audio or other control signals, with an emphasis on visual fidelity and expressiveness. Current research centers on neural radiance fields (NeRFs) and diffusion models, often combined with attention mechanisms that align audio cues with fine-grained facial motion as well as head pose and eye blinks. These advances support applications such as virtual reality, video conferencing, and film production by enabling more natural human-computer interaction and digital content creation.

Papers