Face Voice

Face-voice association research studies how to link a person's facial features to their voice, with the aim of improving multimodal biometric systems and applications such as speaker identification and virtual human creation. Current work often relies on contrastive learning and multimodal encoders, together with techniques such as fusion and orthogonal projection, to handle challenges like multilingual speech and limited training data. These advances matter for the accuracy and efficiency of audio-visual systems in areas including security, entertainment, and accessibility technologies.
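As a rough illustration of the contrastive learning idea mentioned above, the sketch below computes a symmetric, InfoNCE-style loss over a small batch of paired face and voice embeddings. It is a minimal sketch under stated assumptions, not the method of any particular paper: the function names, the temperature value of 0.1, and the toy embeddings are all hypothetical, and real systems would obtain the embeddings from trained face and voice encoders.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length (assumes the vector is non-zero).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_similarity_matrix(faces, voices):
    # Entry [i][j] is the cosine similarity between face i and voice j.
    f = [l2_normalize(v) for v in faces]
    a = [l2_normalize(v) for v in voices]
    return [[sum(x * y for x, y in zip(fi, aj)) for aj in a] for fi in f]

def cross_entropy_diagonal(logits):
    # Mean cross-entropy where the correct class for row i is column i,
    # i.e. the matching face-voice pair. Uses a stable log-sum-exp.
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)
        log_sum = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum - row[i]
    return total / len(logits)

def symmetric_contrastive_loss(faces, voices, temperature=0.1):
    # Hypothetical helper: averages the face-to-voice and voice-to-face losses.
    # Assumes faces[i] and voices[i] belong to the same person.
    sim = cosine_similarity_matrix(faces, voices)
    logits = [[s / temperature for s in row] for row in sim]
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits)
                  + cross_entropy_diagonal(transposed))

# Toy embeddings (assumed values, for illustration only).
faces = [[1.0, 0.0], [0.0, 1.0]]
matched_voices = [[2.0, 0.0], [0.0, 3.0]]   # aligned with the faces
swapped_voices = [[0.0, 3.0], [2.0, 0.0]]   # paired with the wrong faces

loss_matched = symmetric_contrastive_loss(faces, matched_voices)
loss_swapped = symmetric_contrastive_loss(faces, swapped_voices)
print(loss_matched < loss_swapped)
```

The loss is low when each face embedding is closest to its own voice embedding and high when the pairs are mismatched, which is the signal a contrastive objective uses to pull matching face-voice pairs together.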

Papers