Paper ID: 2406.11079

GANmut: Generating and Modifying Facial Expressions

Maria Surani

In the realm of emotion synthesis, the ability to create authentic and nuanced facial expressions continues to gain importance. The GANmut study introduces an advanced GAN framework that, instead of relying on predefined labels, learns a dynamic and interpretable emotion space. This methodology maps each discrete emotion as a vector originating from a neutral state, with its magnitude reflecting the emotion's intensity. The current project aims to extend the study of this framework by benchmarking it across various datasets, image resolutions, and face detection methodologies. This will involve a series of experiments using two emotional datasets: Aff-Wild2 and AffNet. Aff-Wild2 contains videos captured in uncontrolled environments, including diverse camera angles, head poses, and lighting conditions, providing a real-world challenge. AffNet offers images with labelled emotions, broadening the range of emotional expressions available for training. The first two experiments will train GANmut on the Aff-Wild2 dataset, preprocessed with either RetinaFace or MTCNN, both high-performance deep learning face detectors. This setup will help determine how well GANmut can learn to synthesise emotions under challenging conditions and assess the comparative effectiveness of the two face detectors. The subsequent two experiments will merge the Aff-Wild2 and AffNet datasets, combining the real-world variability of Aff-Wild2 with the diverse emotional labels of AffNet. The same face detectors, RetinaFace and MTCNN, will be employed to evaluate whether the enhanced diversity of the combined datasets improves GANmut's performance and to compare the impact of each face detection method in this hybrid setup.
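
To make the emotion-space idea concrete, the following is a minimal sketch of how such a conditioning vector could be built, assuming a two-dimensional latent emotion space; the direction table `EMOTION_DIRECTIONS` and the helper `emotion_condition` are hypothetical names for illustration, not taken from the paper.

```python
# Illustrative sketch of the emotion-space parameterisation described above.
# Assumption (not from the paper): a 2-D latent space with one learned unit
# direction per discrete emotion, all anchored at the neutral origin (0, 0).
import numpy as np

EMOTION_DIRECTIONS = {
    "happy":    np.array([1.0, 0.0]),
    "sad":      np.array([-1.0, 0.0]),
    "surprise": np.array([0.0, 1.0]),
    "anger":    np.array([0.0, -1.0]),
}

def emotion_condition(emotion: str, intensity: float) -> np.ndarray:
    """Build a conditioning vector: the direction encodes the emotion class,
    the magnitude encodes its intensity; intensity 0 recovers the neutral face."""
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must lie in [0, 1]")
    return intensity * EMOTION_DIRECTIONS[emotion]

# A half-intensity happy expression sits halfway along the 'happy' axis:
c = emotion_condition("happy", 0.5)  # -> array([0.5, 0. ])
```

Under this parameterisation, interpolating the intensity from 0 to 1 traces a path from the neutral face to the full expression, which is what makes the learned space interpretable.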
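The face-detection preprocessing step can likewise be sketched as below, assuming the `facenet-pytorch` (MTCNN) and `retina-face` (RetinaFace) Python packages as stand-ins for the two detectors; the paper does not specify which implementations are used.

```python
# Sketch of the two preprocessing paths: crop faces from Aff-Wild2 frames
# with either detector before feeding them to GANmut at a fixed resolution.
from PIL import Image
from facenet_pytorch import MTCNN
from retinaface import RetinaFace

def crop_with_mtcnn(path: str, size: int = 128):
    """Detect and crop the most confident face as a `size`x`size` tensor."""
    mtcnn = MTCNN(image_size=size, margin=20, post_process=False)
    return mtcnn(Image.open(path).convert("RGB"))  # None if no face is found

def crop_with_retinaface(path: str):
    """Detect, align, and return cropped faces as numpy arrays."""
    return RetinaFace.extract_faces(img_path=path, align=True)
```

Running both paths over the same frames would yield two parallel training sets, allowing a like-for-like comparison of how detector quality affects GANmut's synthesis.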

Submitted: Jun 16, 2024