Valence-Arousal Estimation
Valence-arousal estimation aims to automatically assess a person's emotional state by quantifying valence (how positive or negative the emotion is) and arousal (how activated or calm the person is), typically as continuous values. Current research relies heavily on deep learning models, often combining multimodal data (visual, audio, and physiological signals) with architectures such as convolutional neural networks (CNNs), transformers, and recurrent neural networks (RNNs) to improve accuracy and robustness, particularly in challenging "in-the-wild" settings. The field matters for human-computer interaction, clinical applications (e.g., sleep disorder detection), and other settings requiring real-time emotion understanding, with ongoing work focused on improving model fairness, generalizability, and efficiency.
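As a rough illustration of the regression setup described above, the sketch below shows a minimal PyTorch model that maps a pre-extracted feature vector to continuous valence and arousal scores in [-1, 1], trained with a concordance-correlation-coefficient (CCC) objective, which is the metric commonly reported for this task (e.g., in the ABAW challenges). The feature dimension, layer sizes, and loss choice are illustrative assumptions, not the method of any specific paper listed here.

```python
import torch
import torch.nn as nn

class ValenceArousalHead(nn.Module):
    """Minimal regressor mapping a per-frame feature vector to (valence, arousal) in [-1, 1].

    The 512-dim input and hidden size are illustrative assumptions, not taken
    from any particular paper.
    """

    def __init__(self, feature_dim: int = 512, hidden_dim: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feature_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 2),  # one output each for valence and arousal
            nn.Tanh(),                 # squash predictions into [-1, 1]
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.mlp(features)


def ccc_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """1 - CCC, averaged over the valence and arousal dimensions."""
    pred_mean, target_mean = pred.mean(dim=0), target.mean(dim=0)
    pred_var, target_var = pred.var(dim=0, unbiased=False), target.var(dim=0, unbiased=False)
    covariance = ((pred - pred_mean) * (target - target_mean)).mean(dim=0)
    ccc = 2 * covariance / (pred_var + target_var + (pred_mean - target_mean) ** 2 + eps)
    return (1 - ccc).mean()


if __name__ == "__main__":
    # Toy example: a batch of 8 pre-extracted 512-dim visual features.
    model = ValenceArousalHead()
    features = torch.randn(8, 512)
    labels = torch.empty(8, 2).uniform_(-1, 1)  # synthetic valence/arousal annotations
    loss = ccc_loss(model(features), labels)
    loss.backward()
    print(f"1 - CCC loss: {loss.item():.4f}")
```

In practice, the feature extractor would be a pretrained visual or audio backbone (or a fusion of both), and predictions are usually smoothed over time with a temporal model such as an RNN or transformer before computing CCC over each sequence.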
Papers
SUN Team's Contribution to ABAW 2024 Competition: Audio-visual Valence-Arousal Estimation and Expression Recognition
Denis Dresvyanskiy, Maxim Markitantov, Jiawei Yu, Peitong Li, Heysem Kaya, Alexey Karpov
Multimodal Fusion Method with Spatiotemporal Sequences and Relationship Learning for Valence-Arousal Estimation
Jun Yu, Gongpeng Zhao, Yongqi Wang, Zhihong Wei, Yang Zheng, Zerui Zhang, Zhongpeng Cai, Guochen Xie, Jichao Zhu, Wangyuan Zhu