Speech Emotion Recognition
Speech emotion recognition (SER) aims to automatically identify human emotions from speech, with research focused on improving accuracy and robustness across diverse languages and contexts. Current work emphasizes self-supervised learning models, particularly transformer-based architectures, along with techniques such as cross-lingual adaptation, multi-modal fusion (combining speech with text or visual data), and model compression for resource-constrained environments. Advances in SER enable more natural and empathetic interaction between humans and machines, with significant implications for applications such as mental health monitoring, human-computer interaction, and personalized healthcare.
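A typical SER pipeline extracts frame-level acoustic features from a waveform, pools them into an utterance-level representation, and classifies that representation into emotion categories. The sketch below illustrates this shape with plain NumPy: the log-spectral features stand in for the log-mel filterbanks (or self-supervised embeddings) used in practice, the label set is hypothetical, and the linear softmax head is randomly initialized rather than trained, so the predicted label is meaningless; only the pipeline structure is the point.

```python
import numpy as np

EMOTIONS = ["neutral", "happy", "sad", "angry"]  # hypothetical label set

def frame_features(waveform, frame_len=400, hop=160, n_bins=40):
    """Per-frame log-spectral features (a stand-in for log-mel filterbanks)."""
    frames = []
    for start in range(0, len(waveform) - frame_len + 1, hop):
        frame = waveform[start:start + frame_len] * np.hanning(frame_len)
        spec = np.abs(np.fft.rfft(frame))[:n_bins]
        frames.append(np.log(spec + 1e-8))
    return np.stack(frames)  # shape: (num_frames, n_bins)

def classify(features, weights, bias):
    """Mean-pool over time, then a linear softmax head (untrained here)."""
    pooled = features.mean(axis=0)                # utterance-level vector
    logits = pooled @ weights + bias
    probs = np.exp(logits - logits.max())         # numerically stable softmax
    return probs / probs.sum()

rng = np.random.default_rng(0)
waveform = rng.standard_normal(16000)             # 1 s of noise as stand-in audio
feats = frame_features(waveform)
W = rng.standard_normal((40, len(EMOTIONS))) * 0.01  # untrained placeholder weights
b = np.zeros(len(EMOTIONS))
probs = classify(feats, W, b)
print(EMOTIONS[int(np.argmax(probs))], probs.round(3))
```

In a real system the pooled vector would come from a pretrained self-supervised encoder and the classifier head would be trained on labeled emotional speech; compression approaches shrink exactly these components for resource-constrained deployment.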
Papers
Attribute Inference Attack of Speech Emotion Recognition in Federated Learning Settings
Tiantian Feng, Hanieh Hashemi, Rajat Hebbar, Murali Annavaram, Shrikanth S. Narayanan
Novel Dual-Channel Long Short-Term Memory Compressed Capsule Networks for Emotion Recognition
Ismail Shahin, Noor Hindawi, Ali Bou Nassif, Adi Alhudhaif, Kemal Polat