Speech Emotion Recognition
Speech emotion recognition (SER) aims to automatically identify human emotions from speech, with a primary focus on improving accuracy and robustness across diverse languages and contexts. Current research emphasizes self-supervised learning models, particularly transformer-based architectures, alongside techniques such as cross-lingual adaptation, multi-modal fusion (combining speech with text or visual data), and model compression for resource-constrained environments. Advances in SER have significant implications for applications such as mental health monitoring, human-computer interaction, and personalized healthcare, where they enable more natural and empathetic interactions between humans and machines.
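To make the dominant recipe concrete, below is a minimal sketch of the self-supervised-backbone approach the paragraph describes: a pretrained transformer encoder with a lightweight classification head on top. It assumes the HuggingFace transformers library with the public microsoft/wavlm-base checkpoint; the mean pooling, the four-class label set, and the WavLMForSER class name are illustrative assumptions, not taken from any paper listed here.

import torch
import torch.nn as nn
from transformers import AutoFeatureExtractor, WavLMModel

EMOTIONS = ["neutral", "happy", "sad", "angry"]  # illustrative label set

class WavLMForSER(nn.Module):
    """Self-supervised WavLM backbone + linear emotion classifier (sketch)."""

    def __init__(self, num_labels: int = len(EMOTIONS)):
        super().__init__()
        self.backbone = WavLMModel.from_pretrained("microsoft/wavlm-base")
        self.head = nn.Linear(self.backbone.config.hidden_size, num_labels)

    def forward(self, input_values: torch.Tensor) -> torch.Tensor:
        # Frame-level features from the pretrained encoder, mean-pooled
        # into one utterance-level embedding before classification.
        hidden = self.backbone(input_values).last_hidden_state
        pooled = hidden.mean(dim=1)
        return self.head(pooled)  # logits over emotion classes

extractor = AutoFeatureExtractor.from_pretrained("microsoft/wavlm-base")
model = WavLMForSER()

# Dummy one-second, 16 kHz waveform standing in for a real utterance.
waveform = torch.randn(16000)
inputs = extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
logits = model(inputs.input_values)
print(EMOTIONS[logits.argmax(dim=-1).item()])

In practice the head (and often the upper backbone layers) would be fine-tuned on a labeled emotion corpus; papers below such as "Adapting WavLM for Speech Emotion Recognition" study exactly which adaptation choices matter in this setup.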
Papers
Enrolment-based personalisation for improving individual-level fairness in speech emotion recognition
Andreas Triantafyllopoulos, Björn Schuller
INTERSPEECH 2009 Emotion Challenge Revisited: Benchmarking 15 Years of Progress in Speech Emotion Recognition
Andreas Triantafyllopoulos, Anton Batliner, Simon Rampp, Manuel Milling, Björn Schuller
Multi-Microphone Speech Emotion Recognition using the Hierarchical Token-semantic Audio Transformer Architecture
Ohad Cohen, Gershon Hazan, Sharon Gannot
Dataset-Distillation Generative Model for Speech Emotion Recognition
Fabian Ritter-Gutierrez, Kuan-Po Huang, Jeremy H. M. Wong, Dianwen Ng, Hung-yi Lee, Nancy F. Chen, Eng Siong Chng
Iterative Feature Boosting for Explainable Speech Emotion Recognition
Alaa Nfissi, Wassim Bouachir, Nizar Bouguila, Brian Mishara
1st Place Solution to Odyssey Emotion Recognition Challenge Task1: Tackling Class Imbalance Problem
Mingjie Chen, Hezhao Zhang, Yuanchao Li, Jiachen Luo, Wen Wu, Ziyang Ma, Peter Bell, Catherine Lai, Joshua Reiss, Lin Wang, Philip C. Woodland, Xie Chen, Huy Phan, Thomas Hain
Adapting WavLM for Speech Emotion Recognition
Daria Diatlova, Anton Udalov, Vitalii Shutov, Egor Spirin
Fine-grained Speech Sentiment Analysis in Chinese Psychological Support Hotlines Based on Large-scale Pre-trained Model
Zhonglong Chen, Changwei Song, Yining Chen, Jianqiang Li, Guanghui Fu, Yongsheng Tong, Qing Zhao