Speech Separation
Speech separation aims to isolate individual voices from a mixture of sounds, a crucial task for applications like hearing aids and voice assistants. Current research emphasizes developing efficient and robust models, focusing on architectures like Transformers and state-space models (e.g., Mamba) to handle complex acoustic environments (noise, reverberation, moving sources) and varying numbers of speakers. This involves creating large, realistic datasets, incorporating visual cues (audio-visual models), and exploring techniques like unsupervised learning and efficient model compression to improve performance and reduce computational demands for real-time applications. Advances in this field directly impact the development of more effective and user-friendly speech technologies.
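A common formulation behind many of the separation models mentioned above is time-frequency masking: the network predicts a per-bin mask that, applied to the mixture spectrum, recovers one speaker. The toy sketch below illustrates the masking principle with an oracle "ideal binary mask" on two synthetic tones; all signal parameters are illustrative, and real systems predict the mask with a learned model rather than from oracle sources.

```python
import numpy as np

# Toy sketch of mask-based separation (illustrative parameters only).
# Two synthetic "speakers" are pure tones at different frequencies.
sr = 8000
t = np.arange(sr) / sr
s1 = np.sin(2 * np.pi * 440 * t)   # source 1: 440 Hz tone
s2 = np.sin(2 * np.pi * 1000 * t)  # source 2: 1000 Hz tone
mix = s1 + s2                      # single-channel mixture

# Oracle binary mask: keep mixture bins where source 1 dominates.
# A neural separator would predict this mask from the mixture alone.
S1 = np.fft.rfft(s1)
S2 = np.fft.rfft(s2)
M = np.fft.rfft(mix)
mask = np.abs(S1) > np.abs(S2)
est1 = np.fft.irfft(mask * M, n=len(mix))  # estimated source 1

def corr(a, b):
    """Normalized correlation between two signals."""
    return abs(np.dot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b))

# The estimate should align with source 1, not source 2.
print(corr(est1, s1), corr(est1, s2))
```

Because the tones fall on exact FFT bins here, the mask separates them almost perfectly; real speech overlaps heavily in time and frequency, which is why learned models (Transformer- or Mamba-based) are needed to estimate good masks.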
Papers
Directed Speech Separation for Automatic Speech Recognition of Long Form Conversational Speech
Rohit Paturi, Sundararajan Srinivasan, Katrin Kirchhoff, Daniel Garcia-Romero
DEBACER: a method for slicing moderated debates
Thomas Palmeira Ferraz, Alexandre Alcoforado, Enzo Bustos, André Seidel Oliveira, Rodrigo Gerber, Naíde Müller, André Corrêa d'Almeida, Bruno Miguel Veloso, Anna Helena Reali Costa