Language Diarization
Language diarization (LD) aims to automatically determine which language is spoken when: it identifies the spoken language(s) and their temporal boundaries within a conversation, typically one that is multi-speaker and multilingual. Current research focuses on improving LD accuracy with techniques such as spectral clustering and self-supervised learning with models such as WavLM, and on integrating LD with speaker diarization and speech recognition systems; implicit language modeling is often used to handle low-resource languages. These advances matter for speech technologies deployed in diverse, real-world scenarios such as multilingual transcription and cross-lingual information retrieval, where robust LD helps bridge language gaps in increasingly globalized communication.
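To make the task output concrete, the following is a minimal sketch of how per-frame language decisions might be collapsed into labeled segments with temporal boundaries. It is not taken from any of the papers listed here: the function name `frames_to_segments`, the `frame_shift` parameter, and the toy Hindi/English label sequence are all assumptions chosen for illustration, and a real system would obtain the frame labels from a trained model.

```python
from itertools import groupby


def frames_to_segments(frame_labels, frame_shift=0.02):
    """Merge runs of identical frame-level language labels into segments.

    frame_labels: one language label per frame (hypothetical model output).
    frame_shift:  assumed duration of one frame in seconds.
    Returns a list of (start_seconds, end_seconds, language) tuples.
    """
    segments = []
    index = 0  # index of the first frame in the current run
    for language, run in groupby(frame_labels):
        length = sum(1 for _ in run)
        start = index * frame_shift
        end = (index + length) * frame_shift
        segments.append((round(start, 3), round(end, 3), language))
        index += length
    return segments


# Toy example: 10 frames of 0.5 s each, switching between Hindi and English.
frame_labels = ["hi"] * 5 + ["en"] * 3 + ["hi"] * 2
segments = frames_to_segments(frame_labels, frame_shift=0.5)
for start, end, language in segments:
    print(f"{start:.1f}-{end:.1f}s {language}")
```

Each tuple marks one language turn; the boundaries between consecutive tuples are the language change points that an LD system is expected to locate.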
Papers
The Second DISPLACE Challenge : DIarization of SPeaker and LAnguage in Conversational Environments
Shareef Babu Kalluri, Prachi Singh, Pratik Roy Chowdhuri, Apoorva Kulkarni, Shikha Baghel, Pradyoth Hegde, Swapnil Sontakke, Deepak K T, S. R. Mahadeva Prasanna, Deepu Vijayasenan, Sriram Ganapathy
Exploring Spoken Language Identification Strategies for Automatic Transcription of Multilingual Broadcast and Institutional Speech
Martina Valente, Fabio Brugnara, Giovanni Morrone, Enrico Zovato, Leonardo Badino