Overlapped Speech Detection
Overlapped speech detection (OSD) identifies the regions of a recording in which two or more speakers talk simultaneously, a key preprocessing step for applications such as speaker diarization and meeting transcription. Current research aims to improve OSD accuracy and robustness through techniques that include attention-based channel combination algorithms, large-scale learning with Conformer networks, and two-stage frameworks that incorporate overlap-aware post-processing. These advances matter most in real-world recordings with multi-party conversations and noisy environments, where reliable overlap detection leads to more accurate and efficient transcription and analysis of audio data.
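To make the task concrete, the following minimal sketch derives overlap regions from per-speaker activity segments, which is one common way to obtain reference labels for OSD from a diarization annotation. It is not taken from any of the listed papers: the function name `overlap_regions`, the `(start, end, speaker)` tuple layout, and the assumption that a single speaker's own segments never overlap each other are all illustrative choices.

```python
from typing import List, Tuple

# Hypothetical segment layout: (start_sec, end_sec, speaker_id)
Segment = Tuple[float, float, str]


def overlap_regions(segments: List[Segment]) -> List[Tuple[float, float]]:
    """Return time intervals in which two or more segments are active.

    Assumes segments of the same speaker do not overlap each other,
    so "two active segments" means "two active speakers".
    """
    events = []
    for start, end, _speaker in segments:
        if end > start:  # ignore empty or inverted segments
            events.append((start, +1))  # a speaker starts talking
            events.append((end, -1))    # a speaker stops talking

    # Sort by time; at equal times handle starts before ends so that
    # back-to-back overlaps are reported as one continuous region.
    events.sort(key=lambda e: (e[0], -e[1]))

    regions = []
    active = 0            # number of currently active segments
    overlap_start = None  # start time of the open overlap region, if any
    for time, delta in events:
        previous = active
        active += delta
        if previous < 2 <= active:
            overlap_start = time
        elif previous >= 2 > active:
            # Skip zero-length regions created by segments that only touch.
            if time > overlap_start:
                regions.append((overlap_start, time))
            overlap_start = None
    return regions


example = [
    (0.0, 5.0, "A"),
    (2.0, 4.0, "B"),
    (4.0, 6.0, "C"),
    (8.0, 9.0, "A"),
    (9.0, 10.0, "B"),
]
print(overlap_regions(example))
```

In this example, speakers A and B overlap from 2.0 s to 4.0 s and A and C from 4.0 s to 5.0 s, so the two stretches form a single region, while the segments that merely touch at 9.0 s produce no overlap. A learned OSD system predicts such regions directly from audio; this sketch only shows what the target output looks like.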