Speaker Diarization
Speaker diarization is the task of determining "who spoke when" in an audio recording, and it is a crucial preprocessing step for many speech applications. Current research focuses on improving accuracy and efficiency, particularly in challenging conditions such as multi-speaker conversations and noisy environments, using techniques such as end-to-end neural networks, spectral clustering, and the integration of audio-visual or semantic information. These advances are driving progress in meeting transcription and multilingual speech processing, and they improve the performance of downstream tasks such as automatic speech recognition.
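To make the clustering-based approach concrete, here is a minimal sketch of how spectral clustering can turn per-segment speaker embeddings into "who spoke when" turns. It is an illustration under stated assumptions, not the method of any particular paper: the function names `spectral_cluster` and `to_turns` are hypothetical, the embeddings are synthetic stand-ins for the output of a real speaker-embedding model, segments are assumed to have a fixed duration, and the number of speakers is assumed to be known in advance.

```python
import numpy as np


def spectral_cluster(embeddings, num_speakers, num_iters=20):
    """Cluster per-segment embeddings; returns one integer label per segment."""
    # Cosine affinity between segments, with negative similarities clipped to zero.
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    affinity = np.clip(unit @ unit.T, 0.0, None)

    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}.
    inv_sqrt_degree = 1.0 / np.sqrt(affinity.sum(axis=1))
    normalized = inv_sqrt_degree[:, None] * affinity * inv_sqrt_degree[None, :]
    laplacian = np.eye(len(affinity)) - normalized

    # Eigenvectors for the smallest eigenvalues (eigh returns them in ascending order).
    _, eigvecs = np.linalg.eigh(laplacian)
    points = eigvecs[:, :num_speakers]
    points = points / (np.linalg.norm(points, axis=1, keepdims=True) + 1e-12)

    # Plain k-means with a deterministic farthest-point initialization.
    centers = [points[0]]
    for _ in range(num_speakers - 1):
        dists = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dists)])
    centers = np.array(centers)
    for _ in range(num_iters):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        for k in range(num_speakers):
            if np.any(labels == k):  # keep the old center if a cluster is empty
                centers[k] = points[labels == k].mean(axis=0)
    return labels


def to_turns(labels, segment_seconds):
    """Merge consecutive same-speaker segments into (start, end, speaker) turns."""
    turns = []
    for i, label in enumerate(labels):
        start, end = i * segment_seconds, (i + 1) * segment_seconds
        if turns and turns[-1][2] == int(label):
            turns[-1] = (turns[-1][0], end, int(label))
        else:
            turns.append((start, end, int(label)))
    return turns


# Synthetic data (assumption): two speakers, ten 1.5-second segments, 8-dim embeddings.
rng = np.random.default_rng(0)
centroids = np.eye(8)[:2]
true_speakers = np.array([0, 0, 0, 1, 1, 0, 0, 1, 1, 1])
embeddings = centroids[true_speakers] + 0.05 * rng.normal(size=(10, 8))

labels = spectral_cluster(embeddings, num_speakers=2)
turns = to_turns(labels, segment_seconds=1.5)
for start, end, speaker in turns:
    print(f"{start:5.1f}s to {end:5.1f}s  speaker {speaker}")
```

The cluster indices are arbitrary, so a label only says that two segments share a speaker, not who that speaker is. Practical systems typically also estimate the number of speakers (for example from the eigenvalue gap) and handle overlapping speech, both of which this sketch leaves out.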
Papers
October 16, 2024
October 15, 2024
October 9, 2024
September 25, 2024
September 24, 2024
September 16, 2024
September 14, 2024
September 13, 2024
September 10, 2024
September 9, 2024
September 7, 2024
September 1, 2024
August 30, 2024
August 22, 2024
August 5, 2024
July 21, 2024
July 5, 2024
July 1, 2024