Neural Diarization
Neural diarization aims to identify who spoke when in an audio recording, a task central to applications such as meeting transcription and conversation analysis. Current research focuses on end-to-end neural models, often built on encoder-decoder architectures that use attractors to represent individual speakers, and explores techniques such as self-supervised learning and powerset multi-class formulations to improve accuracy and robustness, particularly for overlapping speech and an unknown or varying number of speakers. These advances make automated speaker segmentation more accurate and efficient across the many applications that depend on it.
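The powerset multi-class formulation mentioned above replaces per-speaker binary labels with a single class per frame, drawn from the set of allowed speaker combinations, so that overlapping speech becomes ordinary multi-class classification. The sketch below is a minimal illustration of that label mapping only, assuming at most `max_overlap` simultaneous speakers; the function names and the NumPy-based encoding are illustrative and not taken from any specific diarization toolkit.

```python
from itertools import combinations

import numpy as np


def build_powerset(num_speakers: int, max_overlap: int = 2) -> list[tuple[int, ...]]:
    """Enumerate the powerset classes: the empty set (silence), every single
    speaker, and every combination of up to `max_overlap` active speakers."""
    classes = [()]  # class 0: nobody is speaking
    for k in range(1, max_overlap + 1):
        classes.extend(combinations(range(num_speakers), k))
    return classes


def multilabel_to_powerset(activity: np.ndarray,
                           classes: list[tuple[int, ...]]) -> np.ndarray:
    """Convert a (frames, speakers) binary activity matrix into a vector of
    powerset class indices, one per frame."""
    lookup = {frozenset(c): i for i, c in enumerate(classes)}
    return np.array(
        [lookup[frozenset(np.flatnonzero(frame))] for frame in activity],
        dtype=np.int64,
    )


if __name__ == "__main__":
    classes = build_powerset(num_speakers=3, max_overlap=2)
    # 1 silence class + 3 single-speaker classes + 3 two-speaker classes = 7
    print(len(classes), classes)

    # Three frames: silence, speaker 1 alone, speakers 0 and 2 overlapping.
    activity = np.array([[0, 0, 0],
                         [0, 1, 0],
                         [1, 0, 1]])
    print(multilabel_to_powerset(activity, classes))  # -> [0 2 5]
```

One appeal of this encoding is that a standard cross-entropy loss over the powerset classes handles overlap directly, instead of requiring separate per-speaker sigmoid outputs and a permutation-invariant training objective.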