Diarization Performance
Speaker diarization is the task of determining who spoke when in an audio recording. Current research focuses on refining end-to-end neural diarization (EEND) models, often combined with techniques such as vector clustering and attractors, and on exploring generative methods and tighter integration with speech separation and voice activity detection. Accurate diarization matters for downstream applications such as speech recognition. Recent work also highlights the potential of large language models for post-processing correction and of multimodal data (e.g., video) for improving performance.
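The summary above does not say how diarization accuracy is measured. As a rough illustration only, the sketch below scores a hypothesis against a reference in the spirit of the commonly used diarization error rate: missed speech, false-alarm speech, and speaker confusion, divided by the amount of reference speech. It is a simplified sketch under stated assumptions: no overlapping speech, no forgiveness collar, 10 ms frames, and a brute-force search over label mappings that is only practical for a handful of speakers. The names `to_frames` and `frame_error_rate` are hypothetical and not taken from any library.

```python
from itertools import permutations


def to_frames(segments, n_frames, step=0.01):
    """Label each frame with a speaker, or None for non-speech.

    Assumes segments are (start_sec, end_sec, speaker) and do not overlap.
    """
    frames = [None] * n_frames
    for start, end, speaker in segments:
        first = round(start / step)
        last = min(round(end / step), n_frames)
        for i in range(first, last):
            frames[i] = speaker
    return frames


def frame_error_rate(reference, hypothesis, duration, step=0.01):
    """Simplified frame-level error: (miss + false alarm + confusion) / reference speech."""
    n_frames = round(duration / step)
    ref = to_frames(reference, n_frames, step)
    hyp = to_frames(hypothesis, n_frames, step)

    ref_speech = sum(r is not None for r in ref)
    if ref_speech == 0:
        raise ValueError("reference contains no speech")

    ref_ids = sorted({r for r in ref if r is not None})
    hyp_ids = sorted({h for h in hyp if h is not None})
    # Pad with None so surplus hypothesis speakers can stay unmatched.
    targets = ref_ids + [None] * max(0, len(hyp_ids) - len(ref_ids))

    best = None
    # Hypothesis labels are arbitrary, so try every one-to-one mapping
    # onto reference labels and keep the one with the fewest errors.
    for perm in permutations(targets, len(hyp_ids)):
        mapping = dict(zip(hyp_ids, perm))
        errors = 0
        for r, h in zip(ref, hyp):
            if r is None and h is None:
                continue  # both non-speech: correct
            if r is None or h is None:
                errors += 1  # false alarm or missed speech
            elif mapping[h] != r:
                errors += 1  # speaker confusion
        best = errors if best is None else min(best, errors)
    return best / ref_speech


# Toy data, invented for illustration.
reference = [(0.0, 1.0, "A"), (1.0, 2.0, "B")]
hypothesis = [(0.0, 1.0, "s1"), (1.0, 1.5, "s2"), (1.5, 2.0, "s1")]
print(frame_error_rate(reference, hypothesis, duration=2.0))
```

Because hypothesis labels such as `s1` carry no identity of their own, the score is computed under the best label mapping rather than by comparing names directly. Established scoring tools handle overlap and collars and use a proper assignment algorithm in place of the exhaustive search shown here.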