End-to-End Neural Diarization
End-to-end neural diarization (EEND) segments an audio recording and labels each segment by speaker with a single neural network that directly predicts per-frame speaker activity, without an intermediate clustering stage. Current research focuses on improving model architectures, including transformer networks, encoder-decoder attractors, and masked attention mechanisms, to raise accuracy, particularly for overlapping speech and a variable number of speakers. These advances matter because they replace the traditional multi-stage diarization pipeline with one trainable model, which can make systems simpler and more robust in applications such as meeting transcription, voice assistants, and forensic audio analysis.
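To make the handling of overlapping speech concrete, here is a minimal sketch of the post-processing step that typically follows an EEND-style model. It assumes the model has already produced, for each frame, an independent activity probability per speaker. The function name `posteriors_to_segments`, the 0.5 threshold, and the frame shift values are illustrative assumptions, not taken from any particular paper or toolkit. The model itself is not shown.

```python
from typing import List, Tuple

# (speaker index, start time in seconds, end time in seconds)
Segment = Tuple[int, float, float]


def posteriors_to_segments(
    posteriors: List[List[float]],
    threshold: float = 0.5,    # assumed decision threshold
    frame_shift: float = 0.1,  # assumed seconds per frame
) -> List[Segment]:
    """Convert per-frame, per-speaker activity probabilities into segments.

    Each speaker is thresholded independently, so two speakers can be
    active in the same frame (overlapping speech).
    """
    segments: List[Segment] = []
    if not posteriors:
        return segments

    num_frames = len(posteriors)
    num_speakers = len(posteriors[0])

    for spk in range(num_speakers):
        start = None  # frame index where the current active run began
        for t in range(num_frames):
            active = posteriors[t][spk] > threshold
            if active and start is None:
                start = t
            elif not active and start is not None:
                segments.append((spk, start * frame_shift, t * frame_shift))
                start = None
        # Close a run that is still open at the end of the recording.
        if start is not None:
            segments.append((spk, start * frame_shift, num_frames * frame_shift))

    # Order by start time, then by speaker index.
    return sorted(segments, key=lambda s: (s[1], s[0]))


# Hypothetical output for 5 frames and 2 speakers; both are active in frame 2.
example = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.7, 0.9],
    [0.2, 0.8],
    [0.1, 0.1],
]
print(posteriors_to_segments(example, frame_shift=0.5))
# prints [(0, 0.0, 1.5), (1, 1.0, 2.0)]
```

The two resulting segments overlap between 1.0 s and 1.5 s. A clustering pipeline that assigns exactly one speaker to each frame cannot represent that overlap without extra machinery.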