Multi Talker
Multi-talker automatic speech recognition (MTASR) aims to transcribe speech in which several voices overlap, a long-standing challenge for the field. Current research emphasizes end-to-end models, often combining transformer-transducer architectures with serialized output training (SOT), which handles the temporal ordering of multiple speakers' utterances by emitting their transcriptions as a single ordered sequence; some systems also incorporate speaker diarization or visual cues. These advances aim to make the transcription of conversations and meetings more accurate and efficient, with applications ranging from virtual assistants and meeting transcription services to accessibility tools for people with hearing impairments.
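As a rough illustration of the serialization idea behind SOT, the sketch below builds a single training target from overlapping utterances by sorting them by start time and joining them with a speaker-change token. The `Utterance` type, the `serialize_transcripts` helper, and the `<sc>` token string are assumptions made for this example, not the interface of any particular toolkit, and real systems differ in how they tokenize and order the references.

```python
from dataclasses import dataclass
from typing import Iterable

# Assumed separator marking a change of speaker in the serialized target.
SPEAKER_CHANGE = "<sc>"


@dataclass
class Utterance:
    """One speaker's utterance with its start time in seconds (hypothetical type)."""
    speaker: str
    start: float
    text: str


def serialize_transcripts(utterances: Iterable[Utterance],
                          sep: str = SPEAKER_CHANGE) -> str:
    """Order utterances by start time and join them into one target string."""
    ordered = sorted(utterances, key=lambda u: u.start)
    return f" {sep} ".join(u.text for u in ordered)


if __name__ == "__main__":
    mixture = [
        Utterance(speaker="B", start=1.2, text="fine thanks"),
        Utterance(speaker="A", start=0.0, text="how are you"),
    ]
    print(serialize_transcripts(mixture))
```

Ordering by start time gives the model one unambiguous target per mixture, so a single-output decoder can be trained without searching over permutations of the speakers' references.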