Multi Talker
Multi-talker speech recognition (MTASR) focuses on accurately transcribing speech in which several voices overlap, a significant challenge in automatic speech recognition. Current research emphasizes end-to-end models, often built on transformer-transducer architectures and trained with serialized output training (SOT) to handle the temporal ordering of multiple speakers' utterances; some systems also incorporate speaker diarization or visual cues. These advances aim to make the transcription of conversations and meetings more accurate and efficient, with applications ranging from virtual assistants and meeting transcription services to accessibility tools for people with hearing impairments.
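As a rough illustration of how serialized output training arranges its targets, the sketch below orders utterances by start time and joins them with a speaker-change token. It is a minimal sketch under stated assumptions, not any particular paper's implementation: the token string `<sc>`, the `Utterance` class, and the helper names `serialize_sot` and `split_sot` are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import List

# Assumed speaker-change token; real systems define their own special symbol.
SC_TOKEN = "<sc>"


@dataclass
class Utterance:
    speaker: str   # speaker label (not used in the serialized target)
    start: float   # utterance start time in seconds
    text: str      # reference transcript for this utterance


def serialize_sot(utterances: List[Utterance], sc_token: str = SC_TOKEN) -> str:
    """Build one target string: utterances in start-time order, joined by sc_token."""
    ordered = sorted(utterances, key=lambda u: u.start)
    return f" {sc_token} ".join(u.text.strip() for u in ordered)


def split_sot(serialized: str, sc_token: str = SC_TOKEN) -> List[str]:
    """Recover per-utterance transcripts from a serialized string."""
    return [seg.strip() for seg in serialized.split(sc_token) if seg.strip()]


if __name__ == "__main__":
    mixture = [
        Utterance("B", 1.2, "sounds good"),
        Utterance("A", 0.0, "let us start"),
    ]
    target = serialize_sot(mixture)
    print(target)
    print(split_sot(target))
```

Ordering by start time gives the model a single deterministic target even when the speech overlaps, which is what lets one output stream cover a variable number of speakers.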
Papers
November 1, 2022
October 27, 2022
October 11, 2022
September 12, 2022
April 1, 2022
March 31, 2022
March 30, 2022
February 2, 2022
January 25, 2022