Multi Speaker
Multi-speaker research develops systems that can process and understand audio and video in which several people speak, often at the same time. Current work concentrates on improving speech separation and recognition, typically with deep neural architectures such as Conformers and Transformers, together with training methods such as Serialized Output Training and speaker-aware CTC. These advances matter for applications such as meeting transcription, voice assistants, and accessibility tools for people with hearing impairments, and they contribute to progress in both speech processing and human-computer interaction.
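As a rough illustration of the idea behind Serialized Output Training, the sketch below builds a single training target from several overlapping utterances by ordering the transcripts by start time and joining them with a speaker-change token. The function name, the `(start_time, text)` input format, and the `<sc>` token spelling are assumptions made for this example, not taken from any specific paper or toolkit listed here.

```python
from typing import List, Tuple

# Separator inserted between speakers; the exact symbol is an assumption.
SPEAKER_CHANGE = "<sc>"


def serialize_transcripts(utterances: List[Tuple[float, str]]) -> str:
    """Join the transcripts of overlapping speakers into one target string.

    Each utterance is a hypothetical (start_time_in_seconds, text) pair.
    Transcripts are ordered by start time (first in, first out) and
    separated by the speaker-change token.
    """
    ordered = sorted(utterances, key=lambda u: u[0])
    return f" {SPEAKER_CHANGE} ".join(text for _, text in ordered)


# Two overlapping utterances, given out of order on purpose.
example = [
    (1.2, "sounds good to me"),
    (0.0, "shall we start the meeting"),
]
target = serialize_transcripts(example)
print(target)
```

A single-output recognizer trained on targets of this shape can, in principle, emit the words of all speakers in one pass, with the separator marking where one speaker's words end and the next begin.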
Papers
March 30, 2022
February 10, 2022
January 19, 2022
December 19, 2021
November 19, 2021
November 17, 2021