Cross-Speaker
Cross-speaker modeling in speech processing aims to improve the accuracy and robustness of systems that handle multiple speakers, particularly in challenging scenarios like overlapping speech or conversations. Current research focuses on integrating single-speaker and multi-speaker models, leveraging long-range contextual information through transformer networks and graph-based methods, and developing efficient representations of cross-utterance and cross-speaker context. These advancements are leading to significant improvements in automatic speech recognition (ASR), speaker diarization, and related tasks, with potential applications in human-computer interaction, meeting transcription, and accessibility technologies.
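To make the idea of cross-utterance, cross-speaker context concrete, the sketch below shows one common ingredient of such systems: self-attention over a conversation's utterance embeddings, with a learned speaker embedding added to each utterance so the model can relate context from the same or different speakers. This is a minimal illustrative example, not the method of any particular paper; the function name, dimensions, and random embeddings are all assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_utterance_attention(utt_emb, spk_ids, spk_emb_table):
    """Single-head self-attention over a conversation's utterances.

    utt_emb:       (T, d) one embedding per utterance in the conversation
    spk_ids:       (T,)   speaker index for each utterance
    spk_emb_table: (S, d) one learned embedding per speaker

    Adding the speaker embedding lets attention distinguish
    same-speaker from cross-speaker context. (Illustrative only.)
    """
    x = utt_emb + spk_emb_table[spk_ids]   # inject speaker identity
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)          # (T, T) pairwise similarity
    return softmax(scores, axis=-1) @ x    # context-enriched embeddings

# Toy two-speaker conversation with five utterances.
T, d, n_spk = 5, 16, 2
utt = rng.normal(size=(T, d))
spk = np.array([0, 1, 0, 1, 0])            # alternating speakers
spk_table = rng.normal(size=(n_spk, d))
ctx = cross_utterance_attention(utt, spk, spk_table)
print(ctx.shape)  # one context-aware embedding per utterance
```

In a full system, the context-enriched embeddings would condition a downstream ASR decoder or diarization head; real models use multi-head attention and trained (not random) parameters.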