Unpaired Speech
Unpaired speech data consists of speech recordings and text that have no correspondence to each other, that is, audio without transcripts and text without matching audio. It is increasingly used to train speech processing models because it eases the data scarcity and annotation costs that limit paired corpora. Current research leverages unpaired data through techniques such as generative adversarial networks (GANs), diffusion models, and self-supervised pre-training, often built on transformer architectures, to improve speech recognition, synthesis, and voice conversion. These advances matter most for low-resource languages and for applications where paired data is difficult or expensive to obtain, and they support more robust and widely accessible speech technologies.