High Fidelity Vocoder
High-fidelity vocoders are neural networks that synthesize audio waveforms from lower-dimensional acoustic representations, such as mel-spectrograms, with the goal of making synthetic speech sound more realistic and natural. Current research pursues two goals. The first is efficiency and speed, through architectural innovations such as lightweight GANs and differentiable digital signal processing (DDSP) models. The second is audio quality, through techniques such as feature smoothing, contrastive learning, and refined discriminators. These advances benefit applications such as text-to-speech synthesis, voice conversion, and speech enhancement, where both generation speed and output quality matter.
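To make the idea of synthesizing a waveform from a lower-dimensional representation concrete, here is a minimal sketch of the kind of harmonic oscillator that DDSP-style vocoders typically build on. It is not a neural vocoder and is not taken from any specific paper: in a real DDSP model a network would predict the frame-level controls, whereas here they are set by hand. The function names (`upsample_linear`, `harmonic_synth`), the 16 kHz sample rate, the hop size of 80, and the harmonic weights are all illustrative assumptions.

```python
import math


def upsample_linear(frames, hop):
    """Linearly interpolate frame-rate values to sample rate (hop samples per frame)."""
    out = []
    for i in range(len(frames) - 1):
        a, b = frames[i], frames[i + 1]
        for k in range(hop):
            out.append(a + (b - a) * k / hop)
    out.extend([frames[-1]] * hop)  # hold the final frame for one hop
    return out


def harmonic_synth(f0_frames, amp_frames, harmonic_weights,
                   sample_rate=16000, hop=80):
    """Additive synthesis from frame-level pitch (Hz) and loudness controls.

    Hypothetical helper: a stand-in for the signal-processing core of a
    DDSP-style vocoder, with the controls supplied directly instead of
    being predicted by a neural network.
    """
    f0 = upsample_linear(f0_frames, hop)
    amp = upsample_linear(amp_frames, hop)
    nyquist = sample_rate / 2
    phase = 0.0
    wave = []
    for f, a in zip(f0, amp):
        # Accumulate phase so that pitch changes do not cause clicks.
        phase += 2 * math.pi * f / sample_rate
        sample = 0.0
        for h, w in enumerate(harmonic_weights, start=1):
            if h * f < nyquist:  # skip harmonics that would alias
                sample += w * math.sin(h * phase)
        wave.append(a * sample)
    return wave


# 50 frames of hand-set controls: constant 220 Hz pitch, constant loudness.
f0_frames = [220.0] * 50
amp_frames = [0.5] * 50
weights = [0.6, 0.3, 0.1]  # assumed relative strengths of the first 3 harmonics

wave = harmonic_synth(f0_frames, amp_frames, weights)
print(len(f0_frames) + len(amp_frames), "control values ->", len(wave), "samples")
# prints: 100 control values -> 4000 samples
```

The point of the sketch is the ratio: a handful of control values per frame expands into many audio samples, which is the gap any vocoder has to fill. Neural approaches differ in how they fill it, for example by learning the controls for a fixed synthesizer like this one or by generating the samples directly with a GAN generator.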