HiFi-GAN
HiFi-GAN is a neural vocoder based on generative adversarial networks (GANs) and designed for high-fidelity audio synthesis. It generates realistic speech and singing-voice waveforms from acoustic representations such as mel-spectrograms. Current research aims to improve HiFi-GAN through architectural modifications, including source-filter modeling, discriminators that operate on alternative time-frequency representations (e.g., the Constant-Q Transform), and diffusion-based training methods intended to improve stability and quality. These advances target synthesis speed, quality, and controllability, with applications in text-to-speech systems, speech enhancement, and music generation.
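To make the vocoder's input concrete, the following is a minimal NumPy sketch of how a log mel-spectrogram might be computed from a waveform. It is not HiFi-GAN's own preprocessing code. The function names (`mel_filterbank`, `log_mel_spectrogram`) are hypothetical, and the parameter values (22050 Hz sample rate, 1024-point FFT, hop of 256 samples, 80 mel bands) are assumptions chosen as typical speech-synthesis settings.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style mel scale formula.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters with edges spaced evenly on the mel scale.
    n_bins = n_fft // 2 + 1
    fft_freqs = np.linspace(0.0, sr / 2.0, n_bins)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, n_bins))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def log_mel_spectrogram(wave, sr=22050, n_fft=1024, hop=256, n_mels=80):
    # Frame the signal without padding, apply a Hann window,
    # take FFT magnitudes, project onto mel bands, then log-compress.
    n_frames = 1 + (len(wave) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = wave[idx] * np.hanning(n_fft)
    mag = np.abs(np.fft.rfft(frames, axis=1))          # (n_frames, n_fft // 2 + 1)
    mel = mag @ mel_filterbank(sr, n_fft, n_mels).T    # (n_frames, n_mels)
    return np.log(np.clip(mel, 1e-5, None)).T          # (n_mels, n_frames)

# One second of a 440 Hz tone as a stand-in for speech (assumed test signal).
sr = 22050
t = np.arange(sr) / sr
wave = 0.5 * np.sin(2 * np.pi * 440.0 * t)
mel = log_mel_spectrogram(wave, sr=sr)
print(mel.shape)  # (80, 83)
```

Under these assumed settings, each spectrogram column summarizes a window that advances by 256 samples, so a vocoder has to produce roughly 256 waveform samples per input frame to reconstruct audio at the original sample rate.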