Audio Generation
Audio generation research focuses on synthesizing high-quality audio from inputs such as text, images, or video, with the goals of improved realism, controllability, and efficiency. Current work centers on refining diffusion models and transformers, often incorporating large language models for better semantic understanding and control, and on techniques such as flow matching for faster inference. Applications include music composition, sound effects design, accessibility technologies such as text-to-speech, and interactive media.
Papers
May 22, 2023
May 16, 2023
May 4, 2023
May 3, 2023
March 7, 2023
January 30, 2023
January 22, 2023
November 19, 2022
November 6, 2022
September 30, 2022
September 7, 2022
July 20, 2022
June 14, 2022
April 18, 2022
April 14, 2022
February 20, 2022
January 25, 2022