Audio Generation
Audio generation research aims to synthesize high-quality audio from inputs such as text, images, or video, with a focus on improving realism, controllability, and efficiency. Current work centers on refining diffusion models and transformers, often incorporating large language models for better semantic understanding and control, and on exploring techniques such as flow matching for faster inference. These advances matter for applications including music composition, sound effects design, accessibility technologies (such as text-to-speech), and interactive media.
Papers
January 31, 2024
January 9, 2024
December 25, 2023
November 1, 2023
October 22, 2023
October 1, 2023
September 27, 2023
September 26, 2023
September 22, 2023
September 19, 2023
September 15, 2023
August 24, 2023
August 23, 2023
August 10, 2023
July 26, 2023
July 10, 2023
June 17, 2023
May 24, 2023