Audio Generation
Audio generation research focuses on synthesizing high-quality audio from inputs such as text, images, or video, with the goals of improved realism, controllability, and efficiency. Current efforts center on refining diffusion models and transformers, often incorporating large language models for better semantic understanding and control, and on exploring techniques such as flow matching for faster inference. These advances have implications for applications including music composition, sound effects design, accessibility technologies (such as text-to-speech), and interactive media.