Audio Generation
Audio generation research focuses on creating high-quality audio from inputs such as text, images, or video, with the goals of improved realism, controllability, and efficiency. Current efforts center on refining diffusion models and transformers, often incorporating large language models for better semantic understanding and control, and on exploring techniques such as flow matching for faster inference. These advances matter for applications including music composition, sound effects design, accessibility technologies (such as text-to-speech), and interactive media.