Audio Generation
Audio generation research aims to create high-quality audio from inputs such as text, images, or video, with a focus on improving realism, controllability, and efficiency. Current work centers on refining diffusion models and transformers, often incorporating large language models for stronger semantic understanding and control, and on techniques such as flow matching for faster inference. These advances matter for applications including music composition, sound effects design, accessibility technologies such as text-to-speech, and interactive media.