Video to Audio Generation
Video-to-audio generation aims to synthesize realistic and temporally aligned audio from silent video, enhancing multimedia experiences and automating sound-effect creation. Current research focuses on improving the quality, semantic consistency, and temporal synchronization of generated audio, employing model architectures including diffusion models, autoregressive models, and those leveraging large language models for multimodal understanding. These advances matter for applications such as video editing, post-production, and virtual/augmented reality, where they enable more efficient and creative audio production workflows.
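To make the diffusion-based approach mentioned above concrete, the toy sketch below shows the overall shape of a video-conditioned diffusion sampler: video-frame embeddings are upsampled to the audio frame rate (giving temporal alignment), and the sampler iteratively denoises random noise toward audio features under that conditioning. All names and the denoiser itself are hypothetical; a real system would use a learned network (e.g., a U-Net or transformer), while here a fixed linear update stands in so the loop is runnable.

```python
import numpy as np

rng = np.random.default_rng(0)

T_VIDEO = 8   # number of video frames (toy value)
T_AUDIO = 32  # number of audio frames (4 per video frame)
D = 16        # feature dimension (toy value)

def upsample_video_features(video_feats, t_audio):
    """Repeat each video-frame embedding so the conditioning signal
    is temporally aligned with the (higher) audio frame rate."""
    reps = t_audio // video_feats.shape[0]
    return np.repeat(video_feats, reps, axis=0)

def toy_denoiser(x_t, cond, t, num_steps):
    """Stand-in for a learned denoiser: nudges the noisy audio
    features toward the conditioning, more strongly as noise decreases."""
    alpha = 1.0 - t / num_steps
    return x_t + 0.5 * alpha * (cond - x_t)

def sample(video_feats, num_steps=50):
    """Reverse-diffusion-style loop: start from pure noise and
    iteratively denoise under video conditioning."""
    cond = upsample_video_features(video_feats, T_AUDIO)
    x = rng.standard_normal((T_AUDIO, D))
    for t in reversed(range(num_steps)):
        x = toy_denoiser(x, cond, t, num_steps)
    return x

video_feats = rng.standard_normal((T_VIDEO, D))
audio = sample(video_feats)
print(audio.shape)  # -> (32, 16)
```

In real systems the per-frame conditioning is what enforces temporal synchronization: each generated audio segment attends to (or is modulated by) the visual features of the corresponding moment in the video.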