Audio-Visual Correspondence
Audio-visual correspondence research focuses on establishing robust links between the audio and visual streams of a video, aiming to understand how sounds relate to their visual sources. Current efforts concentrate on improving the accuracy and efficiency of audio-visual segmentation, often employing transformer-based architectures and self-supervised learning to handle complex scenes with multiple sound sources and noisy data. The field is central to applications such as video indexing, sound source localization, and multimodal understanding, and ultimately enables more natural human-computer interaction.
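A common self-supervised recipe in this line of work trains audio and visual encoders so that embeddings from the same clip agree while embeddings from different clips do not. The sketch below is a minimal, hypothetical illustration of that idea using a symmetric InfoNCE (contrastive) objective over pre-computed embeddings; the array shapes, temperature value, and helper names are assumptions for the example, not a specific paper's method.

```python
import numpy as np

def info_nce_loss(audio_emb, visual_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired audio/visual embeddings.

    audio_emb, visual_emb: (N, D) arrays; row i of each comes from clip i.
    The matching pair (same row) is the positive; all other rows in the
    batch serve as negatives.
    """
    # L2-normalise so the dot product is cosine similarity.
    a = audio_emb / np.linalg.norm(audio_emb, axis=1, keepdims=True)
    v = visual_emb / np.linalg.norm(visual_emb, axis=1, keepdims=True)
    logits = a @ v.T / temperature  # (N, N) cross-modal similarity matrix

    def xent(m):
        # Cross-entropy with the diagonal (true pair) as the target class.
        m = m - m.max(axis=1, keepdims=True)  # numerical stability
        log_probs = m - np.log(np.exp(m).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Average the audio-to-visual and visual-to-audio directions.
    return 0.5 * (xent(logits) + xent(logits.T))

# Toy check: audio that closely tracks its video scores a far lower loss
# than unrelated audio, which is the signal the encoders are trained on.
rng = np.random.default_rng(0)
visual = rng.normal(size=(8, 128))
aligned = visual + 0.05 * rng.normal(size=(8, 128))   # audio near its video
shuffled = rng.normal(size=(8, 128))                  # unrelated audio
print(info_nce_loss(aligned, visual) < info_nce_loss(shuffled, visual))
```

In practice the embeddings come from trainable networks (e.g. a transformer over spectrogram patches and a visual backbone over frames), and the loss gradient pulls matching pairs together in the shared space.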