Based Audio Retrieval
Based audio retrieval focuses on efficiently finding specific audio segments within large datasets from queries such as text descriptions or example audio snippets. Current research aims to improve retrieval accuracy and robustness, particularly in challenging scenarios such as noisy audio or rare words, and often uses transformer-based architectures and contrastive learning to produce effective audio embeddings. These advances benefit applications ranging from audio indexing and search to more sophisticated multimodal tasks such as audio-visual video segmentation and direct speech translation.
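To make the embedding-based approach concrete, here is a minimal sketch in plain Python of the two pieces mentioned above: ranking indexed clips by cosine similarity to a query embedding, and a text-to-audio contrastive (InfoNCE-style) loss. It is an illustration under stated assumptions, not the method of any particular paper. The function names (`cosine`, `retrieve`, `info_nce`), the clip file names, and the three-dimensional vectors are hypothetical; in a real system the embeddings would come from trained text and audio encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(query_vec, index, top_k=2):
    """Return the top_k (clip_id, score) pairs, best match first."""
    scored = [(clip_id, cosine(query_vec, vec)) for clip_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def info_nce(text_vecs, audio_vecs, temperature=0.1):
    """Text-to-audio InfoNCE loss, assuming the i-th text describes the i-th clip."""
    total = 0.0
    for i, text_vec in enumerate(text_vecs):
        logits = [cosine(text_vec, audio_vec) / temperature for audio_vec in audio_vecs]
        # Numerically stable log-sum-exp over all candidate clips.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Negative log-probability assigned to the matching clip.
        total += log_denom - logits[i]
    return total / len(text_vecs)


# Hypothetical toy index: clip id -> audio embedding (illustrative values only).
index = {
    "dog_bark.wav": [0.9, 0.1, 0.0],
    "rain.wav": [0.0, 0.2, 0.9],
    "siren.wav": [0.1, 0.9, 0.1],
}

# Hypothetical embedding of a text query such as "a dog barking".
query = [0.8, 0.2, 0.1]
results = retrieve(query, index, top_k=2)
for clip_id, score in results:
    print(f"{clip_id}: {score:.3f}")
```

Contrastive training lowers `info_nce` by pulling matched text and audio embeddings together and pushing mismatched pairs apart, which is what makes the similarity ranking in `retrieve` meaningful. At scale, the linear scan would typically be replaced by an approximate nearest-neighbour index.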