Audio Encoder
Audio encoders are neural networks that map raw audio waveforms to compact numerical representations (embeddings) used by downstream tasks such as speech recognition, sound classification, and audio-visual integration. Current research emphasizes self-supervised learning, typically with masked autoencoders or contrastive objectives, to train robust encoders on large, diverse audio datasets, including in multi-channel and low-resource settings. These advances are improving the accuracy and generalizability of audio processing systems in applications ranging from real-time speech enhancement to audio-guided image manipulation and semantic audio decomposition.
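As a rough illustration of the contrastive objective mentioned above, the sketch below pairs a toy encoder (framing, a linear projection with tanh, and mean pooling over time) with an InfoNCE-style loss. It is not taken from any particular paper or library: the function names (`frame_waveform`, `encode`, `info_nce`), the frame sizes, and the temperature are all hypothetical choices made for the example, and a real encoder would be a trained deep network rather than a single projection.

```python
import math


def frame_waveform(samples, frame_len, hop):
    """Split a 1-D list of samples into overlapping fixed-length frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def encode(samples, weights, frame_len=4, hop=2):
    """Toy encoder (assumption): project each frame, apply tanh, mean-pool over time.

    `weights` is a list of rows, one row of length `frame_len` per output dimension.
    """
    frames = frame_waveform(samples, frame_len, hop)
    if not frames:
        raise ValueError("waveform is shorter than one frame")
    pooled = [0.0] * len(weights)
    for frame in frames:
        for d, row in enumerate(weights):
            pooled[d] += math.tanh(sum(w * x for w, x in zip(row, frame)))
    return [p / len(frames) for p in pooled]


def cosine(a, b, eps=1e-12):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE-style loss over a batch of embedding pairs.

    positives[i] is the positive for anchors[i]; every other entry in
    `positives` acts as a negative for that anchor.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)
```

In a contrastive setup of this kind, the two inputs of a pair would usually be different augmented views (or different segments) of the same clip, and training adjusts the encoder weights so that matching pairs score a lower loss than mismatched ones.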