Input Spectrogram
An input spectrogram is a time-frequency representation of an audio signal, typically a two-dimensional array of magnitudes indexed by time frame and frequency bin, that is fed to a model in place of the raw waveform. Current research focuses on pairing spectrograms with deep learning models, particularly Convolutional Neural Networks (CNNs) and Transformers, to improve performance in applications such as sound event detection, audio deepfake detection, and singing voice conversion. This work addresses challenges such as variability across recording devices, computational efficiency, and robustness to noise and spoofing attacks, with the goal of more accurate and reliable audio analysis systems. The resulting improvements are relevant to fields ranging from audio forensics to assistive technologies.
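To make the idea of a spectrogram as model input concrete, the following is a minimal sketch of one common way to compute a log-power spectrogram: split the signal into overlapping frames, apply a window, and take the magnitude of a discrete Fourier transform per frame. The function names (`hann_window`, `log_spectrogram`) and all parameter values (frame length 64, hop 32, 8 kHz sample rate) are illustrative assumptions, not settings taken from any of the work summarized above. A naive DFT is used here only to keep the example dependency-free; practical pipelines would normally use an FFT routine and often a mel-scaled frequency axis.

```python
import cmath
import math


def hann_window(n):
    # Periodic Hann window of length n, used to reduce spectral leakage.
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def log_spectrogram(samples, frame_len=64, hop=32, eps=1e-10):
    """Return a list of frames; each frame is a list of log-power values (dB),
    one per non-negative frequency bin. Parameter defaults are illustrative."""
    window = hann_window(frame_len)
    n_bins = frame_len // 2 + 1  # non-negative frequencies of a real signal
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        # Windowed slice of the signal for this time frame.
        frame = [samples[start + i] * window[i] for i in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT for frequency bin k (O(n^2) overall; fine for a sketch).
            acc = sum(
                frame[i] * cmath.exp(-2j * math.pi * k * i / frame_len)
                for i in range(frame_len)
            )
            # Power in decibels; eps avoids log(0) for silent bins.
            row.append(10 * math.log10(abs(acc) ** 2 + eps))
        frames.append(row)
    return frames


# Example: a 1 kHz sine tone sampled at 8 kHz (assumed values).
sample_rate = 8000
tone_hz = 1000
frame_len = 64
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(512)]

spec = log_spectrogram(signal, frame_len=frame_len, hop=32)

# The strongest bin in the first frame should sit at the tone frequency.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / frame_len
print(len(spec), len(spec[0]), peak_bin, peak_hz)
```

The resulting `spec` is the kind of 2-D array (time frames by frequency bins) that a CNN or Transformer would consume, usually after normalization and batching.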