Audio Feature

Audio features are numerical representations of sound signals used in applications ranging from speech recognition to music analysis and sound event detection. Current research focuses on robust, efficient methods for extracting these features, often with deep learning models such as convolutional neural networks (CNNs) and transformers, and sometimes with multimodal fusion that incorporates visual data. These methods are improving audio deepfake detection, keyword spotting accuracy, and music information retrieval systems. Effective audio features remain central to advancing computer audition and its applications.
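To make the idea of an audio feature concrete, here is a minimal sketch of two classical hand-crafted features, frame-wise RMS energy and zero-crossing rate, computed on a synthetic tone. It illustrates what a feature vector per frame looks like; it is not the learned CNN or transformer features the summary refers to. The function names, the 16 kHz sample rate, and the 25 ms / 10 ms frame and hop sizes are all assumptions chosen for illustration.

```python
import math


def frame_signal(samples, frame_len, hop_len):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop_len)]


def rms_energy(frame):
    """Root-mean-square amplitude of one frame (a rough loudness measure)."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (a rough noisiness/pitch cue)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def extract_features(samples, frame_len=400, hop_len=160):
    """Return one (rms, zcr) pair per frame."""
    return [(rms_energy(f), zero_crossing_rate(f))
            for f in frame_signal(samples, frame_len, hop_len)]


# Synthetic input (assumption): one second of a 440 Hz sine at 16 kHz, amplitude 0.5.
sample_rate = 16000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate)
        for n in range(sample_rate)]

features = extract_features(tone)
print(len(features), features[0])
```

Each frame becomes a small vector that a downstream classifier could consume. Learned approaches replace these fixed formulas with representations trained from data, but the frame-then-summarize structure is similar.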

Papers