Sound Event Detection
Sound event detection (SED) aims to automatically identify sound events in audio recordings and locate them in time, a task with applications in environmental monitoring, assistive technologies, and smart homes. Current research emphasizes robustness to overlapping sounds and noisy environments, often using transformer-based architectures such as Audio Spectrogram Transformers (ASTs) together with techniques such as self-supervised learning and multi-modal (audio and visual) data fusion. These advances are driving progress toward more accurate and efficient SED systems, with impact in fields ranging from biodiversity monitoring to human-computer interaction.
Papers
Optimizing Temporal Resolution Of Convolutional Recurrent Neural Networks For Sound Event Detection
Wim Boes, Hugo Van hamme
A Hybrid System of Sound Event Detection Transformer and Frame-wise Model for DCASE 2022 Task 4
Yiming Li, Zhifang Guo, Zhirong Ye, Xiangdong Wang, Hong Liu, Yueliang Qian, Rui Tao, Long Yan, Kazushige Ouchi