Sound Event Detection

Sound event detection (SED) aims to automatically identify sound events in audio recordings and locate them in time, typically by estimating each event's onset and offset. Applications include environmental monitoring, assistive technologies, and smart homes. Current research focuses on making SED more robust to overlapping sounds and noisy environments, often using transformer-based architectures such as Audio Spectrogram Transformers (ASTs) together with self-supervised learning and multimodal fusion of audio and visual data. These advances are making SED systems more accurate and efficient, with benefits for fields ranging from biodiversity monitoring to human-computer interaction.
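To make "identify and locate" concrete, here is a minimal sketch of one common post-processing step: turning frame-wise class probabilities into timed events. It is an illustration under stated assumptions, not the method of any particular paper listed here. The function name `frames_to_events`, the 0.5 threshold, and the hop size are all hypothetical choices, and the probabilities are assumed to come from some upstream model.

```python
from typing import List, Tuple

Event = Tuple[str, float, float]  # (label, onset in seconds, offset in seconds)


def frames_to_events(
    probs: List[float],
    label: str,
    threshold: float = 0.5,     # assumed decision threshold
    hop_seconds: float = 0.02,  # assumed time step between frames
) -> List[Event]:
    """Group consecutive above-threshold frames into (label, onset, offset) events."""
    events: List[Event] = []
    start = None  # index of the first frame of the currently open event, if any
    for i, p in enumerate(probs):
        active = p >= threshold
        if active and start is None:
            start = i  # event begins at this frame
        elif not active and start is not None:
            events.append((label, start * hop_seconds, i * hop_seconds))
            start = None  # event ended just before this frame
    if start is not None:
        # The event is still active when the recording ends.
        events.append((label, start * hop_seconds, len(probs) * hop_seconds))
    return events


if __name__ == "__main__":
    # Hypothetical per-frame probabilities for a single class.
    dog_probs = [0.1, 0.8, 0.9, 0.2, 0.7]
    print(frames_to_events(dog_probs, "dog_bark", hop_seconds=0.5))
```

Running the function once per class, each with its own probability sequence, lets events from different classes overlap in time, which is the polyphonic setting the overview refers to.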

Papers