Sound Event Localization and Detection

Sound event localization and detection (SELD) aims to identify the type and location of sounds within an audio scene, a task made difficult by the complexity of real-world acoustics. Current research focuses on improving how well SELD systems generalize across diverse environments, often using meta-learning techniques and self-supervised pre-training, with adaptations of architectures such as wav2vec 2.0, to leverage unlabeled data. These approaches address the scarcity of labeled data and the high cost of annotation, leading to more robust and adaptable SELD systems with potential applications in environmental monitoring, assistive technologies, and robotics.

Papers