Sound Event Detection

Sound event detection (SED) aims to automatically identify which sounds occur in an audio recording and when each one starts and ends, a task with applications in environmental monitoring, assistive technologies, and smart homes. Current research focuses on making SED robust to overlapping sounds and noisy environments, often using transformer-based architectures such as the Audio Spectrogram Transformer (AST) together with techniques such as self-supervised learning and multi-modal fusion of audio and visual data. These advances are driving progress toward more accurate and efficient SED systems, with impact in fields ranging from biodiversity monitoring to human-computer interaction.
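As a rough illustration of what "identify and locate" means in practice, the sketch below turns per-frame probabilities for a single sound class into timed events. It is a minimal example under stated assumptions, not the method of any paper listed here: the function name `frames_to_events`, the fixed threshold, and the hop size are all hypothetical, and real systems typically add smoothing and class-specific thresholds.

```python
from typing import List, Tuple


def frames_to_events(
    probs: List[float],
    label: str,
    threshold: float = 0.5,     # assumed decision threshold (hypothetical default)
    hop_seconds: float = 0.02,  # assumed time step between frames (hypothetical default)
) -> List[Tuple[str, float, float]]:
    """Convert frame-wise probabilities for one class into (label, onset, offset) events."""
    events: List[Tuple[str, float, float]] = []
    start = None  # index of the first frame of the currently open event, if any
    for i, p in enumerate(probs):
        active = p >= threshold
        if active and start is None:
            start = i  # event begins at this frame
        elif not active and start is not None:
            # event ended at the previous frame; offset is the start of this frame
            events.append((label, start * hop_seconds, i * hop_seconds))
            start = None
    if start is not None:
        # event still active at the end of the clip
        events.append((label, start * hop_seconds, len(probs) * hop_seconds))
    return events


# Usage: one class, six frames, 0.5 s per frame
example = frames_to_events([0.1, 0.8, 0.9, 0.2, 0.7, 0.6], "dog_bark", hop_seconds=0.5)
print(example)
```

Running the function once per class handles overlapping (polyphonic) sounds, since each class is thresholded independently.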

Papers

June 20, 2023

Frequency & Channel Attention for Computationally Efficient Sound Event Detection
Hyeonuk Nam, Seong-Hu Kim, Deokki Min, Yong-Hwa Park
Attention Based Channel Attention High Frequency Sound Event Detection Frequency Aware Convolution Squeeze and Excitation

June 10, 2023

Semi-supervised Learning-based Sound Event Detection using Frequency Dynamic Convolution with Large Kernel Attention for DCASE Challenge 2023 Task 4
Ji Won Kim, Sang Won Son, Yoonah Song, Hong Kook Kim, Il Hoon Song, Jeong Eun Lim
Related Task Sound Event Detection Kernel Attention Convolutional Recurrent Audio Transformer Polyphonic Sound Frequency Aware Convolution

June 2, 2023

Enhance Temporal Relations in Audio Captioning with Sound Event Detection
Zeyu Xie, Xuenan Xu, Mengyue Wu, Kai Yu
Audio Captioning Sound Event Detection Temporal Relation

March 13, 2023

HiSSNet: Sound Event Detection and Speaker Identification via Hierarchical Prototypical Networks for Low-Resource Headphones
N Shashaank, Berker Banar, Mohammad Rasool Izadi, Jeremy Kemmerer, Shuo Zhang, Chuan-Che Huang
Sound Event Detection Prototypical Network Speaker Identification Resource Constrained Device Audio Detection

March 10, 2023

Improving Weakly Supervised Sound Event Detection with Causal Intervention
Yifei Xin, Dongchao Yang, Fan Cui, Yujun Wang, Yuexian Zou
Structural Causal Model Sound Event Detection Sound Event Causal Intervention Clip Level

March 7, 2023

AST-SED: An Effective Sound Event Detection Method Based on Audio Spectrogram Transformer
Kang Li, Yan Song, Li-Rong Dai, Ian McLoughlin, Xin Fang, Lin Liu
Large Scale Event Detection Sound Event Detection Audio Spectrogram Transformer Polyphonic Sound

February 28, 2023

Training sound event detection with soft labels from crowdsourced annotations
Irene Martín-Morató, Manu Harju, Paul Ahokas, Annamaria Mesaros
Event Detection Soft Label Label Distribution Sound Event Detection Sound Event Crowdsourced Annotation Hard Label

February 20, 2023

Improving Speech Enhancement via Event-based Query
Yifei Xin, Xiulian Peng, Yan Lu
Speech Enhancement Query Information Speaker Embeddings Speech Quality Sound Event Detection Sound Event

February 18, 2023

Multi-dimensional frequency dynamic convolution with confident mean teacher for sound event detection
Shengchang Xiao, Xueshuai Zhang, Pengyuan Zhang
Good Teacher Sound Event Detection Dimensional Attention Frequency Band Attention

January 5, 2023

Automatic Sound Event Detection and Classification of Great Ape Calls Using Neural Networks
Zifan Jiang, Adrian Soldati, Isaac Schamberg, Adriano R. Lameira, Steven Moran
Neural Network Classification Code LSTM Network Event Detection Sound Event Detection Speech Detection Great Ape

November 18, 2022

Impact of visual assistance for automated audio captioning
Wim Boes, Hugo Van hamme
Global Impact Audio Captioning Sound Event Detection Video Data Visual Instruction Visual Embeddings Captioning Metric

October 27, 2022

On Out-of-Distribution Detection for Audio with Deep Nearest Neighbors
Zaharah Bukhsh, Aaqib Saeed
Automatic Speech Recognition Distribution Detection Speaker Diarization Audio Driven Sound Event Detection Multi Class Out of Distribution

October 18, 2022

September 26, 2022

September 13, 2022

Binaural Signal Representations for Joint Sound Event Detection and Acoustic Scene Classification
Daniel Aleksander Krause, Annamaria Mesaros
Sound Event Detection Spatial Audio Acoustic Scene Classification Binaural Audio Signal Independent Binaural Reproduction Scheme Scene Analysis

September 5, 2022

Sound Event Localization and Detection for Real Spatial Sound Scenes: Event-Independent Network and Data Augmentation Chains
Jinbo Hu, Yin Cao, Ming Wu, Qiuqiang Kong, Feiran Yang, Mark D. Plumbley, Jun Yang
Neural Network Data Augmentation Data Detection Sound Event Detection Spatial Audio Room Impulse Response Sound Event Sound Event Localization

August 17, 2022

Domestic sound event detection by shift consistency mean-teacher training and adversarial domain adaptation
Fang-Ching Chen, Kuan-Dar Chen, Yi-Wen Liu
Domain Adaptation Semi Supervised Learning Sound Event Detection Adversarial Domain Adaptation Consistency Training

July 13, 2022

Polyphonic sound event detection for highly dense birdsong scenes
Alberto García Arroba Parrilla, Dan Stowell
Sound Event Detection Acoustic Environment Audio Signal Convolutional Recurrent Polyphonic Sound

Optimizing Temporal Resolution Of Convolutional Recurrent Neural Networks For Sound Event Detection

A Hybrid System of Sound Event Detection Transformer and Frame-wise Model for DCASE 2022 Task 4

Impact of temporal resolution on convolutional recurrent networks for audio tagging and sound event detection

Multi-encoder attention-based architectures for sound recognition with partial visual assistance