Audio Event

Audio event recognition (AER) identifies and classifies the sounds in an audio recording, with the aim of reflecting how humans perceive auditory scenes. Current research emphasizes improving model performance through multi-task learning frameworks, leveraging pre-trained models, and exploring high-level acoustic representations to reduce computational cost and improve semantic understanding. The field is important for applications ranging from human-robot interaction and industrial automation to improving accessibility and understanding urban soundscapes, and there is a growing focus on aligning model outputs with human perception of sound importance and annoyance.

Papers