Room Impulse Response
A room impulse response (RIR) characterizes how sound propagates from a source to a receiver within an enclosed space, capturing the time delays and amplitude changes of the direct sound and its reflections. Current research focuses on accurately estimating RIRs from audio and visual inputs using deep learning models, including convolutional neural networks, transformers, and diffusion models, often combined with techniques such as contrastive learning and beamforming. These advances matter for applications such as speech enhancement, sound event localization, virtual and augmented reality, and acoustic scene analysis, where they enable more realistic audio processing and rendering. Research also explores the inverse problem, inferring room geometry and material properties from measured RIRs, using both model-based and data-driven approaches.
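As a rough illustration of the definition above, the following Python sketch builds a toy RIR from a few propagation paths and applies it to a dry signal by convolution. The path lengths, gains, sample rate, and the helper names `toy_rir` and `convolve` are assumptions made up for this example and are not taken from the papers listed here. Measured RIRs are dense and frequency dependent, so this is only a minimal model of delays and attenuation.

```python
SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature


def toy_rir(path_lengths_m, gains, sample_rate):
    """Build a sparse impulse response: one scaled impulse per propagation path."""
    # Each path arrives after (length / speed of sound) seconds, rounded to a sample.
    delays = [round(d / SPEED_OF_SOUND * sample_rate) for d in path_lengths_m]
    rir = [0.0] * (max(delays) + 1)
    for delay, gain in zip(delays, gains):
        rir[delay] += gain
    return rir


def convolve(signal, rir):
    """Direct-form linear convolution; output length is len(signal) + len(rir) - 1."""
    out = [0.0] * (len(signal) + len(rir) - 1)
    for n, x in enumerate(signal):
        if x == 0.0:
            continue  # skip silent samples to save work
        for k, h in enumerate(rir):
            out[n + k] += x * h
    return out


sample_rate = 8000  # Hz (assumed)
# Hypothetical paths: direct sound (3.43 m) and two reflections (6.86 m, 10.29 m),
# with made-up gains that shrink as the path gets longer.
rir = toy_rir([3.43, 6.86, 10.29], [1.0, 0.5, 0.25], sample_rate)

dry = [1.0, 0.0, 0.0, 0.0]  # a single click as the dry source signal
wet = convolve(dry, rir)    # the click as heard at the receiver

# Print the arrival time and amplitude of each non-zero sample in the result.
for index, value in enumerate(wet):
    if value != 0.0:
        print(f"arrival at {index / sample_rate * 1000:.1f} ms, amplitude {value}")
```

Because the dry signal is a single click, the output reproduces the RIR itself: each non-zero sample marks one arrival, with its delay set by the path length and its amplitude by the assumed gain. The inverse problem mentioned above runs the other way, starting from such arrivals and reasoning back to the geometry that produced them.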
Papers
Spatial Scaper: A Library to Simulate and Augment Soundscapes for Sound Event Localization and Detection in Realistic Rooms
Iran R. Roman, Christopher Ick, Sivan Ding, Adrian S. Roman, Brian McFee, Juan P. Bello
3D Room Geometry Inference from Multichannel Room Impulse Response using Deep Neural Network
Inmo Yeon, Jung-Woo Choi