Paper ID: 2402.10547
Learning Disentangled Audio Representations through Controlled Synthesis
Yusuf Brima, Ulf Krumnack, Simone Pika, Gunther Heidemann
This paper tackles the scarcity of benchmarking data in disentangled auditory representation learning. We introduce SynTone, a synthetic dataset with explicit ground-truth explanatory factors for evaluating disentanglement techniques. Benchmarking state-of-the-art methods on SynTone highlights its utility for method evaluation. Our results underscore strengths and limitations in audio disentanglement, motivating future research.
Submitted: Feb 16, 2024