Acoustic Unit

Acoustic unit discovery (AUD) aims to segment raw speech audio automatically into meaningful, discrete units without relying on pre-existing linguistic annotations. Current research emphasizes self-supervised and unsupervised methods, such as contrastive predictive coding, HuBERT, and variants of latent Dirichlet allocation, to discover these units and apply them to downstream tasks such as keyword search and speech translation. These methods improve speech processing in low-resource scenarios and make speech recognition systems more robust across diverse acoustic conditions.
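
As a rough illustration of what "discovering discrete units" can mean in practice, the toy sketch below clusters per-frame feature vectors with k-means and then merges consecutive frames that share a cluster into segments. This is only one simple approach, loosely in the spirit of the k-means quantization often applied on top of HuBERT-style features; it is not the method of any particular paper. The 2-D synthetic "features", the unit names A/B/C, and all function names are assumptions made up for the example; a real system would use learned representations of actual audio.

```python
import random


def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(frames, k):
    """Deterministic init: start at frame 0, then repeatedly add the frame
    that is farthest from every centroid chosen so far."""
    centroids = [list(frames[0])]
    while len(centroids) < k:
        nxt = max(frames, key=lambda f: min(sqdist(f, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans_units(frames, k, iters=10):
    """Plain Lloyd's k-means; returns one cluster id (unit id) per frame."""
    centroids = farthest_point_init(frames, k)
    labels = []
    for _ in range(iters):
        # Assignment step: each frame takes the id of its nearest centroid.
        labels = [min(range(k), key=lambda c: sqdist(f, centroids[c]))
                  for f in frames]
        # Update step: move each centroid to the mean of its member frames.
        for c in range(k):
            members = [f for f, lab in zip(frames, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels


def collapse_runs(labels):
    """Merge consecutive identical labels into (unit, start, end) segments,
    where end is exclusive."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((labels[start], start, i))
            start = i
    return segments


# Synthetic stand-in for frame-level features (hypothetical data):
# three well-separated "units", emitted as the sequence A A A A B B B A A C C C C C.
rng = random.Random(0)
unit_means = {"A": (0.0, 0.0), "B": (5.0, 5.0), "C": (10.0, 0.0)}
true_runs = [("A", 4), ("B", 3), ("A", 2), ("C", 5)]
frames = [
    [m + rng.gauss(0.0, 0.1) for m in unit_means[name]]
    for name, length in true_runs
    for _ in range(length)
]

labels = kmeans_units(frames, k=3)
segments = collapse_runs(labels)
for unit, start, end in segments:
    print(f"unit {unit}: frames [{start}, {end})")
```

The cluster ids are arbitrary: they identify which frames belong to the same unit, not which phone or word the unit corresponds to. Downstream tasks such as keyword search would then operate on the resulting sequence of unit ids instead of on raw audio.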

Papers