Spoken Language Understanding
Spoken Language Understanding (SLU) aims to enable computers to extract meaning and intent from human speech. Current research focuses on making SLU systems more robust and accurate, particularly on noisy speech, low-resource languages, and out-of-distribution data. Common approaches include large language models (LLMs) and contrastive learning, applied both in end-to-end models and in hybrid architectures that pair a speech encoder with an LLM. Advances in SLU improve human-computer interaction in applications such as virtual assistants, broaden access to speech technology across diverse languages, and contribute to progress in artificial intelligence more generally.
Papers
UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language Instructions
Siddhant Arora, Hayato Futami, Jee-weon Jung, Yifan Peng, Roshan Sharma, Yosuke Kashiwagi, Emiru Tsunoo, Karen Livescu, Shinji Watanabe
Continual Contrastive Spoken Language Understanding
Umberto Cappellazzo, Enrico Fini, Muqiao Yang, Daniele Falavigna, Alessio Brutti, Bhiksha Raj
I²KD-SLU: An Intra-Inter Knowledge Distillation Framework for Zero-Shot Cross-Lingual Spoken Language Understanding
Tianjun Mao, Chenghong Zhang