Spoken Language Understanding
Spoken Language Understanding (SLU) focuses on enabling computers to extract meaning and intent directly from human speech. Current research emphasizes improving the robustness and accuracy of SLU systems, particularly for noisy speech, low-resource languages, and out-of-distribution data. Common approaches include end-to-end models and hybrid architectures that pair speech encoders with large language models (LLMs), often trained with contrastive learning. Advances in SLU are crucial for enhancing human-computer interaction in applications such as virtual assistants, improving accessibility across languages, and advancing the broader field of artificial intelligence.
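The contrastive alignment idea recurring in this line of work can be illustrated with a minimal sketch: embeddings from a speech encoder are pulled toward the embeddings of their paired text (e.g. BERT representations of the transcript) and pushed away from unmatched pairs in the batch. The shapes, temperature value, and random inputs below are illustrative assumptions, not details taken from any of the listed papers.

```python
# Minimal sketch of a contrastive alignment loss between a speech encoder and a
# text encoder (e.g. BERT) for speech-to-intent pretraining. All shapes, the
# temperature, and the random inputs are illustrative assumptions.
import torch
import torch.nn.functional as F


def contrastive_alignment_loss(speech_emb: torch.Tensor,
                               text_emb: torch.Tensor,
                               temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss: the i-th speech embedding should match the
    i-th text embedding and repel all other pairings in the batch."""
    speech_emb = F.normalize(speech_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = speech_emb @ text_emb.t() / temperature   # (B, B) similarity matrix
    targets = torch.arange(speech_emb.size(0))         # matched pairs lie on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))


# Toy usage: a batch of 8 paired speech/text embeddings of dimension 768.
speech_emb = torch.randn(8, 768)   # e.g. pooled outputs of a speech encoder
text_emb = torch.randn(8, 768)     # e.g. pooled BERT embeddings of the transcripts
loss = contrastive_alignment_loss(speech_emb, text_emb)
print(loss.item())
```

Tokenwise variants apply the same objective at the level of individual tokens or frames rather than pooled utterance embeddings, yielding a finer-grained alignment between the speech and text representations.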
Papers
Tokenwise Contrastive Pretraining for Finer Speech-to-BERT Alignment in End-to-End Speech-to-Intent Systems
Vishal Sunder, Eric Fosler-Lussier, Samuel Thomas, Hong-Kwang J. Kuo, Brian Kingsbury
Building an ASR Error Robust Spoken Virtual Patient System in a Highly Class-Imbalanced Scenario Without Speech Data
Vishal Sunder, Prashant Serai, Eric Fosler-Lussier
Towards End-to-End Integration of Dialog History for Improved Spoken Language Understanding
Vishal Sunder, Samuel Thomas, Hong-Kwang J. Kuo, Jatin Ganhotra, Brian Kingsbury, Eric Fosler-Lussier