Voice Assistant
Voice assistants are rapidly evolving interfaces that aim to provide natural, intuitive human-computer interaction, primarily through speech. Current research focuses on improving accuracy and robustness across diverse demographics and contexts, using techniques such as end-to-end speech LLMs, multimodal fusion models (which combine audio and visual data), and personalized automatic speech recognition (ASR) models to enhance performance and reduce latency. These advances matter for accessibility, user experience, and security in applications ranging from smart homes and vehicles to assistive technologies for people with disabilities.
Papers
Follow-on Question Suggestion via Voice Hints for Voice Assistants
Besnik Fetahu, Pedro Faustini, Giuseppe Castellucci, Anjie Fang, Oleg Rokhlenko, Shervin Malmasi
STEER: Semantic Turn Extension-Expansion Recognition for Voice Assistants
Leon Liyang Zhang, Jiarui Lu, Joel Ruben Antony Moniz, Aditya Kulkarni, Dhivya Piraviperumal, Tien Dung Tran, Nicholas Tzou, Hong Yu