Voice Based

Voice-based interfaces aim to make communication with machines more natural and intuitive across applications ranging from surgical robots to smart home devices. Current research focuses on improving the accuracy and efficiency of speech recognition and natural language understanding, often using deep learning models such as large language models (LLMs) and convolutional neural networks (CNNs) within different pipeline architectures (e.g., direct voice-to-function mapping, or speech-to-text (STT) followed by an LLM). The field matters for its potential to improve accessibility for people with disabilities, increase efficiency in professional settings, and make technology more engaging and easier to use.

Papers

October 27, 2024

Towards an LLM-Based Speech Interface for Robot-Assisted Feeding
Jessie Yuan, Janavi Gupta, Akhil Padmanabha, Zulekha Karachiwalla, Carmel Majidi, Henny Admoni, Zackory Erickson
Assistive Robot Voice Based Robot Assisted Feeding

September 16, 2024

Voice control interface for surgical robot assistants
Ana Davila, Jacinto Colan, Yasuhisa Hasegawa
Surgical Robot Voice Based Robotic Assistant Manipulator Control Robot Assistant

July 10, 2024

Evaluating Voice Command Pipelines for Drone Control: From STT and LLM to Direct Classification and Siamese Networks
Lucca Emmanuel Pineli Simões, Lucas Brandão Rodrigues, Rafaela Mota Silva, Gustavo Rodrigues da Silva
Classification Code Speech Recognition Siamese Network Voice Based Drone Control Processing Pipeline Tello Drone Speech Mapping

April 23, 2024

Qualitative Approaches to Voice UX
Katie Seaborn, Jacqueline Urakami, Peter Pennefather, Norihisa P. Miyake
Qualitative Analysis User Experience Voice Based Qualitative Methodology

April 5, 2024

VoicePilot: Harnessing LLMs as Speech Interfaces for Physically Assistive Robots
Akhil Padmanabha, Jessie Yuan, Janavi Gupta, Zulekha Karachiwalla, Carmel Majidi, Henny Admoni, Zackory Erickson
Large Language Model Non Humanoid Robot Assistive Robot Voice Based

April 1, 2024

Dialogue with Robots: Proposals for Broadening Participation and Research in the SLIVAR Community
Casey Kennington, Malihe Alikhani, Heather Pon-Barry, Katherine Atwell, Yonatan Bisk, Daniel Fried, Felix Gervits, Zhao Han, Mert Inan, Michael Johnston, Raj Korpan, Diane Litman, Matthew Marge, Cynthia Matuszek, Ross Mead, Shiwali Mohan, Raymond Mooney, Natalie Parde, Jivko Sinapov, Angela Stewart, Matthew Stone, Stefanie Tellex, Tom Williams
Natural Language Non Humanoid Robot Dialogue Utterance Proposal Balance Refinement Voice Based Spoken Dialogue

February 23, 2024

Hands-Free VR
Jorge Askur Vazquez Fernandez, Jae Joong Lee, Santiago Andrés Serrano Vacca, Alejandra Magana, Radim Pesam, Bedrich Benes, Voicu Popescu
Virtual Reality Voice Based Hand Free

February 7, 2024

Large Language User Interfaces: Voice Interactive User Interfaces powered by LLMs
Syed Mekael Wasti, Ken Q. Pu, Ali Neshati
Large Language Model Language Understanding Real Time Semantic Matching Voice Based Language Interface

January 25, 2024

Alternative Interfaces for Human-initiated Natural Language Communication and Robot-initiated Haptic Feedback: Towards Better Situational Awareness in Human-Robot Collaboration
Callum Bennie, Bridget Casey, Cecile Paris, Dana Kulic, Brendan Tidd, Nicholas Lawrance, Alex Pitt, Fletcher Talbot, Jason Williams, David Howard, Pavan Sikka, Hashini Senaratne
Human Robot Collaboration Autonomous Exploration Situational Awareness Spot Robot Voice Based Natural Language Communication Interactive Interface Haptic Guidance Human Supervisor

December 31, 2023

E-chat: Emotion-sensitive Spoken Dialogue System with Large Language Models
Hongfei Xue, Yuhao Liang, Bingshen Mu, Shiliang Zhang, Mengzhe Chen, Qian Chen, Lei Xie
Emotional Speech Voice Based E Chat Emotional Dialogue

December 10, 2023

A Practical Survey on Emerging Threats from AI-driven Voice Attacks: How Vulnerable are Commercial Voice Control Systems?
Yuanda Wang, Qiben Yan, Nikolay Ivanov, Xun Chen
Comprehensive Survey Latent Vulnerability Threat Word Voice Based Voice Activated Audio Attack Voice Cloning Attack

October 17, 2023

Robust Wake-Up Word Detection by Two-stage Multi-resolution Ensembles
Fernando López, Jordi Luque, Carlos Segura, Pablo Gómez
Voice Based Device Use Case Wake Word Audio Feature Audio Classifier

October 6, 2023

Keyword Augmented Retrieval: Novel framework for Information Retrieval integrated with speech interface
Anupam Purwar, Rahul Sundar
Language Model Information Retrieval Retrieval Augmented Novel Framework Knowledge Retrieval Voice Based

September 20, 2023

Development of a Feeding Assistive Robot Using a Six Degree of Freedom Robotic Arm
Md Esharuzzaman Emu, Samarjith Biswas, Rajendra Shrestha
Development Activity Robotic Arm Different Degree Voice Based Robot Arm Robot Assisted Feeding Arm Movement Arduino Nano Microcontroller Actuates

September 4, 2023

Working with Trouble and Failures in Conversation between Humans and Robots (WTF 2023) & Is CUI Design Ready Yet?
Frank Förster, Marta Romeo, Patrick Holthaus, Maria Jose Galvez Trigo, Joel E. Fischer, Birthe Nesset, Christian Dondrup, Christine Murad, Cosmin Munteanu, Benjamin R. Cowan, Leigh Clark, Martin Porcheron, Heloisa Candello, Raina Langevin
Non Humanoid Robot Dialogue System Potential Conversation Outcome Leg Failure Voice Based Conversational Interface Robotic Interface Sex Trouble

June 30, 2023

Beyond-Voice: Towards Continuous 3D Hand Pose Tracking on Commercial Home Assistant Devices
Yin Li, Rohan Reddy, Cheng Zhang, Rajalakshmi Nandakumar
High Fidelity Human Voice Hand Pose Estimation Voice Based Acoustic Sensing Home Assistant

June 14, 2023

Towards Building Voice-based Conversational Recommender Systems: Datasets, Potential Solutions, and Prospects
Xinghua Qu, Hongyang Liu, Zhu Sun, Xiang Yin, Yew Soon Ong, Lu Lu, Zejun Ma
Data Set Visionary Prospect Voice Based Voice Activated Feasible Solution Voice Interaction

June 9, 2023

Challenges and Opportunities for the Design of Smart Speakers
Tao Long, Lydia B. Chilton
Technical Challenge Product Design Emerging Opportunity Speech Technology Voice Based Smart Speaker

May 9, 2023

Zero-shot personalized lip-to-speech synthesis with face image based voice control
Zheng-Yan Sheng, Yang Ai, Zhen-Hua Ling
Zero Shot Face Image Speaker Embeddings Voice Based Lip to Speech Synthesis Voice Identity Lip to Speech

March 24, 2023

Voice-Based Conversational Agents and Knowledge Graphs for Improving News Search in Assisted Living
Phillip Schneider, Nils Rehtanz, Kristiina Jokinen, Florian Matthes
Knowledge Graph Conversational Agent Voice Based
