Silent Speech

Silent speech interfaces (SSIs) aim to decode articulatory movements, such as lip motion or tongue position, into words or synthesized speech without audible vocalization. Current research relies heavily on deep learning, including contrastive learning and spatial transformer networks, to improve the accuracy and robustness of these systems across speakers and recording conditions. The technology is promising for private communication and hands-free device control, particularly in noisy environments or situations where speaking aloud is undesirable. Ongoing work focuses on making models easier to adapt to new users and sessions and on expanding vocabulary size for more natural and expressive silent communication.

Papers

September 11, 2023

Multi-Modal Automatic Prosody Annotation with Contrastive Pretraining of SSWP
Jinzuomu Zhong, Yang Li, Hui Huang, Korin Richmond, Jie Liu, Zhiba Su, Jing Guo, Benlai Tang, Fengjie Zhu
Prosodic Feature, Contrastive Pretraining, Prosody Modeling, Controllable Text to Speech, Silent Speech

May 30, 2023

Adaptation of Tongue Ultrasound-Based Silent Speech Interfaces Using Spatial Transformer Networks
László Tóth, Amin Honarmandi Shandiz, Gábor Gosztolya, Tamás Gábor Csapó
Deep Network, Adaptation Concern, Spatial Transformer Network, Articulatory Data, Silent Speech

March 3, 2023

SottoVoce: An Ultrasound Imaging-Based Silent Speech Interaction Using Deep Neural Networks
Naoki Kimura, Michinari Kono, Jun Rekimoto
Deep Neural Network, Speech Recognition, Acoustic Feature, Speech Utterance, Smart Speaker, Silent Speech

February 12, 2023

LipLearner: Customizable Silent Speech Interactions on Mobile Devices
Zixiong Su, Shitao Fang, Jun Rekimoto
Mobile Device, Silent Speech Interface, Silent Speech
