Electrolaryngeal Speech
Electrolaryngeal speech (ELS) research focuses on improving the intelligibility and naturalness of speech produced with an electrolarynx, a device used by people who have lost their larynx, typically after a laryngectomy. Current work applies deep learning models, including diffusion probabilistic models and sequence-to-sequence voice conversion, often combined with multimodal (audio and visual) data, and uses techniques such as self-supervised learning and intermediate fine-tuning to address data scarcity and domain mismatch. The goal is more natural and understandable speech for ELS users, with implications for both assistive technology and speech science.
Papers
Mandarin Electrolaryngeal Speech Voice Conversion using Cross-domain Features
Hsin-Hao Chen, Yung-Lun Chien, Ming-Chi Yen, Shu-Wei Tsai, Yu Tsao, Tai-Shih Chi, Hsin-Min Wang
Audio-Visual Mandarin Electrolaryngeal Speech Voice Conversion
Yung-Lun Chien, Hsin-Hao Chen, Ming-Chi Yen, Shu-Wei Tsai, Hsin-Min Wang, Yu Tsao, Tai-Shih Chi