Speech Intelligibility
Speech intelligibility research focuses on understanding and improving how accurately listeners perceive spoken words, particularly in challenging acoustic conditions or for individuals with hearing impairments. Current research emphasizes computational models, often built on deep learning architectures such as convolutional neural networks, recurrent neural networks (e.g., LSTMs), and generative adversarial networks (GANs), both to enhance speech quality and to predict intelligibility from acoustic features and representations (e.g., spectrograms, MFCCs, self-supervised embeddings). These advances have significant implications for assistive listening devices, language-learning technologies, and human-computer interaction systems, where they improve the clarity and understandability of speech in diverse contexts.
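To make the prediction side of this concrete, here is a minimal, illustrative sketch of one classic idea behind intrusive intelligibility metrics: distortions that destroy the slow temporal modulations of speech tend to hurt intelligibility, so one can correlate the short-time energy envelope of a degraded signal with that of the clean reference (the core intuition behind measures like STOI). This is not any published metric, and the function names and frame parameters below are my own choices for the sketch:

```python
import math

def frame_energies(signal, frame_len=256, hop=128):
    """Short-time energy envelope: RMS of each overlapping frame."""
    return [
        math.sqrt(sum(s * s for s in signal[i:i + frame_len]) / frame_len)
        for i in range(0, len(signal) - frame_len + 1, hop)
    ]

def envelope_correlation(clean, degraded, frame_len=256, hop=128):
    """Pearson correlation between the energy envelopes of a clean
    reference and a degraded signal. Higher values suggest the temporal
    modulations that carry intelligibility are better preserved.
    Illustrative only -- not an implementation of STOI or any paper."""
    x = frame_energies(clean, frame_len, hop)
    y = frame_energies(degraded, frame_len, hop)
    n = min(len(x), len(y))
    x, y = x[:n], y[:n]
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

# Toy usage: a 5 Hz amplitude-modulated tone as "clean" speech-like
# signal, and a noisy copy as the "degraded" signal.
if __name__ == "__main__":
    import random
    random.seed(0)
    fs = 8000
    clean = [math.sin(2 * math.pi * 5 * t / fs)
             * math.sin(2 * math.pi * 440 * t / fs) for t in range(fs)]
    noisy = [s + random.gauss(0, 0.5) for s in clean]
    print(envelope_correlation(clean, clean))   # identical envelopes
    print(envelope_correlation(clean, noisy))   # lower, noise degrades it
```

The non-intrusive models surveyed above (e.g., MTI-Net, MBI-Net) tackle the harder case where no clean reference is available, typically by learning such quality cues directly from features of the degraded signal alone.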
Papers
Unsupervised Uncertainty Measures of Automatic Speech Recognition for Non-intrusive Speech Intelligibility Prediction
Zehai Tu, Ning Ma, Jon Barker
Exploiting Hidden Representations from a DNN-based Speech Recogniser for Speech Intelligibility Prediction in Hearing-impaired Listeners
Zehai Tu, Ning Ma, Jon Barker
Disentangled Latent Speech Representation for Automatic Pathological Intelligibility Assessment
Tobias Weise, Philipp Klumpp, Kubilay Can Demir, Andreas Maier, Elmar Noeth, Bjoern Heismann, Maria Schuster, Seung Hee Yang
MTI-Net: A Multi-Target Speech Intelligibility Prediction Model
Ryandhimas E. Zezario, Szu-wei Fu, Fei Chen, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao
Genre-conditioned Acoustic Models for Automatic Lyrics Transcription of Polyphonic Music
Xiaoxue Gao, Chitralekha Gupta, Haizhou Li
MBI-Net: A Non-Intrusive Multi-Branched Speech Intelligibility Prediction Model for Hearing Aids
Ryandhimas E. Zezario, Fei Chen, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao