Speech Intelligibility
Speech intelligibility research studies how well spoken words are perceived and how that perception can be improved, particularly in challenging acoustic conditions or for listeners with hearing impairments. Current work emphasizes computational models, often built on deep learning architectures such as convolutional neural networks, long short-term memory (LSTM) recurrent networks, and generative adversarial networks (GANs), that enhance speech quality and predict intelligibility from acoustic features and representations such as spectrograms, MFCCs, and self-supervised embeddings. These advances can improve assistive listening devices, language-learning technologies, and human-computer interaction systems by making speech clearer and easier to understand in diverse contexts.
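To make the modeling approach concrete, the sketch below shows one common pattern from this literature: an attention-based LSTM that pools per-frame acoustic features (such as MFCCs) into a single utterance vector and classifies its intelligibility level. This is an illustrative PyTorch sketch, not the exact architecture of any paper listed here; the class name, layer sizes, 39-dimensional feature input, and three-way output (e.g. low/medium/high intelligibility) are all assumptions made for the example.

# A minimal sketch (not any listed paper's exact model) of an attention LSTM
# classifier over per-frame acoustic features, assuming features such as MFCCs
# have already been extracted (e.g. with librosa.feature.mfcc).
import torch
import torch.nn as nn

class AttentionLSTMClassifier(nn.Module):
    """Bidirectional LSTM encoder with additive attention pooling over frames,
    followed by a linear layer that predicts an intelligibility level."""
    def __init__(self, n_features=39, hidden=128, n_classes=3):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True,
                            bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)    # one relevance score per frame
        self.out = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                       # x: (batch, frames, features)
        h, _ = self.lstm(x)                     # (batch, frames, 2*hidden)
        w = torch.softmax(self.attn(h), dim=1)  # attention weights over frames
        pooled = (w * h).sum(dim=1)             # weighted sum -> utterance vector
        return self.out(pooled)                 # class logits

# Toy usage: 4 utterances, 200 frames each, 39-dim features (e.g. MFCC + deltas).
model = AttentionLSTMClassifier()
feats = torch.randn(4, 200, 39)
logits = model(feats)                           # shape (4, 3)

Attention pooling lets the model weight informative frames more heavily than a plain final-state readout would, which is the motivation behind the attention LSTM systems in the papers listed below.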
Papers
Using Speech Foundational Models in Loss Functions for Hearing Aid Speech Enhancement
Robert Sutherland, George Close, Thomas Hain, Stefan Goetze, Jon Barker
How Private is Low-Frequency Speech Audio in the Wild? An Analysis of Verbal Intelligibility by Humans and Machines
Ailin Liu, Pepijn Vunderink, Jose Vargas Quiros, Chirag Raman, Hayley Hung
On combining acoustic and modulation spectrograms in an attention LSTM-based system for speech intelligibility level classification
Ascensión Gallardo-Antolín, Juan M. Montero
An Attention Long Short-Term Memory based system for automatic classification of speech intelligibility
Miguel Fernández-Díaz, Ascensión Gallardo-Antolín