Accented Speech

Research on accented speech aims to improve the performance of automatic speech recognition (ASR) and text-to-speech (TTS) systems for speakers with diverse accents, with the goal of making speech technology more inclusive and equitable. Current work relies heavily on deep learning models, including Conformers, wav2vec 2.0, and various sequence-to-sequence architectures. To address data scarcity and improve generalization across accents, these models are often combined with data augmentation (synthetic speech generation, pseudo-labeling), multi-modal learning, and meta-learning. The work matters because it directly affects how accessible and usable speech technologies are for a wider population, particularly speakers whose accents are underrepresented in training data, and it has implications for healthcare, education, and other fields that rely on accurate speech processing.
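
As a rough illustration of one technique named above, pseudo-labeling, the sketch below keeps only those machine-generated transcripts whose confidence clears a threshold, so they can be added to the training data for an underrepresented accent. It is a minimal sketch under stated assumptions, not a method taken from any particular paper: the `Hypothesis` record, the `confidence` score, the accent labels, and the 0.9 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One transcript produced by a seed ASR model (hypothetical record)."""
    utterance_id: str
    text: str
    confidence: float  # assumed to be a score in [0, 1]
    accent: str        # assumed accent label for the speaker


def select_pseudo_labels(hypotheses, threshold=0.9):
    """Keep transcripts whose confidence meets the (assumed) threshold."""
    return [h for h in hypotheses if h.confidence >= threshold]


def count_by_accent(hypotheses):
    """Count selected utterances per accent to check coverage."""
    counts = {}
    for h in hypotheses:
        counts[h.accent] = counts.get(h.accent, 0) + 1
    return counts


# Toy data, invented purely for illustration.
hyps = [
    Hypothesis("u1", "turn on the lights", 0.97, "accent_a"),
    Hypothesis("u2", "turn of the lice", 0.42, "accent_b"),
    Hypothesis("u3", "set a timer", 0.93, "accent_b"),
]
selected = select_pseudo_labels(hyps, threshold=0.9)
```

Counting the selected utterances per accent is included because a fixed threshold may discard more utterances from the accents the seed model handles worst, which are the accents that need the extra data.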

Papers