Child Speech

Research on child speech focuses on improving automatic speech recognition (ASR) and text-to-speech (TTS) systems for children, whose speech differs acoustically and linguistically from adults' speech. Current efforts use deep learning models such as Conformers, Whisper, and Wav2Vec2, often combined with transfer learning, data augmentation (including synthetic speech generation), and parameter-efficient fine-tuning (e.g., LoRA) to compensate for the scarcity of child speech data. These advances have implications for diagnosing developmental disorders, building more inclusive educational technologies, and understanding language acquisition.
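To make the parameter-efficient fine-tuning idea concrete, here is a minimal sketch of the low-rank update behind LoRA, written in plain Python. It is an illustration only, not the method of any particular paper or library: the function names, the layer size (768 x 768), the rank (8), and the scaling factor alpha are assumptions chosen for the example.

```python
def full_trainable_params(d_in, d_out):
    """Weights trained when a d_out x d_in linear layer is fully fine-tuned."""
    return d_in * d_out


def lora_trainable_params(d_in, d_out, rank):
    """Weights trained with LoRA: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def lora_forward(W, A, B, x, alpha, rank):
    """Compute y = W x + (alpha / rank) * B (A x).

    W stays frozen (pretrained weights); only A and B would be trained.
    """
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


if __name__ == "__main__":
    # Illustrative sizes only (assumed, not taken from a specific model).
    d_in, d_out, rank = 768, 768, 8
    full = full_trainable_params(d_in, d_out)
    lora = lora_trainable_params(d_in, d_out, rank)
    print(f"full fine-tuning: {full} weights, LoRA: {lora} weights")

    # Tiny forward pass: B starts at zero, so the adapted layer
    # initially behaves exactly like the frozen pretrained layer.
    W = [[1, 2], [3, 4]]
    A = [[1, 1]]
    B = [[0], [0]]
    print(lora_forward(W, A, B, [1, 1], alpha=2, rank=1))
```

Because only A and B are updated, the number of trained weights grows with the rank rather than with the full layer size, which is why this kind of adaptation is attractive when little child speech data is available.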

Papers