Speech Data

Speech data research develops and refines methods for analyzing and utilizing spoken language, primarily for applications such as automatic speech recognition (ASR), speech synthesis, and speaker verification. Current work emphasizes robust models, often built on deep learning architectures such as Conformers and Transformers and trained on massive multilingual datasets that combine labeled and unlabeled data, sometimes augmented with synthetic speech. The field is central to advancing human-computer interaction, improving accessibility for individuals with disabilities, and enabling new diagnostic tools in healthcare, particularly for mental health and neurological disorders.
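
As a concrete illustration of the ASR use case, the sketch below transcribes a short audio clip with a pretrained multilingual model. It assumes the Hugging Face `transformers` library (with `torch` and `ffmpeg` available); the checkpoint name and the audio file path are placeholders, not recommendations.

```python
# Minimal ASR sketch: transcribe a short recording with a pretrained model.
# Assumes `transformers` and `torch` are installed and ffmpeg can decode the file.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-small",  # illustrative multilingual checkpoint
)

# "sample.wav" stands in for any short mono speech recording.
result = asr("sample.wav")
print(result["text"])
```

Larger or domain-specific checkpoints can be swapped in via the `model` argument; the surrounding pipeline call stays the same.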

Papers