Speech Application

Speech application research aims to improve the accuracy and robustness of systems that process spoken language, for tasks including speech recognition, speech synthesis, emotion recognition, and speaker diarization. Current efforts concentrate on developing more efficient and generalizable models, exploring architectures such as transformers and state-space models, and addressing challenges such as noisy or overlapping speech, cross-dataset generalization, fairness, and privacy. These advances have significant implications for human-computer interaction, healthcare (e.g., COVID-19 diagnosis), and accessibility technologies, and they drive progress in both the fundamental understanding of speech and its practical applications.

Papers