Speech Signal
Speech signals are the acoustic representations of spoken language, and research focuses on improving how they are processed for downstream applications. Current efforts concentrate on robust models for speech enhancement (e.g., diffusion models and state-space models such as Mamba), source separation (leveraging attention mechanisms and spatial information), and accurate speech recognition in noisy or otherwise challenging environments. These advances have significant implications for human-computer interaction, assistive technologies for people with hearing impairments, healthcare (e.g., disease detection from speech biomarkers), and security (e.g., synthetic speech detection).
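Most of the speech-processing systems mentioned above operate on time-frequency representations rather than raw waveforms. As an illustrative sketch (not tied to any paper listed below), the following computes a magnitude spectrogram of a synthetic signal via a short-time Fourier transform; the frame length, hop size, and sample rate are arbitrary example values.

```python
import numpy as np

def stft_magnitude(signal, frame_len=512, hop=128):
    """Return |STFT| as a (num_frames, frame_len // 2 + 1) array.

    Each frame is windowed with a Hann window before the real FFT,
    the standard first step for spectrogram-based speech models.
    """
    window = np.hanning(frame_len)
    num_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(num_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1))

sr = 16000                               # common speech sample rate
t = np.arange(sr) / sr                   # 1 second of audio
signal = np.sin(2 * np.pi * 440.0 * t)   # pure tone as a speech stand-in

spec = stft_magnitude(signal)
peak_bin = spec.mean(axis=0).argmax()
peak_hz = peak_bin * sr / 512            # bin index -> frequency
print(spec.shape, peak_hz)
```

The recovered peak frequency lands within one FFT bin (31.25 Hz here) of the true 440 Hz, which is the resolution/latency trade-off that enhancement and separation models must work within.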
Papers
Unsupervised Active Learning: Optimizing Labeling Cost-Effectiveness for Automatic Speech Recognition
Zhisheng Zheng, Ziyang Ma, Yu Wang, Xie Chen
Speech Self-Supervised Representations Benchmarking: a Case for Larger Probing Heads
Salah Zaiem, Youcef Kemiche, Titouan Parcollet, Slim Essid, Mirco Ravanelli
Investigation of Self-supervised Pre-trained Models for Classification of Voice Quality from Speech and Neck Surface Accelerometer Signals
Sudarsana Reddy Kadiri, Farhad Javanmardi, Paavo Alku
Characterization of cough sounds using statistical analysis
Naveenkumar Vodnala, Pratap Reddy Lankireddy, Padmasai Yarlagadda