Speech Domain

The speech domain covers the scientific study and technological application of human speech, with the goal of understanding and reproducing it for tasks such as speech recognition, synthesis, and enhancement. Current research relies heavily on deep learning, particularly self-supervised learning (SSL) models such as wav2vec 2.0 and HuBERT, together with transformer architectures and linear-complexity substitutes for self-attention that reduce computational cost. These advances are driving progress in low-resource language processing, robust speech recognition in noisy environments, and personalized speech technologies, with significant implications for accessibility, healthcare, and human-computer interaction.
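
To make the efficiency argument concrete, here is a minimal sketch of one common form of linear attention. It assumes the kernel feature-map formulation with phi(x) = elu(x) + 1, which is only one of several possible substitutes and is not necessarily the variant used in any paper listed here. The function names (feature_map, linear_attention, quadratic_reference) are hypothetical and chosen for illustration. The sketch is non-causal and single-head, and it uses NumPy rather than a deep learning framework.

```python
import numpy as np


def feature_map(x):
    # Assumed feature map: elu(x) + 1, which is strictly positive.
    # np.minimum keeps exp() from overflowing on large positive inputs.
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))


def linear_attention(q, k, v):
    """Non-causal linear attention over a sequence of n frames.

    q, k: arrays of shape (n, d); v: array of shape (n, d_v).
    Cost grows linearly with n because no (n, n) matrix is formed.
    """
    q_f = feature_map(q)                # (n, d)
    k_f = feature_map(k)                # (n, d)
    kv = k_f.T @ v                      # (d, d_v) summary of keys and values
    k_sum = k_f.sum(axis=0)             # (d,) normaliser summary
    norm = q_f @ k_sum                  # (n,) per-query normaliser
    return (q_f @ kv) / norm[:, None]   # (n, d_v)


def quadratic_reference(q, k, v):
    """Same computation written with an explicit (n, n) weight matrix."""
    q_f, k_f = feature_map(q), feature_map(k)
    weights = q_f @ k_f.T                               # (n, n)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, d_v = 200, 16, 8  # e.g. 200 speech frames
    q = rng.standard_normal((n, d))
    k = rng.standard_normal((n, d))
    v = rng.standard_normal((n, d_v))
    same = np.allclose(linear_attention(q, k, v), quadratic_reference(q, k, v))
    print("outputs match:", same)
```

Both functions compute the same result; the difference is the order of the matrix products. Reordering them replaces the (n, n) attention matrix with a (d, d_v) summary, which matters for speech because utterances often span thousands of frames.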

Papers