Speech Data Augmentation

Speech data augmentation aims to improve speech processing systems by artificially increasing the size and diversity of their training data. Current research develops new augmentation techniques, such as phase perturbation and generative approaches based on VAEs and GANs (including adversarially trained models), which complement established methods such as speed perturbation and vocal tract length perturbation. These techniques improve robustness and accuracy in tasks including automatic speech recognition, speaker recognition, and emotion recognition.
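
As a concrete illustration of one established method, speed perturbation can be sketched as resampling the waveform so that it plays faster or slower, producing several variants of each training utterance. The sketch below is a minimal version under stated assumptions: it uses linear interpolation from NumPy rather than a proper band-limited resampler, the function name `speed_perturb` is hypothetical, and the factors 0.9, 1.0, and 1.1 are illustrative choices rather than values taken from this summary.

```python
import numpy as np


def speed_perturb(waveform, factor):
    """Return `waveform` resampled to play `factor` times faster.

    Hypothetical helper for illustration. Both tempo and pitch change,
    because the signal is simply re-read at a different rate.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    n_out = int(round(len(waveform) / factor))
    # Position in the original signal that each output sample reads from.
    src_positions = np.arange(n_out) * factor
    # Linear interpolation between neighbouring input samples (an assumption;
    # production pipelines typically use a band-limited resampler instead).
    return np.interp(src_positions, np.arange(len(waveform)), waveform)


# A one-second 440 Hz tone at 16 kHz stands in for a speech recording.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)

# Three copies of the utterance: slower, unchanged, and faster.
augmented = [speed_perturb(tone, f) for f in (0.9, 1.0, 1.1)]
```

Each element of `augmented` has a different length, so a training pipeline that uses this kind of augmentation would need to pad or crop the results before batching.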

Papers