Paper ID: 2410.22903

Augmenting Polish Automatic Speech Recognition System With Synthetic Data

Łukasz Bondaruk, Jakub Kubiak, Mateusz Czyżnikiewicz

This paper presents a system developed for submission to PolEval 2024, Task 3: Polish Automatic Speech Recognition Challenge. We describe a Voicebox-based speech synthesis pipeline and use it to augment the training of Conformer and Whisper speech recognition models with synthetic data. We show that adding synthetic speech to the training data significantly improves results. We also report the final results achieved by our models in the competition.

Submitted: Oct 30, 2024

Topics

Language Model
Synthetic Data
Raw Data
Synthesized Speech
Large Scale Synthetic
Speech Recognition Model
Polish Language
