English Code-Switching

English code-switching speech recognition (CSR) aims to accurately transcribe speech in which speakers switch spontaneously between English and another language; in recent research, that other language is primarily Mandarin. Current work concentrates on improving models such as kNN-CTC and Transformer-Transducer architectures, often by incorporating dual monolingual datastores, language-specific acoustic boundary learning, and data augmentation to address the challenges that language mixing poses. These advances matter because accurate CSR is essential for applications that need real-time transcription of multilingual conversations, such as language-learning tools and cross-cultural communication platforms.

Papers