Inverse Text Normalization

Inverse text normalization (ITN) converts spoken-form text, typically the output of automatic speech recognition (ASR) systems, into its written-form equivalent, for example rewriting "twenty five dollars" as "$25". Current research focuses on improving ITN accuracy and robustness, particularly for low-resource languages and ASR-generated text. Approaches include neural models such as transformers, combined with data augmentation and semi-supervised learning to address data scarcity and out-of-domain inputs. These advances make ASR output more usable and easier to process downstream, benefiting applications such as voice assistants and natural language processing pipelines.
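
To make the spoken-to-written mapping concrete, here is a minimal rule-based sketch in Python that rewrites runs of English cardinal number words as digits. It is only a toy illustration under stated assumptions, not one of the neural approaches mentioned above: the vocabulary tables and the function names `words_to_number` and `inverse_normalize` are hypothetical, and it handles only whitespace-separated cardinals below one million.

```python
# Toy rule-based ITN sketch (assumption: English cardinals only, no dates,
# currency, ordinals, or punctuation handling).
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
SCALES = {"hundred", "thousand"}
NUMBER_WORDS = set(UNITS) | set(TENS) | SCALES


def words_to_number(words):
    """Convert a list of lowercase cardinal words to an integer."""
    total, current = 0, 0
    for w in words:
        if w in UNITS:
            current += UNITS[w]
        elif w in TENS:
            current += TENS[w]
        elif w == "hundred":
            # A bare "hundred" is treated as "one hundred".
            current = max(current, 1) * 100
        elif w == "thousand":
            total += max(current, 1) * 1000
            current = 0
    return total + current


def inverse_normalize(text):
    """Replace each run of number words in `text` with its digit form."""
    out, run = [], []
    # The trailing None is a sentinel that flushes a run at the end of input.
    for tok in text.split() + [None]:
        if tok is not None and tok.lower() in NUMBER_WORDS:
            run.append(tok.lower())
            continue
        if run:
            out.append(str(words_to_number(run)))
            run = []
        if tok is not None:
            out.append(tok)
    return " ".join(out)


print(inverse_normalize("i have twenty five apples"))  # i have 25 apples
```

A sketch like this also shows why hand-written rules are brittle: a digit sequence such as "one two three" is summed as a single cardinal rather than written as "1 2 3", which is the kind of ambiguity that learned models and larger grammars aim to resolve.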

Papers