Text Normalization
Text normalization standardizes text by converting non-standard forms (such as numerals, abbreviations, and informal spellings) into consistent, canonical representations. Current research focuses on improving normalization accuracy for low-resource languages and infrequent terms, using techniques such as weakly supervised learning, transformer-based language models, and rule-guided neural architectures. These advances improve the performance of downstream natural language processing tasks, including speech recognition, machine translation, and information retrieval, particularly in domains with diverse or historically influenced writing styles.
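To make the idea of mapping non-standard forms to canonical ones concrete, here is a minimal rule-based sketch in Python. It is not one of the research methods mentioned above (weakly supervised or neural approaches); it is only a lookup-and-rewrite baseline. The tables `ABBREVIATIONS` and `INFORMAL`, the function names, and the choice to lowercase everything are assumptions made for illustration.

```python
import re

# Hypothetical lookup tables; a real system would need far larger,
# domain- and language-specific resources.
ABBREVIATIONS = {"dr.": "doctor", "st.": "street", "approx.": "approximately"}
INFORMAL = {"u": "you", "gr8": "great", "pls": "please", "thx": "thanks"}

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999 in English."""
    if not 0 <= n <= 999:
        raise ValueError("only 0..999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    head = ONES[hundreds] + " hundred"
    return head + (" " + number_to_words(rest) if rest else "")


def normalize_token(token: str) -> str:
    """Map one whitespace-delimited token to its canonical form."""
    lower = token.lower()
    # Abbreviations are matched with their trailing period intact.
    if lower in ABBREVIATIONS:
        return ABBREVIATIONS[lower]
    # Separate a word from any trailing punctuation.
    match = re.fullmatch(r"(\w+)([.,!?;:]*)", lower)
    if match is None:
        return lower  # leave anything unexpected unchanged (lowercased)
    core, punct = match.groups()
    if core in INFORMAL:
        core = INFORMAL[core]
    elif re.fullmatch(r"[0-9]{1,3}", core):
        core = number_to_words(int(core))
    return core + punct


def normalize(text: str) -> str:
    """Normalize a sentence token by token."""
    return " ".join(normalize_token(tok) for tok in text.split())


print(normalize("Dr. Smith has 42 cats, thx u!"))
```

A purely table-driven approach like this cannot resolve ambiguity (for example, "St." as "street" versus "saint"), which is one reason the research described above combines rules with learned models.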