Word Level
Word-level analysis in natural language processing studies the information carried by individual words and uses it to improve downstream tasks. Current research emphasizes how word-level analysis and higher-level analysis (e.g., of sentences or documents) reinforce each other, and it uses attention mechanisms, Siamese networks, and hierarchical models (for example, models that incorporate character-level information) to learn better representations. These methods are improving performance in applications such as writer identification, machine translation, and speech emotion recognition, particularly when data is limited or the text is code-mixed. The findings also inform our understanding of human language processing by revealing how different brain regions process semantic and syntactic information at the word level.
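As a rough illustration of how character-level information can feed word-level representations, and how attention can then combine word vectors into a sentence-level one, here is a minimal sketch. It is not the method of any paper listed below. The names `DIM`, `char_ngrams`, `ngram_vector`, `word_vector`, and `attention_pool` are hypothetical, and the hash-based vectors are only a stand-in for parameters that a real model would learn.

```python
import hashlib
import math

DIM = 8  # hypothetical embedding size, chosen only for this illustration


def char_ngrams(word, n=3):
    """Split a word into character n-grams, with boundary markers."""
    padded = f"<{word}>"
    if len(padded) <= n:
        return [padded]
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def ngram_vector(ngram, dim=DIM):
    """Deterministic pseudo-embedding derived from a hash.

    Assumption: this stands in for a learned n-gram embedding table.
    """
    digest = hashlib.sha256(ngram.encode("utf-8")).digest()
    return [digest[i] / 255.0 - 0.5 for i in range(dim)]


def word_vector(word, dim=DIM):
    """Character level to word level: average the word's n-gram vectors."""
    vecs = [ngram_vector(g, dim) for g in char_ngrams(word)]
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(word_vecs, query):
    """Word level to sentence level: dot-product attention pooling."""
    scores = [sum(q * x for q, x in zip(query, v)) for v in word_vecs]
    weights = softmax(scores)
    pooled = [
        sum(w * v[i] for w, v in zip(weights, word_vecs))
        for i in range(len(query))
    ]
    return pooled, weights


sentence = "unseen words still get vectors".split()
vectors = [word_vector(w) for w in sentence]
query = [1.0] * DIM  # assumption: a fixed query; normally a learned parameter
sentence_vec, weights = attention_pool(vectors, query)
```

Because each word vector is composed from character n-grams, a word that never appeared in training still receives a representation, which is one reason such hierarchical designs are used for open-vocabulary and low-resource settings.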
Papers
From Characters to Words: Hierarchical Pre-trained Language Model for Open-vocabulary Language Understanding
Li Sun, Florian Luisier, Kayhan Batmanghelich, Dinei Florencio, Cha Zhang
Dancing Between Success and Failure: Edit-level Simplification Evaluation using SALSA
David Heineman, Yao Dou, Mounica Maddela, Wei Xu
Beyond Shared Vocabulary: Increasing Representational Word Similarities across Languages for Multilingual Machine Translation
Di Wu, Christof Monz