Lexical Overlap
Lexical overlap, the degree of shared vocabulary between text segments, is a key focus in natural language processing research, particularly for its impact on model performance and generalization. Current research investigates how lexical overlap influences tasks such as machine translation, summarization, and natural language inference, often examining the trade-off between exploiting this overlap for efficiency and avoiding rote learning or spurious biases. This line of work matters for building more robust and reliable language models, for improving the accuracy and explainability of evaluation metrics, and ultimately for more effective downstream applications.
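As a minimal sketch of the notion above: lexical overlap is often operationalized as set overlap between the tokens of two segments, for example via Jaccard similarity (shared vocabulary divided by total vocabulary). The function below is an illustrative assumption, not a metric taken from the listed papers, and uses naive whitespace tokenization.

```python
# Illustrative sketch (assumed operationalization, not from the papers above):
# lexical overlap as Jaccard similarity over lowercase whitespace tokens.

def lexical_overlap(text_a: str, text_b: str) -> float:
    """Return |A ∩ B| / |A ∪ B| for the token sets of the two texts."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0  # no vocabulary to compare
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)

# Example: 2 shared tokens ("the", "cat") out of 4 distinct tokens -> 0.5
print(lexical_overlap("the cat sat", "the cat ran"))
```

Real studies typically use a proper tokenizer and may compute overlap over n-grams rather than single tokens; this sketch only illustrates the underlying set-based definition.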
Papers
Lexical Repetitions Lead to Rote Learning: Unveiling the Impact of Lexical Overlap in Train and Test Reference Summaries
Prafulla Kumar Choubey, Alexander R. Fabbri, Caiming Xiong, Chien-Sheng Wu
Investigating Hallucinations in Pruned Large Language Models for Abstractive Summarization
George Chrysostomou, Zhixue Zhao, Miles Williams, Nikolaos Aletras