English Dataset
English datasets are central to training and evaluating natural language processing (NLP) models, and they are increasingly complemented by multilingual resources that address English-centric bias and improve performance in other languages. Current research focuses on building multilingual benchmarks for NLP tasks such as question answering, named entity recognition, and sentiment analysis, often using large language models (LLMs) to generate data and cross-lingual transfer learning to bridge the resource gap between English and lower-resource languages. This work is vital for advancing NLP beyond English-centric applications and for fostering more equitable and inclusive language technologies.