English Dataset
English datasets remain central to training and evaluating natural language processing (NLP) models, but they are increasingly supplemented with multilingual resources to reduce English-centric bias and improve performance in other languages. Current research focuses on building multilingual benchmarks for NLP tasks such as question answering, named entity recognition, and sentiment analysis. These efforts often use large language models (LLMs) to generate data, and cross-lingual transfer learning to bridge the resource gap between English and other languages. The goal is to extend NLP capabilities beyond English-centric applications and to support more equitable and inclusive language technologies.