Language Data
Language data research develops and improves methods for collecting, processing, and using textual and spoken language data to train and enhance natural language processing (NLP) models. Current research emphasizes addressing data scarcity in low-resource languages, mitigating bias and ethical concerns in data collection, and improving model performance through techniques such as multilingual fine-tuning, self-supervised learning, and data augmentation. This work is crucial for advancing NLP capabilities across diverse languages and cultures, with applications ranging from machine translation and speech recognition to sentiment analysis and hate speech detection.