High-Resource Language
High-resource language (HRL) research addresses the significant performance gap between natural language processing (NLP) models trained on dominant languages, such as English and Chinese, and those trained on low-resource languages (LRLs). Current work emphasizes developing and adapting large language models (LLMs), often through multilingual fine-tuning, transfer learning from HRLs, and data augmentation, to improve LRL performance on tasks such as machine translation, question answering, and sentiment analysis. This research is crucial for promoting linguistic diversity and inclusivity in AI, ensuring that speakers of all languages have equitable access to advanced language technologies.
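As a concrete illustration of the transfer-learning recipe named above, the sketch below fine-tunes a multilingual encoder (XLM-RoBERTa) on labeled HRL data and then evaluates it zero-shot on an LRL test set. This is a minimal sketch, not the method of any paper listed here; the dataset names, label count, and hyperparameters are illustrative assumptions.

```python
# Minimal sketch of cross-lingual transfer: fine-tune a multilingual model on a
# high-resource language (HRL), then evaluate zero-shot on a low-resource
# language (LRL). Dataset names below are hypothetical placeholders.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "xlm-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# num_labels=3 assumes a 3-class sentiment task (negative/neutral/positive).
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

# Hypothetical corpora: English (HRL) training data, Swahili (LRL) test data.
train = load_dataset("hrl_sentiment_corpus", split="train")    # assumption
lrl_test = load_dataset("lrl_sentiment_corpus", split="test")  # assumption

def tokenize(batch):
    # Tokenize raw text into fixed-length input IDs for the encoder.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

train = train.map(tokenize, batched=True)
lrl_test = lrl_test.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="xlmr-hrl-transfer",
        num_train_epochs=3,
        per_device_train_batch_size=16,
    ),
    train_dataset=train,
    eval_dataset=lrl_test,
)

trainer.train()            # fine-tune on the HRL only
print(trainer.evaluate())  # zero-shot evaluation on the LRL test set
```

Because XLM-RoBERTa shares a single vocabulary and parameter set across roughly 100 languages, task supervision from the HRL can transfer to the LRL without any LRL-labeled training data; the other techniques mentioned above (multilingual fine-tuning on mixed-language data, data augmentation) build on the same setup.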
Papers
Bridge-Coder: Unlocking LLMs' Potential to Overcome Language Gaps in Low-Resource Code
Jipeng Zhang, Jianshu Zhang, Yuanzhe Li, Renjie Pi, Rui Pan, Runtao Liu, Ziqiang Zheng, Tong Zhang
LLMs for Extremely Low-Resource Finno-Ugric Languages
Taido Purason, Hele-Andra Kuulmets, Mark Fishel
GrammaMT: Improving Machine Translation with Grammar-Informed In-Context Learning
Rita Ramos, Everlyn Asiko Chimoto, Maartje ter Hoeve, Natalie Schluter
Building Dialogue Understanding Models for Low-resource Language Indonesian from Scratch
Donglin Di, Weinan Zhang, Yue Zhang, Fanglin Wang
Monolingual and Multilingual Misinformation Detection for Low-Resource Languages: A Comprehensive Survey
Xinyu Wang, Wenbo Zhang, Sarah Rajtmajer