Unknown Language
Research on unknown languages focuses on developing and evaluating computational methods to analyze and process text and speech across a wide range of languages, particularly those with limited digital resources. Current efforts concentrate on improving large language models (LLMs) for multilingual tasks, including translation, question answering, and toxicity detection, often employing techniques such as self-supervised learning, preference tuning, and multilingual feedback mechanisms. This work is important for extending natural language processing capabilities globally, enabling more equitable access to language technology and supporting cross-cultural understanding in both scientific research and practical applications.
Papers
Be My Donor. Transfer the NLP Datasets Between the Languages Using LLM
Dmitrii Popov, Egor Terentev, Igor Buyanov
CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models
Shangda Wu, Yashan Wang, Ruibin Yuan, Zhancheng Guo, Xu Tan, Ge Zhang, Monan Zhou, Jing Chen, Xuefeng Mu, Yuejie Gao, Yuanliang Dong, Jiafeng Liu, Xiaobing Li, Feng Yu, Maosong Sun