Language Adaptation
Language adaptation in large language models (LLMs) focuses on efficiently transferring knowledge from high-resource to low-resource languages, improving performance across diverse tasks and languages. Current research explores techniques such as vocabulary adaptation (e.g., modifying Byte-Pair Encoding vocabularies or applying cross-lingual transfer), model merging to mitigate catastrophic forgetting, and efficient training strategies (e.g., lower-precision training). These advances broaden the accessibility and utility of LLMs, fostering inclusivity in natural language processing and enabling applications in a wider range of languages and dialects.
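To illustrate the vocabulary-adaptation idea mentioned above, the sketch below extends a pretrained tokenizer with target-language subword tokens and resizes the model's embedding matrix accordingly. The model name, token list, and mean-of-subwords initialization are illustrative assumptions, not details taken from the papers listed here.

```python
# Minimal vocabulary-adaptation sketch (illustrative; model name and token list
# are assumptions, not drawn from the papers below).
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "gpt2"  # stand-in for any pretrained causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical subword pieces mined from a target-language (e.g., Turkish) corpus.
new_tokens = ["çalış", "öğren", "geliş", "yapılan"]

# Record how the original tokenizer splits each new token before adding them.
old_pieces = {t: tokenizer(t, add_special_tokens=False)["input_ids"] for t in new_tokens}

tokenizer.add_tokens(new_tokens)               # extend the vocabulary
model.resize_token_embeddings(len(tokenizer))  # grow the embedding matrix

# Heuristic initialization: set each new token's embedding to the mean of the
# embeddings of the subword pieces it previously decomposed into.
with torch.no_grad():
    emb = model.get_input_embeddings().weight
    for tok in new_tokens:
        new_id = tokenizer.convert_tokens_to_ids(tok)
        emb[new_id] = emb[old_pieces[tok]].mean(dim=0)
```

In practice, continued pretraining or fine-tuning on target-language text would typically follow, so that both the newly initialized embeddings and the rest of the model adapt to the new language.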
Papers
Bridging the Bosphorus: Advancing Turkish Large Language Models through Strategies for Low-Resource Language Adaptation and Benchmarking
Emre Can Acikgoz, Mete Erdogan, Deniz Yuret
MEDVOC: Vocabulary Adaptation for Fine-tuning Pre-trained Language Models on Medical Text Summarization
Gunjan Balde, Soumyadeep Roy, Mainack Mondal, Niloy Ganguly