New Language
Research on new-language development focuses on efficiently adapting existing large language models (LLMs) and speech recognition systems to previously unseen languages, often with limited training data. Current efforts concentrate on techniques such as low-rank adaptation (LoRA), instruction tuning, and cross-lingual knowledge transfer, typically built on transformer architectures and paired with efficient vocabulary-management strategies. This work is crucial for expanding access to language technologies for under-resourced languages and for improving multilingual applications such as machine translation, speech recognition, and information extraction.
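The low-rank adaptation (LoRA) technique mentioned above can be sketched in a few lines. The idea is to freeze a pretrained weight matrix W and learn only a low-rank update B·A, which is what makes adapting a large model to a new language cheap in parameters. This is a minimal NumPy illustration, not any specific library's implementation; all dimensions and names below are made up for the example.

```python
import numpy as np

# Minimal LoRA sketch: instead of updating a full weight matrix
# W (d_out x d_in), learn two small factors A (r x d_in) and B (d_out x r)
# with rank r << min(d_out, d_in). The adapted forward pass is
#   y = W x + (alpha / r) * B (A x)
# All sizes here are illustrative, not from any particular model.

rng = np.random.default_rng(0)

d_in, d_out, r, alpha = 64, 64, 4, 8   # hypothetical dimensions
W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable low-rank factor
B = np.zeros((d_out, r))                    # zero-initialised, so the
                                            # adapter starts as a no-op

def lora_forward(x):
    """Forward pass with the low-rank update applied on top of frozen W."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B = 0 the adapted output equals the frozen model's output.
assert np.allclose(lora_forward(x), W @ x)

# Trainable parameters: r * (d_in + d_out) instead of d_in * d_out.
print("LoRA params:", r * (d_in + d_out), "vs full:", d_in * d_out)
```

The zero initialisation of B is the standard choice: it guarantees the adapted model starts out identical to the base model, and only training moves it toward the new language.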
Papers
SambaLingo: Teaching Large Language Models New Languages
Zoltan Csaki, Bo Li, Jonathan Li, Qiantong Xu, Pian Pawakapan, Leon Zhang, Yun Du, Hengyu Zhao, Changran Hu, Urmish Thakker
Is English the New Programming Language? How About Pseudo-code Engineering?
Gian Alexandre Michaelsen, Renato P. dos Santos