Multilingual Capability
Research on multilingual capability in large language models (LLMs) aims to build models that perform well across many languages, countering the current dominance of English-centric systems. Active directions include multilingual instruction tuning, continual pre-training, and manipulation of internal language representations, with the goal of improving performance, especially for low-resource languages, while mitigating catastrophic forgetting and bias. Progress here matters for broadening global access to AI and for equitable access to advanced AI services, informing both the scientific understanding of how models represent language and the development of inclusive real-world applications.
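One of the internal-representation ideas surveyed here is the notion of language-specific neurons (see the paper by Tang et al. below). The following is a minimal, illustrative sketch, not the exact procedure from any listed paper: it flags neurons whose activation probability is high for one language and low for all others. The activation statistics are synthetic, and the thresholds are assumptions chosen for illustration; in practice the probabilities would be measured from a transformer's feed-forward activations over per-language corpora.

```python
# Illustrative sketch: locating neurons that fire frequently for exactly one
# language and rarely for the others. Activation probabilities are synthetic.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 3 languages, 512 neurons.
languages = ["en", "fr", "sw"]
num_neurons = 512

# Synthetic activation probabilities: fraction of tokens on which each neuron
# is active, per language. Shape: (num_languages, num_neurons).
activation_prob = rng.uniform(0.0, 1.0, size=(len(languages), num_neurons))


def language_specific_neurons(probs, high=0.8, low=0.2):
    """Return {language: neuron indices} for neurons that are frequently active
    for one language (prob >= high) and rarely active for the rest
    (prob <= low). Thresholds are illustrative assumptions, not values taken
    from the papers listed below."""
    specific = {}
    for i, lang in enumerate(languages):
        others = np.delete(probs, i, axis=0)          # probabilities for all other languages
        mask = (probs[i] >= high) & (others <= low).all(axis=0)
        specific[lang] = np.flatnonzero(mask)
    return specific


if __name__ == "__main__":
    for lang, idx in language_specific_neurons(activation_prob).items():
        print(f"{lang}: {len(idx)} language-specific neurons")
```

With real activations, the neurons selected this way could then be ablated or amplified to probe how much a model's behavior in a given language depends on them, which is the kind of analysis the language-specific-neuron line of work pursues.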
Papers
Nemotron-4 15B Technical Report
Jupinder Parmar, Shrimai Prabhumoye, Joseph Jennings, Mostofa Patwary, Sandeep Subramanian, Dan Su, Chen Zhu, Deepak Narayanan, Aastha Jhunjhunwala, Ayush Dattagupta, Vibhu Jawa, Jiwei Liu, Ameya Mahabaleshwarkar, Osvald Nitski, Annika Brundyn, James Maki, Miguel Martinez, Jiaxuan You, John Kamalu, Patrick LeGresley, Denys Fridman, Jared Casper, Ashwath Aithal, Oleksii Kuchaiev, Mohammad Shoeybi, Jonathan Cohen, Bryan Catanzaro
Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models
Tianyi Tang, Wenyang Luo, Haoyang Huang, Dongdong Zhang, Xiaolei Wang, Xin Zhao, Furu Wei, Ji-Rong Wen
Having Beer after Prayer? Measuring Cultural Bias in Large Language Models
Tarek Naous, Michael J. Ryan, Alan Ritter, Wei Xu
Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models
Orevaoghene Ahia, Sachin Kumar, Hila Gonen, Jungo Kasai, David R. Mortensen, Noah A. Smith, Yulia Tsvetkov