Open Source Language Model
Open-source large language models (LLMs) aim to democratize access to and research on powerful language AI by making model weights, training data, and code publicly available. Current research focuses on improving model performance through techniques such as fine-tuning on specialized datasets (e.g., for medical summarization or mathematical reasoning), merging models to combine their strengths, and addressing security vulnerabilities such as backdoor attacks. This open approach fosters collaboration, accelerates innovation, and enables applications across diverse fields, including healthcare, education, and scientific research, while also raising important questions about data bias, model safety, and ethics.
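Because the weights of open-source models are directly accessible, techniques like model merging can be done with a few lines of code. The sketch below illustrates one simple merging strategy, linear weight interpolation between two fine-tunes of the same base architecture; the checkpoint names ("org/model-a", "org/model-b") and the mixing coefficient alpha are illustrative assumptions, not taken from any of the papers listed here.

```python
from transformers import AutoModelForCausalLM

# Hypothetical checkpoints; substitute any two fine-tunes of the same
# base architecture (their state dicts must share keys and shapes).
model_a = AutoModelForCausalLM.from_pretrained("org/model-a")
model_b = AutoModelForCausalLM.from_pretrained("org/model-b")

alpha = 0.5  # interpolation weight: 0 keeps model_a, 1 keeps model_b
state_a, state_b = model_a.state_dict(), model_b.state_dict()

# Interpolate floating-point parameters; copy non-float buffers
# (e.g., integer position ids) unchanged from model_a.
merged = {
    name: (1 - alpha) * state_a[name] + alpha * state_b[name]
    if state_a[name].is_floating_point()
    else state_a[name]
    for name in state_a
}

model_a.load_state_dict(merged)
model_a.save_pretrained("merged-model")
```

With alpha = 0.5 this is a uniform weight average; values nearer 0 or 1 bias the merge toward one parent model.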
Papers
MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs
Zimu Lu, Aojun Zhou, Houxing Ren, Ke Wang, Weikang Shi, Junting Pan, Mingjie Zhan, Hongsheng Li
CodeS: Towards Building Open-source Language Models for Text-to-SQL
Haoyang Li, Jing Zhang, Hanbing Liu, Ju Fan, Xiaokang Zhang, Jun Zhu, Renjie Wei, Hongyan Pan, Cuiping Li, Hong Chen
Scalable Extraction of Training Data from (Production) Language Models
Milad Nasr, Nicholas Carlini, Jonathan Hayase, Matthew Jagielski, A. Feder Cooper, Daphne Ippolito, Christopher A. Choquette-Choo, Eric Wallace, Florian Tramèr, Katherine Lee
The Claire French Dialogue Dataset
Julie Hunter, Jérôme Louradour, Virgile Rennard, Ismaïl Harrando, Guokan Shang, Jean-Pierre Lorré
BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model
Nolan Dey, Daria Soboleva, Faisal Al-Khateeb, Bowen Yang, Ribhu Pathria, Hemant Khachane, Shaheer Muhammad, Zhiming Chen, Robert Myers, Jacob Robert Steeves, Natalia Vassilieva, Marvin Tom, Joel Hestness
OpenChat: Advancing Open-source Language Models with Mixed-Quality Data
Guan Wang, Sijie Cheng, Xianyuan Zhan, Xiangang Li, Sen Song, Yang Liu