Open-Source Large Language Models
Open-source large language models (LLMs) aim to provide accessible, customizable alternatives to proprietary models, fostering research and development while addressing concerns about data privacy and vendor lock-in. Current research focuses on adapting these models to specific languages and domains (e.g., Romanian, medicine, finance), improving their reasoning through techniques such as retrieval-augmented generation and mixture-of-experts architectures, and optimizing deployment efficiency across diverse hardware. This rapidly growing field benefits the scientific community by enabling broader participation in LLM research, and it yields cost-effective, adaptable solutions for practical tasks ranging from question answering to code generation.
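To make the retrieval-augmented generation pattern mentioned above concrete, here is a minimal Python sketch. Everything in it is illustrative: the toy bag-of-words retriever, the three-passage corpus, and the generate() placeholder (which stands in for a call to any locally hosted open-source LLM) are assumptions for demonstration, not the method of any paper listed below.

```python
# Minimal retrieval-augmented generation (RAG) sketch:
# retrieve the passages most similar to the query, then
# prepend them as context to the prompt sent to the model.
import math
from collections import Counter

def bow(text: str) -> Counter:
    """Toy bag-of-words vector for lexical retrieval."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k passages most lexically similar to the query."""
    q = bow(query)
    return sorted(corpus, key=lambda p: cosine(q, bow(p)), reverse=True)[:k]

def generate(prompt: str) -> str:
    # Placeholder: in practice this would call an open-source LLM
    # (e.g., via a local inference server). Here it only echoes a stub.
    return f"[model output for a {len(prompt)}-character prompt]"

corpus = [
    "Mixture-of-experts layers route each token to a small subset of experts.",
    "Retrieval-augmented generation grounds answers in fetched passages.",
    "Instruction tuning adapts a base model to follow user requests.",
]

query = "How does retrieval-augmented generation improve answers?"
context = "\n".join(retrieve(query, corpus))
prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
print(generate(prompt))
```

The design point the sketch illustrates is the separation of concerns: retrieval quality and generation quality can be improved independently, and production systems typically swap the lexical scorer for dense embeddings while keeping the same retrieve-then-generate loop.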
Papers
Towards Building Multilingual Language Model for Medicine
Pengcheng Qiu, Chaoyi Wu, Xiaoman Zhang, Weixiong Lin, Haicheng Wang, Ya Zhang, Yanfeng Wang, Weidi Xie
Semantic Mirror Jailbreak: Genetic Algorithm Based Jailbreak Prompts Against Open-source LLMs
Xiaoxia Li, Siyuan Liang, Jiyi Zhang, Han Fang, Aishan Liu, Ee-Chien Chang
BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains
Yanis Labrak, Adrien Bazoge, Emmanuel Morin, Pierre-Antoine Gourraud, Mickael Rouvier, Richard Dufour
OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset
Shubham Toshniwal, Ivan Moshkov, Sean Narenthiran, Daria Gitman, Fei Jia, Igor Gitman