Open-Source Large Language Models
Open-source large language models (LLMs) aim to provide accessible and customizable alternatives to proprietary models, fostering research and development while addressing concerns about data privacy and vendor lock-in. Current research focuses on adapting these models to specific languages and domains (e.g., Romanian, medicine, finance), improving their reasoning capabilities through techniques like retrieval-augmented generation and mixture-of-experts architectures, and optimizing their deployment efficiency on various hardware. This burgeoning field significantly impacts both the scientific community, by enabling broader participation in LLM research, and practical applications, offering cost-effective and adaptable solutions for diverse tasks ranging from question answering to code generation.
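One of the reasoning-improvement techniques mentioned above, retrieval-augmented generation, can be illustrated with a minimal sketch: retrieve the documents most similar to a query, then prepend them as context to the prompt handed to the model. The toy corpus, bag-of-words scoring, and prompt template below are illustrative assumptions, not any particular system's implementation.

```python
# Minimal retrieval-augmented generation (RAG) sketch:
# score documents against the query, keep the top-k, and build
# a context-augmented prompt for a downstream language model.
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k corpus documents most similar to the query."""
    q = Counter(query.lower().split())
    ranked = sorted(corpus,
                    key=lambda d: cosine(q, Counter(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def rag_prompt(query: str, corpus: list[str]) -> str:
    """Prepend retrieved context to the question before generation."""
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# Toy corpus standing in for a real document store.
corpus = [
    "LLaMA is an open-source large language model family.",
    "Mixture-of-experts routes tokens to specialist subnetworks.",
    "Instruction tuning aligns a model with user intent.",
]
print(rag_prompt("what is mixture-of-experts routing", corpus))
```

In a production system the bag-of-words scorer would be replaced by dense embeddings and a vector index, but the retrieve-then-prompt structure is the same.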
Papers
Chat Vector: A Simple Approach to Equip LLMs with Instruction Following and Model Alignment in New Languages
Shih-Cheng Huang, Pin-Zu Li, Yu-Chi Hsu, Kuang-Ming Chen, Yu Tung Lin, Shih-Kai Hsiao, Richard Tzong-Han Tsai, Hung-yi Lee
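The "chat vector" in the title refers to simple weight arithmetic: subtract a base model's weights from its instruction-tuned counterpart, and add the resulting delta to a model continually pre-trained on a new language. The toy tensors below are stand-ins for real checkpoints, which share this structure at much larger scale.

```python
# Toy sketch of chat-vector arithmetic. Each "model" is a dict of
# small weight lists; real models hold many large tensors but the
# arithmetic is elementwise in exactly the same way.

def diff(tuned: dict, base: dict) -> dict:
    """chat_vector = weights(instruction-tuned) - weights(base)."""
    return {k: [t - b for t, b in zip(tuned[k], base[k])] for k in base}

def add(model: dict, vector: dict) -> dict:
    """new_model = weights(target) + chat_vector."""
    return {k: [m + v for m, v in zip(model[k], vector[k])] for k in model}

base = {"w": [1.0, 2.0]}         # base pre-trained model
instruct = {"w": [1.5, 1.0]}     # its instruction-tuned counterpart
cp_new_lang = {"w": [2.0, 3.0]}  # continually pre-trained on a new language

chat_vector = diff(instruct, base)          # delta encoding chat ability
chat_model = add(cp_new_lang, chat_vector)  # new-language model + chat ability
print(chat_model)
```

The appeal is that the delta is computed once from an existing instruction-tuned pair, so no instruction data in the new language is needed.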
FinGPT: Instruction Tuning Benchmark for Open-Source Large Language Models in Financial Datasets
Neng Wang, Hongyang Yang, Christina Dan Wang
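Instruction tuning on financial datasets, as benchmarked in FinGPT, generally starts by reformatting labeled data into instruction records. A hypothetical sketch of that conversion for a sentiment-labeled headline dataset (the field names and instruction wording are illustrative assumptions, not FinGPT's actual schema):

```python
# Hedged sketch: converting labeled financial-sentiment data into
# (instruction, input, output) records for instruction tuning.
# The record schema and wording are illustrative, not FinGPT's.
import json

def to_instruction_record(headline: str, label: str) -> dict:
    """Wrap one labeled example in an instruction-tuning record."""
    return {
        "instruction": "Classify the sentiment of this financial "
                       "headline as positive, negative, or neutral.",
        "input": headline,
        "output": label,
    }

# Toy labeled examples standing in for a real dataset.
raw = [
    ("Shares surge after record quarterly earnings", "positive"),
    ("Regulator fines bank over compliance failures", "negative"),
]
records = [to_instruction_record(h, l) for h, l in raw]
print(json.dumps(records[0], indent=2))
```

Records in this shape can then be fed to standard supervised fine-tuning pipelines for open-source LLMs.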