Commercial Large Language Model
Commercial large language models (LLMs) are AI systems designed for a wide range of natural language processing tasks. Research on them focuses on improving their alignment with human preferences, evaluating their performance in specific domains (e.g., medical summarization, language proficiency assessment), and mitigating biases and vulnerabilities such as jailbreaks. Current work also investigates ways to improve reliability and reduce costs, including instruction tuning, parameter-efficient fine-tuning, and token compression for retrieval-augmented models. These advances enable automated evaluation, improved program repair, and more equitable access to AI-powered tools, while underscoring the need for robust auditing and bias mitigation strategies.
Papers
GermanPartiesQA: Benchmarking Commercial Large Language Models for Political Bias and Sycophancy
Jan Batzner, Volker Stocker, Stefan Schmid, Gjergji Kasneci
Closing the gap between open-source and commercial large language models for medical evidence summarization
Gongbo Zhang, Qiao Jin, Yiliang Zhou, Song Wang, Betina R. Idnay, Yiming Luo, Elizabeth Park, Jordan G. Nestor, Matthew E. Spotnitz, Ali Soroush, Thomas Campion, Zhiyong Lu, Chunhua Weng, Yifan Peng