Large Language Model
Large language models (LLMs) are sophisticated AI systems designed to process and generate human-like text across a wide range of natural language processing tasks. Current research focuses on enhancing LLM safety, efficiency (through techniques such as quantization and optimized decoding), and fairness, as well as improving their ability to perform complex reasoning and follow diverse instructions. These advances address critical limitations of current LLMs and pave the way for broader applications across fields such as healthcare, legal technology, and autonomous systems.
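As a concrete illustration of the quantization techniques mentioned above, the sketch below shows symmetric per-tensor int8 weight quantization in NumPy. The function names and the per-tensor scheme are illustrative assumptions for this overview, not the method of any specific paper listed here.

import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization of a weight matrix (illustrative sketch)."""
    # The scale maps the largest absolute weight onto the int8 range [-127, 127].
    scale = np.max(np.abs(weights)) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float32 weight matrix from its int8 form."""
    return q.astype(np.float32) * scale

# Example: quantize a random weight matrix and measure the rounding error it introduces.
w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)
print("mean absolute error:", np.abs(w - w_hat).mean())

Storing weights as int8 with a single float scale cuts memory roughly fourfold versus float32, which is one reason quantization appears as an efficiency direction in the research summarized above.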
Papers
Distilling Multi-modal Large Language Models for Autonomous Driving
Deepti Hegde, Rajeev Yasarla, Hong Cai, Shizhong Han, Apratim Bhattacharyya, Shweta Mahajan, Litian Liu, Risheek Garrepalli, Vishal M. Patel, Fatih Porikli
OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking
Zekun Xi, Wenbiao Yin, Jizhan Fang, Jialong Wu, Runnan Fang, Ningyu Zhang, Jiang Yong, Pengjun Xie, Fei Huang, Huajun Chen
The Heap: A Contamination-Free Multilingual Code Dataset for Evaluating Large Language Models
Jonathan Katzy, Razvan Mihai Popescu, Arie van Deursen, Maliheh Izadi
LLM-Based Routing in Mixture of Experts: A Novel Framework for Trading
Kuan-Ming Liu (1), Ming-Chih Lo (2) ((1) National Chengchi University, College of Commerce, (2) National Yang Ming Chiao Tung University, College of Computer Science)
Empowering Large Language Models in Wireless Communication: A Novel Dataset and Fine-Tuning Framework
Yushen Lin, Ruichen Zhang, Wenqi Huang, Kaidi Wang, Zhiguo Ding, Daniel K. C. So, Dusit Niyato
Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment
Chaoqi Wang, Zhuokai Zhao, Yibo Jiang, Zhaorun Chen, Chen Zhu, Yuxin Chen, Jiayi Liu, Lizhu Zhang, Xiangjun Fan, Hao Ma, Sinong Wang
Confidence Estimation for Error Detection in Text-to-SQL Systems
Oleg Somov, Elena Tutubalina
Augmenting a Large Language Model with a Combination of Text and Visual Data for Conversational Visualization of Global Geospatial Data
Omar Mena, Alexandre Kouyoumdjian, Lonni Besançon, Michael Gleicher, Ivan Viola, Anders Ynnerman
A Survey on Responsible LLMs: Inherent Risk, Malicious Use, and Mitigation Strategy
Huandong Wang, Wenjie Fu, Yingzhou Tang, Zhilong Chen, Yuxi Huang, Jinghua Piao, Chen Gao, Fengli Xu, Tao Jiang, Yong Li
FASP: Fast and Accurate Structured Pruning of Large Language Models
Hanyu Hu, Pengxiang Zhao, Ping Li, Yi Zheng, Zhefeng Wang, Xiaoming Yuan
MoE$^2$: Optimizing Collaborative Inference for Edge Large Language Models
Lyudong Jin, Yanning Zhang, Yanhan Li, Shurong Wang, Howard H. Yang, Jian Wu, Meng Zhang
Aligning Instruction Tuning with Pre-training
Yiming Liang, Tianyu Zheng, Xinrun Du, Ge Zhang, Xingwei Qu, Xiang Yue, Chujie Zheng, Jiaheng Liu, Lei Ma, Wenhu Chen, Guoyin Wang, Zhaoxiang Zhang, Wenhao Huang, Jiajun Zhang
Rational Tuning of LLM Cascades via Probabilistic Modeling
Michael J. Zellinger, Matt Thomson
A Study of In-Context-Learning-Based Text-to-SQL Errors
Jiawei Shen, Chengcheng Wan, Ruoyi Qiao, Jiazhen Zou, Hang Xu, Yuchen Shao, Yueling Zhang, Weikai Miao, Geguang Pu
To Retrieve or Not to Retrieve? Uncertainty Detection for Dynamic Retrieval Augmented Generation
Kaustubh D. Dhole
Large Language Model is Secretly a Protein Sequence Optimizer
Yinkai Wang, Jiaxing He, Yuanqi Du, Xiaohui Chen, Jianan Canal Li, Li-Ping Liu, Xiaolin Xu, Soha Hassoun
Perspective Transition of Large Language Models for Solving Subjective Tasks
Xiaolong Wang, Yuanchi Zhang, Ziyue Wang, Yuzhuang Xu, Fuwen Luo, Yile Wang, Peng Li, Yang Liu
Foundations of Large Language Models
Tong Xiao, Jingbo Zhu
FineMedLM-o1: Enhancing the Medical Reasoning Ability of LLM from Supervised Fine-Tuning to Test-Time Training
Hongzhou Yu, Tianhao Cheng, Ying Cheng, Rui Feng