Large Language Model
Large language models (LLMs) are sophisticated AI systems designed to process and generate human-like text, with the goal of improving performance across a wide range of natural language processing tasks. Current research focuses on enhancing LLM safety, efficiency (through techniques such as quantization and optimized decoding), and fairness, as well as on improving their ability to perform complex reasoning and follow diverse instructions. These advances are significant because they address critical limitations of current LLMs and pave the way for broader applications in fields such as healthcare, legal tech, and autonomous systems.
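As a concrete illustration of the quantization techniques mentioned above, here is a minimal sketch of per-tensor absmax int8 weight quantization in plain Python. The function names and example values are illustrative only and are not drawn from any of the papers listed below; production LLM quantization typically works per-channel or per-group on full tensors, but the core idea is the same.

# Minimal sketch: symmetric (absmax) int8 quantization of a weight vector.
# Illustrative only; names and values are hypothetical.

def quantize_int8(weights):
    # Per-tensor symmetric scale: the largest |weight| maps to 127.
    scale = max(abs(w) for w in weights) / 127.0 or 1.0  # guard all-zero input
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize_int8(quantized, scale):
    # Recover approximate float weights from the int8 values.
    return [q * scale for q in quantized]

weights = [0.12, -0.5, 0.33, 0.91, -0.07]
q, scale = quantize_int8(weights)
print(q)                          # [17, -70, 46, 127, -10]
print(dequantize_int8(q, scale))  # approximately the original weights

Storing the int8 values plus one float scale cuts memory roughly 4x versus float32 weights, at the cost of the small rounding error visible in the dequantized output.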
Papers
CoBa: Convergence Balancer for Multitask Finetuning of Large Language Models
Zi Gong, Hang Yu, Cong Liao, Bingchang Liu, Chaoyu Chen, Jianguo Li
Steering Large Language Models using Conceptors: Improving Addition-Based Activation Engineering
Joris Postmus, Steven Abreu
Weak-eval-Strong: Evaluating and Eliciting Lateral Thinking of LLMs with Situation Puzzles
Qi Chen, Bowen Zhang, Gang Wang, Qi Wu
Improving Data Efficiency via Curating LLM-Driven Rating Systems
Jinlong Pang, Jiaheng Wei, Ankit Parag Shah, Zhaowei Zhu, Yaxuan Wang, Chen Qian, Yang Liu, Yujia Bao, Wei Wei
Guaranteed Generation from Large Language Models
Minbeom Kim, Thibaut Thonet, Jos Rozen, Hwaran Lee, Kyomin Jung, Marc Dymetman
Learning Evolving Tools for Large Language Models
Guoxin Chen, Zhong Zhang, Xin Cong, Fangda Guo, Yesai Wu, Yankai Lin, Wenzheng Feng, Yasheng Wang
Dissecting Fine-Tuning Unlearning in Large Language Models
Yihuai Hong, Yuelin Zou, Lijie Hu, Ziqian Zeng, Di Wang, Haiqin Yang
TuringQ: Benchmarking AI Comprehension in Theory of Computation
Pardis Sadat Zahraei, Ehsaneddin Asgari
Chip-Tuning: Classify Before Language Models Say
Fangwei Zhu, Dian Li, Jiajun Huang, Gang Liu, Hui Wang, Zhifang Sui
TorchTitan: One-stop PyTorch-native solution for production-ready LLM pre-training
Wanchao Liang, Tianyu Liu, Less Wright, Will Constable, Andrew Gu, Chien-Chin Huang, Iris Zhang, Wei Feng, Howard Huang, Junjie Wang, Sanket Purandare, Gokul Nadathur, Stratos Idreos
AuditWen: An Open-Source Large Language Model for Audit
Jiajia Huang, Haoran Zhu, Chao Xu, Tianming Zhan, Qianqian Xie, Jimin Huang
Composite Learning Units: Generalized Learning Beyond Parameter Updates to Transform LLMs into Adaptive Reasoners
Santosh Kumar Radha, Oktay Goktas
Large Language Model Compression with Neural Architecture Search
Rhea Sanjay Sukthanker, Benedikt Staffler, Frank Hutter, Aaron Klein
Recent advancements in LLM Red-Teaming: Techniques, Defenses, and Ethical Considerations
Tarun Raheja, Nilay Pochhi
MaD-Scientist: AI-based Scientist solving Convection-Diffusion-Reaction Equations Using Massive PINN-Based Prior Data
Mingu Kang, Dongseok Lee, Woojin Cho, Jaehyeon Park, Kookjin Lee, Anthony Gruber, Youngjoon Hong, Noseong Park
DisasterQA: A Benchmark for Assessing the Performance of LLMs in Disaster Response
Rajat Rawat
Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs
Ruijia Niu, Dongxia Wu, Rose Yu, Yi-An Ma
A Survey: Collaborative Hardware and Software Design in the Era of Large Language Models
Cong Guo, Feng Cheng, Zhixu Du, James Kiessling, Jonathan Ku, Shiyu Li, Ziru Li, Mingyuan Ma, Tergel Molom-Ochir, Benjamin Morris, Haoxuan Shan, Jingwei Sun, Yitu Wang, Chiyue Wei, Xueying Wu, Yuhao Wu, Hao Frank Yang, Jingyang Zhang, Junyao Zhang, Qilin Zheng, Guanglei Zhou, Hai (Helen) Li, Yiran Chen
SpaLLM: Unified Compressive Adaptation of Large Language Models with Sketching
Tianyi Zhang, Junda Su, Oscar Wu, Zhaozhuo Xu, Anshumali Shrivastava
ToolBridge: An Open-Source Dataset to Equip LLMs with External Tool Capabilities
Zhenchao Jin, Mengchen Liu, Dongdong Chen, Lingting Zhu, Yunsheng Li, Lequan Yu