Large Language Model
Large language models (LLMs) are sophisticated AI systems that process and generate human-like text across a wide range of natural language processing tasks. Current research focuses on enhancing LLM safety, efficiency (through techniques such as quantization and optimized decoding), and fairness, as well as improving their ability to perform complex reasoning and follow diverse instructions. These advances matter because they address critical limitations of today's models and pave the way for broader applications in fields such as healthcare, legal technology, and autonomous systems.
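To make the quantization idea concrete, here is a minimal sketch of symmetric 8-bit weight quantization in Python. It is a generic illustration, not the method of any paper listed below; the function names and the per-tensor scaling choice are assumptions for exposition.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: map floats onto [-127, 127]."""
    # One shared scale for the whole tensor (a common, simple choice).
    scale = max(np.abs(w).max() / 127.0, 1e-12)  # guard against all-zero tensors
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover a float approximation of the original weights."""
    return q.astype(np.float32) * scale

# Toy usage: quantize a random weight matrix and check the reconstruction error.
w = np.random.randn(64, 64).astype(np.float32)
q, s = quantize_int8(w)
print("max abs error:", float(np.abs(w - dequantize(q, s)).max()))
```

Storing weights as int8 plus one float scale cuts memory roughly 4x versus float32, at the cost of the small rounding error measured above; production schemes refine this with per-channel or grouped scales.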
Papers
Adaptive Circuit Behavior and Generalization in Mechanistic Interpretability
Jatin Nainani, Sankaran Vaidyanathan, AJ Yeung, Kartik Gupta, David Jensen
SAGEval: The frontiers of Satisfactory Agent based NLG Evaluation for reference-free open-ended text
Reshmi Ghosh, Tianyi Yao, Lizzy Chen, Sadid Hasan, Tianwei Chen, Dario Bernal, Huitian Jiao, H M Sajjad Hossain
Investigating Factuality in Long-Form Text Generation: The Roles of Self-Known and Self-Unknown
Lifu Tu, Rui Meng, Shafiq Joty, Yingbo Zhou, Semih Yavuz
Anda: Unlocking Efficient LLM Inference with a Variable-Length Grouped Activation Data Format
Chao Fang, Man Shi, Robin Geens, Arne Symons, Zhongfeng Wang, Marian Verhelst
Efficient and Private: Memorisation under differentially private parameter-efficient fine-tuning in language models
Olivia Ma, Jonathan Passerat-Palmbach, Dmitrii Usynin
LeMoLE: LLM-Enhanced Mixture of Linear Experts for Time Series Forecasting
Lingzheng Zhang, Lifeng Shen, Yimin Zheng, Shiyuan Piao, Ziyue Li, Fugee Tsung
LoRA-Mini: Adaptation Matrices Decomposition and Selective Training
Ayush Singh, Rajdeep Aher, Shivank Garg
A Method for Building Large Language Models with Predefined KV Cache Capacity
Zhonghua Yi, Ge Niu, Lei Wang, Wei Tang, Liqiu Zhang
TableTime: Reformulating Time Series Classification as Zero-Shot Table Understanding via Large Language Models
Jiahao Wang, Mingyue Cheng, Qingyang Mao, Qi Liu, Feiyang Xu, Xin Li, Enhong Chen
LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training
Xiaoye Qu, Daize Dong, Xuyang Hu, Tong Zhu, Weigao Sun, Yu Cheng
Text-to-SQL Calibration: No Need to Ask -- Just Rescale Model Probabilities
Ashwin Ramachandran, Sunita Sarawagi
Multi-label Sequential Sentence Classification via Large Language Model
Mengfei Lan, Lecheng Zheng, Shufan Ming, Halil Kilicoglu
Do LLMs Agree on the Creativity Evaluation of Alternative Uses?
Abdullah Al Rabeyah, Fabrício Góes, Marco Volpe, Talles Medeiros
Reassessing Layer Pruning in LLMs: New Insights and Methods
Yao Lu, Hao Cheng, Yujie Fang, Zeyu Wang, Jiaheng Wei, Dongwei Xu, Qi Xuan, Xiaoniu Yang, Zhaowei Zhu
ChemSafetyBench: Benchmarking LLM Safety on Chemistry Domain
Haochen Zhao, Xiangru Tang, Ziran Yang, Xiao Han, Xuanzhi Feng, Yueqing Fan, Senhao Cheng, Di Jin, Yilun Zhao, Arman Cohan, Mark Gerstein
Large Language Model with Region-guided Referring and Grounding for CT Report Generation
Zhixuan Chen, Yequan Bie, Haibo Jin, Hao Chen
Towards Robust Evaluation of Unlearning in LLMs via Data Transformations
Abhinav Joshi, Shaswati Saha, Divyaksh Shukla, Sriram Vema, Harsh Jhamtani, Manas Gaur, Ashutosh Modi
PPLqa: An Unsupervised Information-Theoretic Quality Metric for Comparing Generative Large Language Models
Gerald Friedland, Xin Huang, Yueying Cui, Vishaal Kapoor, Ashish Khetan, Sanjiv Das
Efficient Pruning of Text-to-Image Models: Insights from Pruning Stable Diffusion
Samarth N Ramesh, Zhixue Zhao
Sycophancy in Large Language Models: Causes and Mitigations
Lars Malmqvist