Large Language Model
Large language models (LLMs) are sophisticated AI systems designed to process and generate human-like text across a wide range of natural language processing tasks. Current research focuses on enhancing LLM safety, efficiency (through techniques such as quantization and optimized decoding), and fairness, as well as on improving their ability to perform complex reasoning and follow diverse instructions. These advances matter because they address critical limitations of current LLMs and pave the way for broader applications across fields such as healthcare, legal tech, and autonomous systems.
Papers
Don't Do RAG: When Cache-Augmented Generation is All You Need for Knowledge Tasks
Brian J Chan, Chao-Ting Chen, Jui-Hung Cheng, Hen-Hsen Huang
In-context Continual Learning Assisted by an External Continual Learner
Saleh Momeni, Sahisnu Mazumder, Zixuan Ke, Bing Liu
Mitigating Social Bias in Large Language Models: A Multi-Objective Approach within a Multi-Agent Framework
Zhenjie Xu (1), Wenqing Chen (1), Yi Tang (1), Xuanying Li (2), Cheng Hu (1), Zhixuan Chu (3), Kui Ren (3), Zibin Zheng (1), Zhichao Lu (4) ((1) School of Software Engineering, Sun Yat-sen University, (2) School of Physics and Astronomy, Sun Yat-sen University, (3) School of Cyber Science and Technology, Zhejiang University, (4) Department of Computer Science, City University of Hong Kong)
TL-Training: A Task-Feature-Based Framework for Training Large Language Models in Tool Use
Junjie Ye, Yilong Wu, Sixian Li, Yuming Yang, Tao Gui, Qi Zhang, Xuanjing Huang, Peng Wang, Zhongchao Shi, Jianping Fan, Zhengyin Du
Continual Learning Using Only Large Language Model Prompting
Jiabao Qiu, Zixuan Ke, Bing Liu
Why Do Large Language Models (LLMs) Struggle to Count Letters?
Tairan Fu, Raquel Ferrando, Javier Conde, Carlos Arriaga, Pedro Reviriego
Time Will Tell: Timing Side Channels via Output Token Count in Large Language Models
Tianchen Zhang, Gururaj Saileshwar, David Lie
Systematic Evaluation of Long-Context LLMs on Financial Concepts
Lavanya Gupta, Saket Sharma, Yiyun Zhao
MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark
Qihao Zhao, Yangyu Huang, Tengchao Lv, Lei Cui, Qinzheng Sun, Shaoguang Mao, Xin Zhang, Ying Xin, Qiufeng Yin, Scarlett Li, Furu Wei
HPC-Coder-V2: Studying Code LLMs Across Low-Resource Parallel Languages
Aman Chaturvedi, Daniel Nichols, Siddharth Singh, Abhinav Bhatele
Adaptive Pruning for Large Language Models with Structural Importance Awareness
Haotian Zheng, Jinke Ren, Yushan Sun, Ruichen Zhang, Wenbo Zhang, Zhen Li, Dusit Niyato, Shuguang Cui, Yatong Han
Associative memory inspires improvements for in-context learning using a novel attention residual stream architecture
Thomas F Burns, Tomoki Fukai, Christopher J Earls
ConfliBERT: A Language Model for Political Conflict
Patrick T. Brandt, Sultan Alsarra, Vito J. D'Orazio, Dagmar Heintze, Latifur Khan, Shreyas Meher, Javier Osorio, Marcus Sianan
LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps
Felix Friedrich, Simone Tedeschi, Patrick Schramowski, Manuel Brack, Roberto Navigli, Huu Nguyen, Bo Li, Kristian Kersting
Knowledge Injection via Prompt Distillation
Kalle Kujanpää, Harri Valpola, Alexander Ilin
Dehallucinating Parallel Context Extension for Retrieval-Augmented Generation
Zexiong Ma, Shengnan An, Zeqi Lin, Yanzhen Zou, Jian-Guang Lou, Bing Xie
Think&Cite: Improving Attributed Text Generation with Self-Guided Tree Search and Progress Reward Modeling
Junyi Li, Hwee Tou Ng
DS$^2$-ABSA: Dual-Stream Data Synthesis with Label Refinement for Few-Shot Aspect-Based Sentiment Analysis
Hongling Xu, Yice Zhang, Qianlong Wang, Ruifeng Xu
Helping LLMs Improve Code Generation Using Feedback from Testing and Static Analysis
Greta Dolcetti, Vincenzo Arceri, Eleonora Iotti, Sergio Maffeis, Agostino Cortesi, Enea Zaffanella
ResoFilter: Fine-grained Synthetic Data Filtering for Large Language Models through Data-Parameter Resonance Analysis
Zeao Tu, Xiangdi Meng, Yu He, Zihan Yao, Tianyu Qi, Jun Liu, Ming Li