Large Language Model
Large language models (LLMs) are AI systems trained to process and generate human-like text across a wide range of natural language processing tasks. Current research focuses on improving LLM safety, efficiency (through techniques such as quantization and optimized decoding), and fairness, as well as their ability to reason through complex problems and follow diverse instructions. This work addresses key limitations of current LLMs and supports broader applications in fields such as healthcare, legal tech, and autonomous systems.
Papers
Fine-Tuning LLMs for Reliable Medical Question-Answering Services
Ali Anaissi, Ali Braytee, Junaid Akram
CartesianMoE: Boosting Knowledge Sharing among Experts via Cartesian Product Routing in Mixture-of-Experts
Zhenpeng Su, Xing Wu, Zijia Lin, Yizhe Xiong, Minxuan Lv, Guangyuan Ma, Hui Chen, Songlin Hu, Guiguang Ding
Are Language Model Logits Calibrated?
Charles Lovering, Michael Krumdick, Viet Dac Lai, Nilesh Kumar, Varshini Reddy, Rik Koncel-Kedziorski, Chris Tanner
Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
Yu Zhao, Alessio Devoto, Giwon Hong, Xiaotang Du, Aryo Pradipta Gema, Hongru Wang, Kam-Fai Wong, Pasquale Minervini
Large Language Models for Cross-lingual Emotion Detection
Ram Mohan Rao Kadiyala
Self-Explained Keywords Empower Large Language Models for Code Generation
Lishui Fan, Mouxiang Chen, Zhongxin Liu
CausalGraph2LLM: Evaluating LLMs for Causal Queries
Ivaxi Sheth, Bahare Fatemi, Mario Fritz
Mesa-Extrapolation: A Weave Position Encoding Method for Enhanced Extrapolation in LLMs
Xin Ma, Yang Liu, Jingjing Liu, Xiaoxu Ma
LLM4GRN: Discovering Causal Gene Regulatory Networks with LLMs -- Evaluation through Synthetic Data Generation
Tejumade Afonja, Ivaxi Sheth, Ruta Binkyte, Waqar Hanif, Thomas Ulas, Matthias Becker, Mario Fritz
Mitigating Hallucinations of Large Language Models in Medical Information Extraction via Contrastive Decoding
Derong Xu, Ziheng Zhang, Zhihong Zhu, Zhenxi Lin, Qidong Liu, Xian Wu, Tong Xu, Xiangyu Zhao, Yefeng Zheng, Enhong Chen
DomainSum: A Hierarchical Benchmark for Fine-Grained Domain Shift in Abstractive Text Summarization
Haohan Yuan, Haopeng Zhang
Revealing and Mitigating the Local Pattern Shortcuts of Mamba
Wangjie You, Zecheng Tang, Juntao Li, Lili Yao, Min Zhang
Long Term Memory: The Foundation of AI Self-Evolution
Xun Jiang, Feng Li, Han Zhao, Jiaying Wang, Jun Shao, Shihao Xu, Shu Zhang, Weiling Chen, Xavier Tang, Yize Chen, Mengyue Wu, Weizhi Ma, Mengdi Wang, Tianqiao Chen
Understanding and Alleviating Memory Consumption in RLHF for LLMs
Jin Zhou, Hanmei Yang, Steven (Jiaxun) Tang, Mingcan Xiang, Hui Guan, Tongping Liu
Boosting Jailbreak Transferability for Large Language Models
Hanqing Liu, Lifeng Zhou, Huanqian Yan
Guardians of Discourse: Evaluating LLMs on Multilingual Offensive Language Detection
Jianfei He, Lilin Wang, Jiaying Wang, Zhenyu Liu, Hongbin Na, Zimu Wang, Wei Wang, Qi Chen
On The Global Convergence Of Online RLHF With Neural Parametrization
Mudit Gaur, Amrit Singh Bedi, Raghu Pasupathy, Vaneet Aggarwal
A Comprehensive Survey of Datasets, Theories, Variants, and Applications in Direct Preference Optimization
Wenyi Xiao, Zechuan Wang, Leilei Gan, Shuai Zhao, Wanggui He, Luu Anh Tuan, Long Chen, Hao Jiang, Zhou Zhao, Fei Wu
Stacking Small Language Models for Generalizability
Laurence Liang
Pruning Foundation Models for High Accuracy without Retraining
Pu Zhao, Fei Sun, Xuan Shen, Pinrui Yu, Zhenglun Kong, Yanzhi Wang, Xue Lin