Large Language Model
Large language models (LLMs) are AI systems designed to process and generate human-like text, and they are applied across a wide range of natural language processing tasks. Current research focuses on enhancing LLM safety, efficiency (through techniques such as quantization and optimized decoding), and fairness, as well as improving their ability to perform complex reasoning and follow diverse instructions. This work matters because it addresses critical limitations of current LLMs and paves the way for broader applications in fields such as healthcare, legal tech, and autonomous systems.
Papers
KnowledgeSG: Privacy-Preserving Synthetic Text Generation with Knowledge Distillation from Server
Wenhao Wang, Xiaoyu Liang, Rui Ye, Jingyi Chai, Siheng Chen, Yanfeng Wang
Enhancing Temporal Modeling of Video LLMs via Time Gating
Zi-Yuan Hu, Yiwu Zhong, Shijia Huang, Michael R. Lyu, Liwei Wang
Copiloting Diagnosis of Autism in Real Clinical Scenarios via LLMs
Yi Jiang, Qingyang Shen, Shuzhong Lai, Shunyu Qi, Qian Zheng, Lin Yao, Yueming Wang, Gang Pan
Scaling Laws Across Model Architectures: A Comparative Analysis of Dense and MoE Models in Large Language Models
Siqi Wang, Zhengyu Chen, Bei Li, Keqing He, Min Zhang, Jingang Wang
DecorateLM: Data Engineering through Corpus Rating, Tagging, and Editing with Language Models
Ranchi Zhao, Zhen Leng Thai, Yifan Zhang, Shengding Hu, Yunqi Ba, Jie Zhou, Jie Cai, Zhiyuan Liu, Maosong Sun
Vector-ICL: In-context Learning with Continuous Vector Representations
Yufan Zhuang, Chandan Singh, Liyuan Liu, Jingbo Shang, Jianfeng Gao
Stereotype or Personalization? User Identity Biases Chatbot Recommendations
Anjali Kantharuban, Jeremiah Milbauer, Emma Strubell, Graham Neubig
Chain-of-Thoughts for Molecular Understanding
Yunhui Jang, Jaehyung Kim, Sungsoo Ahn
Everything Everywhere All at Once: LLMs can In-Context Learn Multiple Tasks in Superposition
Zheyang Xiong, Ziyang Cai, John Cooper, Albert Ge, Vasilis Papageorgiou, Zack Sifakis, Angeliki Giannou, Ziqian Lin, Liu Yang, Saurabh Agarwal, Grigorios G Chrysos, Samet Oymak, Kangwook Lee, Dimitris Papailiopoulos
ParallelSpec: Parallel Drafter for Efficient Speculative Decoding
Zilin Xiao, Hongming Zhang, Tao Ge, Siru Ouyang, Vicente Ordonez, Dong Yu
Adaptation Odyssey in LLMs: Why Does Additional Pretraining Sometimes Fail to Improve?
Fırat Öncel, Matthias Bethge, Beyza Ermis, Mirco Ravanelli, Cem Subakan, Çağatay Yıldız
Chain and Causal Attention for Efficient Entity Tracking
Erwan Fagnou, Paul Caillon, Blaise Delattre, Alexandre Allauzen
Fill In The Gaps: Model Calibration and Generalization with Synthetic Data
Yang Ba, Michelle V. Mancenido, Rong Pan
What makes your model a low-empathy or warmth person: Exploring the Origins of Personality in LLMs
Shu Yang, Shenzhe Zhu, Ruoxuan Bao, Liang Liu, Yu Cheng, Lijie Hu, Mengdi Li, Di Wang
Superficial Safety Alignment Hypothesis
Jianwei Li, Jung-Eun Kim
From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency
Kaiyue Wen, Huaqing Zhang, Hongzhou Lin, Jingzhao Zhang
Aligning LLMs to Be Robust Against Prompt Injection
Sizhe Chen, Arman Zharmagambetov, Saeed Mahloujifar, Kamalika Chaudhuri, Chuan Guo
Better than Your Teacher: LLM Agents that learn from Privileged AI Feedback
Sanjiban Choudhury, Paloma Sodhi
Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models
Fei Wang, Ninareh Mehrabi, Palash Goyal, Rahul Gupta, Kai-Wei Chang, Aram Galstyan
PrefixQuant: Static Quantization Beats Dynamic through Prefixed Outliers in LLMs
Mengzhao Chen, Yi Liu, Jiahao Wang, Yi Bin, Wenqi Shao, Ping Luo