Language Model
Language models are computational systems designed to understand and generate human language, with the aim of improving tasks such as translation, question answering, and text summarization. Current research focuses on enhancing efficiency (e.g., through novel learning rate schedules and optimized architectures), improving alignment with human preferences (via preference optimization and reward modeling), and addressing biases and limitations (including techniques for mitigating toxicity and enhancing robustness). These advances shape natural language processing research and enable the development of more capable and reliable AI applications.
Papers
Information Theory of Meaningful Communication
Doron Sivan, Misha Tsodyks
Neurosymbolic Graph Enrichment for Grounded World Models
Stefano De Giorgis, Aldo Gangemi, Alessandro Russo
RedPajama: an Open Dataset for Training Large Language Models
Maurice Weber, Daniel Fu, Quentin Anthony, Yonatan Oren, Shane Adams, Anton Alexandrov, Xiaozhong Lyu, Huu Nguyen, Xiaozhe Yao, Virginia Adams, Ben Athiwaratkun, Rahul Chalamala, Kezhen Chen, Max Ryabinin, Tri Dao, Percy Liang, Christopher Ré, Irina Rish, Ce Zhang
HouseLLM: LLM-Assisted Two-Phase Text-to-Floorplan Generation
Ziyang Zong, Zhaohuan Zhan, Guang Tan
CoMeDi Shared Task: Models as Annotators in Lexical Semantics Disagreements
Zhu Liu, Zhen Hu, Ying Liu
Vision Language Models Are Few-Shot Audio Spectrogram Classifiers
Satvik Dixit, Laurie M. Heller, Chris Donahue
FedCoLLM: A Parameter-Efficient Federated Co-tuning Framework for Large and Small Language Models
Tao Fan, Yan Kang, Guoqiang Ma, Lixin Fan, Kai Chen, Qiang Yang
Reviving Dormant Memories: Investigating Catastrophic Forgetting in Language Models through Rationale-Guidance Difficulty
Huashan Sun, Yang Gao
Addressing Hallucinations in Language Models with Knowledge Graph Embeddings as an Additional Modality
Viktoriia Chekalina, Anton Razzigaev, Elizaveta Goncharova, Andrey Kuznetsov
Preempting Text Sanitization Utility in Resource-Constrained Privacy-Preserving LLM Interactions
Robin Carpentier, Benjamin Zi Hao Zhao, Hassan Jameel Asghar, Dali Kaafar
PALMS: Parallel Adaptive Lasso with Multi-directional Signals for Latent Networks Reconstruction
Zhaoyu Xing, Wei Zhong
Rethinking Thinking Tokens: Understanding Why They Underperform in Practice
Sreeram Vennam, David Valente, David Herel, Ponnurangam Kumaraguru
SayComply: Grounding Field Robotic Tasks in Operational Compliance through Retrieval-Based Language Models
Muhammad Fadhil Ginting, Dong-Ki Kim, Sung-Kyun Kim, Bandi Jai Krishna, Mykel J. Kochenderfer, Shayegan Omidshafiei, Ali-akbar Agha-mohammadi
Steering Language Model Refusal with Sparse Autoencoders
Kyle O'Brien, David Majercak, Xavier Fernandes, Richard Edgar, Jingya Chen, Harsha Nori, Dean Carignan, Eric Horvitz, Forough Poursabzi-Sangdeh
LP Data Pipeline: Lightweight, Purpose-driven Data Pipeline for Large Language Models
Yungi Kim, Hyunsoo Ha, Seonghoon Yang, Sukyung Lee, Jihoo Kim, Chanjun Park
MEMO-Bench: A Multiple Benchmark for Text-to-Image and Multimodal Large Language Models on Human Emotion Analysis
Yingjie Zhou, Zicheng Zhang, Jiezhang Cao, Jun Jia, Yanwei Jiang, Farong Wen, Xiaohong Liu, Xiongkuo Min, Guangtao Zhai
Debiasing Watermarks for Large Language Models via Maximal Coupling
Yangxinyu Xie, Xiang Li, Tanwi Mallick, Weijie J. Su, Ruixun Zhang
LLäMmlein: Compact and Competitive German-Only Language Models from Scratch
Jan Pfister, Julia Wunderle, Andreas Hotho
SRA-MCTS: Self-driven Reasoning Augmentation with Monte Carlo Tree Search for Enhanced Code Generation
Bin Xu, Yiguan Lin, Yinghao Li, Yang Gao
SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization
Hongrui Jia, Chaoya Jiang, Haiyang Xu, Wei Ye, Mengfan Dong, Ming Yan, Ji Zhang, Fei Huang, Shikun Zhang