Language Model
Language models are computational systems that understand and generate human language, supporting tasks such as translation, question answering, and text summarization. Current research focuses on improving efficiency (e.g., through novel learning rate schedules and optimized architectures), strengthening alignment with human preferences (via preference optimization and reward modeling), and addressing biases and limitations (including techniques for mitigating toxicity and enhancing robustness). These advances have significant implications across fields, shaping natural language processing research and enabling more capable and reliable AI applications.
Papers
TinyHelen's First Curriculum: Training and Evaluating Tiny Language Models in a Simpler Language Environment
Ke Yang, Volodymyr Kindratenko, ChengXiang Zhai
Generalizing Trust: Weak-to-Strong Trustworthiness in Language Models
Martin Pawelczyk, Lillian Sun, Zhenting Qi, Aounon Kumar, Himabindu Lakkaraju
Chunk-Distilled Language Modeling
Yanhong Li, Karen Livescu, Jiawei Zhou
Rethinking Layer Removal: Preserving Critical Components with Task-Aware Singular Value Decomposition
Kainan Liu, Yong Zhang, Ning Cheng, Zhitao Li, Shaojun Wang, Jing Xiao
A review of faithfulness metrics for hallucination assessment in Large Language Models
Ben Malin, Tatiana Kalganova, Nikolaos Boulgouris
CancerKG.ORG: A Web-scale, Interactive, Verifiable Knowledge Graph-LLM Hybrid for Assisting with Optimal Cancer Treatment and Care
Michael Gubanov, Anna Pyayt, Aleksandra Karolak
Detection-Fusion for Knowledge Graph Extraction from Videos
Taniya Das, Louis Mahon, Thomas Lukasiewicz
Training Software Engineering Agents and Verifiers with SWE-Gym
Jiayi Pan, Xingyao Wang, Graham Neubig, Navdeep Jaitly, Heng Ji, Alane Suhr, Yizhe Zhang
Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism
Tim Tsz-Kit Lau, Weijian Li, Chenwei Xu, Han Liu, Mladen Kolar
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization
Chia-Yu Hung, Navonil Majumder, Zhifeng Kong, Ambuj Mehrish, Rafael Valle, Bryan Catanzaro, Soujanya Poria
Attributing Culture-Conditioned Generations to Pretraining Corpora
Huihan Li, Arnav Goel, Keyu He, Xiang Ren
UniRS: Unifying Multi-temporal Remote Sensing Tasks through Vision Language Models
Yujie Li, Wenjia Xu, Guangzuo Li, Zijian Yu, Zhiwei Wei, Jiuniu Wang, Mugen Peng
Depression and Anxiety Prediction Using Deep Language Models and Transfer Learning
Tomasz Rutowski, Elizabeth Shriberg, Amir Harati, Yang Lu, Piotr Chlebek, Ricardo Oliveira
HUNYUANPROVER: A Scalable Data Synthesis Framework and Guided Tree Search for Automated Theorem Proving
Yang Li, Dong Du, Linfeng Song, Chen Li, Weikang Wang, Tao Yang, Haitao Mi
ICLR: In-Context Learning of Representations
Core Francisco Park, Andrew Lee, Ekdeep Singh Lubana, Yongyi Yang, Maya Okawa, Kento Nishi, Martin Wattenberg, Hidenori Tanaka
Adversarial Negotiation Dynamics in Generative Language Models
Arinbjörn Kolbeinsson, Benedikt Kolbeinsson
On Adversarial Robustness of Language Models in Transfer Learning
Bohdan Turbal, Anastasiia Mazur, Jiaxu Zhao, Mykola Pechenizkiy
Understanding the Impact of Confidence in Retrieval Augmented Generation: A Case Study in the Medical Domain
Shintaro Ozaki, Yuta Kato, Siyuan Feng, Masayo Tomita, Kazuki Hayashi, Ryoma Obara, Masafumi Oyamada, Katsuhiko Hayashi, Hidetaka Kamigaito, Taro Watanabe