Language Model

Language models are computational systems designed to understand and generate human language, supporting tasks such as translation, question answering, and text summarization. Current research focuses on improving efficiency (e.g., through novel learning rate schedules and optimized architectures), aligning models with human preferences (via preference optimization and reward modeling), and addressing biases and limitations (including techniques for mitigating toxicity and improving robustness). These advances shape natural language processing research and enable more capable and reliable AI applications.

Papers

December 31, 2024

TinyHelen's First Curriculum: Training and Evaluating Tiny Language Models in a Simpler Language Environment
Ke Yang, Volodymyr Kindratenko, ChengXiang Zhai
Language Model Training Data Pre Training Programming Language Text Based Environment Irreducible Curriculum
Generalizing Trust: Weak-to-Strong Trustworthiness in Language Models
Martin Pawelczyk, Lillian Sun, Zhenting Qi, Aounon Kumar, Himabindu Lakkaraju
Language Model User Trust Weak to Strong
Chunk-Distilled Language Modeling
Yanhong Li, Karen Livescu, Jiawei Zhou
Language Model
Rethinking Layer Removal: Preserving Critical Components with Task-Aware Singular Value Decomposition
Kainan Liu, Yong Zhang, Ning Cheng, Zhitao Li, Shaojun Wang, Jing Xiao
Language Model Singular Value Decomposition Task Aware Layer Pruning Singular Value Key Component
A review of faithfulness metrics for hallucination assessment in Large Language Models
Ben Malin, Tatiana Kalganova, Nikolaos Boulgouris
Language Model Narrative Review Question Answering Multiple Choice Faithfulness Test Faithfulness Metric Open Ended Generation Augmented Generation
CancerKG.ORG: A Web-scale, Interactive, Verifiable Knowledge Graph-LLM Hybrid for Assisting with Optimal Cancer Treatment and Care
Michael Gubanov, Anna Pyayt, Aleksandra Karolak
Language Model Knowledge Graph Retrieval Augmented Generation Information Retrieval Efficient Hybrid Medical Text Optimal Treatment Web Scale

December 30, 2024

Detection-Fusion for Knowledge Graph Extraction from Videos
Taniya Das, Louis Mahon, Thomas Lukasiewicz
Language Model Knowledge Graph Gameplay Video Video Understanding Semantic Content Natural Language Annotation Fusion Detection
Training Software Engineering Agents and Verifiers with SWE-Gym
Jiayi Pan, Xingyao Wang, Graham Neubig, Navdeep Jaitly, Heng Ji, Alane Suhr, Yizhe Zhang
Language Model Agent Smith Open Source Agent Trajectory State of the Art Verifier Agent Tuning MR iNet Gym
Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism
Tim Tsz-Kit Lau, Weijian Li, Chenwei Xu, Han Liu, Mladen Kolar
Language Model Full Model Human Language Model Training Model Parallelism Adaptive Batch Size Optimal Batch
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization
Chia-Yu Hung, Navonil Majumder, Zhifeng Kong, Ambuj Mehrish, Rafael Valle, Bryan Catanzaro, Soujanya Poria
Language Model Preference Optimization Audio Generation Matching Accuracy Faithful Explanation
Attributing Culture-Conditioned Generations to Pretraining Corpora
Huihan Li, Arnav Goel, Keyu He, Xiang Ren
Language Model Large Corpus Limited Memorization Cultural Bias
UniRS: Unifying Multi-temporal Remote Sensing Tasks through Vision Language Models
Yujie Li, Wenjia Xu, Guangzuo Li, Zijian Yu, Zhiwei Wei, Jiuniu Wang, Mugen Peng
Language Model Vision Language Model Vision Paper Bi Temporal Image Multi Temporal Remote Sensing Image Unified Visual
Depression and Anxiety Prediction Using Deep Language Models and Transfer Learning
Tomasz Rutowski, Elizabeth Shriberg, Amir Harati, Yang Lu, Piotr Chlebek, Ricardo Oliveira
Language Model Transfer Learning Speech Analysis Mental Health Depression Symptom Linguistic Cue Anxiety Disorder
HUNYUANPROVER: A Scalable Data Synthesis Framework and Guided Tree Search for Automated Theorem Proving
Yang Li, Dong Du, Linfeng Song, Chen Li, Weikang Wang, Tao Yang, Haitao Mi
Language Model Search Query Theorem Proving Data Synthesis Tree Search Interactive Theorem

December 29, 2024

ICLR: In-Context Learning of Representations
Core Francisco Park, Andrew Lee, Ekdeep Singh Lubana, Yongyi Yang, Maya Okawa, Kento Nishi, Martin Wattenberg, Hidenori Tanaka
Language Model Context Learning Meaningful Representation Context Example Semantic Role Context Graph
Adversarial Negotiation Dynamics in Generative Language Models
Arinbjörn Kolbeinsson, Benedikt Kolbeinsson
Language Model Generative Language Model Adversarial Behavior Adversarial Scenario Adversarial Testing Efficiency Robustness
On Adversarial Robustness of Language Models in Transfer Learning
Bohdan Turbal, Anastasiia Mazur, Jiaxu Zhao, Mykola Pechenizkiy
Language Model Adversarial Attack Native Robustness Transfer Learning Adversarial Robustness
Understanding the Impact of Confidence in Retrieval Augmented Generation: A Case Study in the Medical Domain
Shintaro Ozaki, Yuta Kato, Siyuan Feng, Masayo Tomita, Kazuki Hayashi, Ryoma Obara, Masafumi Oyamada, Katsuhiko Hayashi, Hidetaka Kamigaito, Taro Watanabe
Language Model Case Study Retrieval Augmented Generation Faithful Generation Retrieval Augmented Medical Domain High Confidence Confidence Level Calibration Statistic

December 28, 2024

Pushing the Envelope of Low-Bit LLM via Dynamic Error Compensation
Yeonhong Park, Jake Hyun, Hojoon Kim, Jae W. Lee
Language Model Inference Latency Quantization Error Speech Envelope State of the Art Quantization
Improving SSVEP BCI Spellers With Data Augmentation and Language Models
Joseph Zhang, Ruiming Zhang, Kipngeno Koech, David Hill, Kateryna Shapovalenko
Language Model Data Augmentation EEG Data State Visual Evoked Potential
