Language Model
Language models are computational systems designed to understand and generate human language, supporting tasks such as translation, question answering, and text summarization. Current research focuses on improving efficiency (e.g., novel learning rate schedules and optimized architectures), strengthening alignment with human preferences (via preference optimization and reward modeling), and addressing biases and limitations (including toxicity mitigation and robustness). These advances shape natural language processing research and enable more capable and reliable AI applications.
Papers
Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows
Fangyu Lei, Jixuan Chen, Yuxiao Ye, Ruisheng Cao, Dongchan Shin, Hongjin Su, Zhaoqing Suo, Hongcheng Gao, Wenjing Hu, Pengcheng Yin, Victor Zhong, Caiming Xiong, Ruoxi Sun, Qian Liu, Sida Wang, Tao Yu
EUR/USD Exchange Rate Forecasting incorporating Text Mining Based on Pre-trained Language Models and Deep Learning Methods
Xiangyu Shi, Hongcheng Ding, Salaar Faroog, Deshinta Arrova Dewi, Shamsul Nahar Abdullah, Bahiah A Malek
Model Stealing for Any Low-Rank Language Model
Allen Liu, Ankur Moitra
SecEncoder: Logs are All You Need in Security
Muhammed Fatih Bulut, Yingqi Liu, Naveed Ahmad, Maximilian Turner, Sami Ait Ouahmane, Cameron Andrews, Lloyd Greenwald
Controlled Evaluation of Syntactic Knowledge in Multilingual Language Models
Daria Kryvosheieva, Roger Levy
Controllable Context Sensitivity and the Knob Behind It
Julian Minder, Kevin Du, Niklas Stoehr, Giovanni Monea, Chris Wendler, Robert West, Ryan Cotterell
Warmstarting for Scaling Language Models
Neeratyoy Mallik, Maciej Janowski, Johannes Hog, Herilalaina Rakotoarison, Aaron Klein, Josif Grabocka, Frank Hutter
SetLexSem Challenge: Using Set Operations to Evaluate the Lexical and Semantic Robustness of Language Models
Bardiya Akhbari, Manish Gawali, Nicholas A. Dronen
Richer Output for Richer Countries: Uncovering Geographical Disparities in Generated Stories and Travel Recommendations
Kirti Bhagat, Kinshuk Vasisht, Danish Pruthi
The Surprising Effectiveness of Test-Time Training for Abstract Reasoning
Ekin Akyürek, Mehul Damani, Linlu Qiu, Han Guo, Yoon Kim, Jacob Andreas
Contextualized Evaluations: Taking the Guesswork Out of Language Model Evaluations
Chaitanya Malaviya, Joseph Chee Chang, Dan Roth, Mohit Iyyer, Mark Yatskar, Kyle Lo
TempCharBERT: Keystroke Dynamics for Continuous Access Control Based on Pre-trained Language Models
Matheus Simão, Fabiano Prado, Omar Abdul Wahab, Anderson Avila
The Super Weight in Large Language Models
Mengxia Yu, De Wang, Qi Shan, Colorado Reed, Alvin Wan
NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics
David Robinson, Marius Miron, Masato Hagiwara, Olivier Pietquin
Counterfactual Generation from Language Models
Shauli Ravfogel, Anej Svete, Vésteinn Snæbjarnarson, Ryan Cotterell
Building a Taiwanese Mandarin Spoken Language Model: A First Attempt
Chih-Kai Yang, Yu-Kuan Fu, Chen-An Li, Yi-Cheng Lin, Yu-Xiang Lin, Wei-Chih Chen, Ho Lam Chung, Chun-Yi Kuan, Wei-Ping Huang, Ke-Han Lu, Tzu-Quan Lin, Hsiu-Hsuan Wang, En-Pei Hu, Chan-Jan Hsu, Liang-Hsuan Tseng, I-Hsiang Chiu, Ulin Sanga, Xuanjun Chen, Po-chun Hsu, Shu-wen Yang, Hung-yi Lee
Transformer verbatim in-context retrieval across time and scale
Kristijan Armeni, Marko Pranjić, Senja Pollak
On Active Privacy Auditing in Supervised Fine-tuning for White-Box Language Models
Qian Sun, Hanpeng Wu, Xi Sheryl Zhang
Mamba-based Decoder-Only Approach with Bidirectional Speech Modeling for Speech Recognition
Yoshiki Masuyama, Koichi Miyazaki, Masato Murata
HarmLevelBench: Evaluating Harm-Level Compliance and the Impact of Quantization on Model Alignment
Yannis Belkhiter, Giulio Zizzo, Sergio Maffeis