Language Model
Language models are computational systems that process and generate human language, supporting tasks such as translation, question answering, and text summarization. Current research focuses on improving efficiency (e.g., through novel learning rate schedules and optimized architectures), improving alignment with human preferences (via preference optimization and reward modeling), and addressing biases and limitations (including techniques for mitigating toxicity and improving robustness). These advances shape natural language processing research and enable more capable and reliable AI applications.
Papers
Wave Network: An Ultra-Small Language Model
Xin Zhang, Victor S. Sheng
Zebra-Llama: A Context-Aware Large Language Model for Democratizing Rare Disease Knowledge
Karthik Soman, Andrew Langdon, Catalina Villouta, Chinmay Agrawal, Lashaw Salta, Braian Peetoom, Gianmarco Bellucci, Orion J Buske
TeleOracle: Fine-Tuned Retrieval-Augmented Generation with Long-Context Support for Network
Nouf Alabbasi, Omar Erak, Omar Alhussein, Ismail Lotfi, Sami Muhaidat, Merouane Debbah
Context-Informed Machine Translation of Manga using Multimodal Large Language Models
Philip Lippmann, Konrad Skublicki, Joshua Tanner, Shonosuke Ishiwatari, Jie Yang
What Goes Into a LM Acceptability Judgment? Rethinking the Impact of Frequency and Length
Lindia Tjuatja, Graham Neubig, Tal Linzen, Sophie Hao
Improving Scientific Hypothesis Generation with Knowledge Grounded Large Language Models
Guangzhi Xiong, Eric Xie, Amir Hassan Shariatmadari, Sikun Guo, Stefan Bekiranov, Aidong Zhang
Improving Steering Vectors by Targeting Sparse Autoencoder Features
Sviatoslav Chalnev, Matthew Siu, Arthur Conmy
Regress, Don't Guess -- A Regression-like Loss on Number Tokens for Language Models
Jonas Zausinger, Lars Pennig, Kacper Chlodny, Vincent Limbach, Anna Ketteler, Thorben Prein, Vishwa Mohan Singh, Michael Morris Danziger, Jannis Born
Can Language Models Learn to Skip Steps?
Tengxiao Liu, Qipeng Guo, Xiangkun Hu, Cheng Jiayang, Yue Zhang, Xipeng Qiu, Zheng Zhang
Align-SLM: Textless Spoken Language Models with Reinforcement Learning from AI Feedback
Guan-Ting Lin, Prashanth Gurunath Shivakumar, Aditya Gourav, Yile Gu, Ankur Gandhe, Hung-yi Lee, Ivan Bulyko
Can Language Models Enable In-Context Database?
Yu Pan, Hongfeng Yu, Tianjiao Zhao, Jianxin Sun
Thinking Forward and Backward: Effective Backward Planning with Large Language Models
Allen Z. Ren, Brian Ichter, Anirudha Majumdar
Varco Arena: A Tournament Approach to Reference-Free Benchmarking Large Language Models
Seonil Son, Ju-Min Oh, Heegon Jin, Cheolhun Jang, Jeongbeom Jeong, Kuntae Kim
One Arrow, Many Targets: Probing LLMs for Multi-Attribute Controllable Text Summarization
Tathagato Roy, Rahul Mishra
Swan and ArabicMTEB: Dialect-Aware, Arabic-Centric, Cross-Lingual, and Cross-Cultural Embedding Models and Benchmarks
Gagan Bhatia, El Moatez Billah Nagoudi, Abdellah El Mekki, Fakhraddin Alwajih, Muhammad Abdul-Mageed
Do LLMs Know to Respect Copyright Notice?
Jialiang Xu, Shenglan Li, Zhaozhuo Xu, Denghui Zhang