Large Language Model
Large language models (LLMs) are AI systems trained to process and generate human-like text across a wide range of natural language processing tasks. Current research focuses on making LLMs safer, more efficient (through techniques such as quantization and optimized decoding), and fairer, and on improving their ability to perform complex reasoning and follow diverse instructions. These advances address critical limitations of current LLMs and pave the way for broader adoption across diverse fields, including healthcare, legal tech, and autonomous systems.
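As a concrete illustration of the quantization techniques mentioned above, the sketch below shows symmetric per-tensor int8 weight quantization, one of the simplest forms of model compression. This is a minimal NumPy toy under our own assumptions, not the method of any paper listed here; the function names `quantize_int8` and `dequantize_int8` are illustrative.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization: map float weights onto
    integers in [-127, 127] using a single scale factor."""
    scale = max(np.abs(weights).max() / 127.0, 1e-12)  # guard against all-zero weights
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover a float approximation of the original weights."""
    return q.astype(np.float32) * scale

# Toy usage: quantize a random weight matrix and check the rounding error.
w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
print("max abs error:", np.abs(w - dequantize_int8(q, scale)).max())
```

Storing int8 values plus one scale cuts weight memory roughly 4x relative to float32, at the cost of the small rounding error printed above; production schemes refine this with per-channel scales and calibration data.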
Papers
Exploring Hierarchical Molecular Graph Representation in Multimodal LLMs
Chengxin Hu, Hao Li
Self-Calibrated Listwise Reranking with Large Language Models
Ruiyang Ren, Yuhao Wang, Kun Zhou, Wayne Xin Zhao, Wenjie Wang, Jing Liu, Ji-Rong Wen, Tat-Seng Chua
Measure-to-measure interpolation using Transformers
Borjan Geshkovski, Philippe Rigollet, Domènec Ruiz-Balet
Best Practices for Distilling Large Language Models into BERT for Web Search Ranking
Dezhi Ye, Junwei Hu, Jiabin Fan, Bowen Tian, Jie Liu, Haijin Liang, Jin Ma
Meta-Reasoning Improves Tool Use in Large Language Models
Lisa Alazraki, Marek Rei
One fish, two fish, but not the whole sea: Alignment reduces language models' conceptual diversity
Sonia K. Murthy, Tomer Ullman, Jennifer Hu
DELIFT: Data Efficient Language model Instruction Fine Tuning
Ishika Agarwal, Krishna Killamsetty, Lucian Popa, Marina Danilevsky
Bayesian Calibration of Win Rate Estimation with LLM Evaluators
Yicheng Gao, Gonghan Xu, Zhe Wang, Arman Cohan
Variational Low-Rank Adaptation Using IVON
Bai Cong, Nico Daheim, Yuesong Shen, Daniel Cremers, Rio Yokota, Mohammad Emtiyaz Khan, Thomas Möllenhoff
Leveraging LLMs to Enable Natural Language Search on Go-to-market Platforms
Jesse Yao, Saurav Acharya, Priyaranjan Parida, Srinivas Attipalli, Ali Dasdan
Unlearning in- vs. out-of-distribution data in LLMs under gradient-based method
Teodora Baluta, Pascal Lamblin, Daniel Tarlow, Fabian Pedregosa, Gintare Karolina Dziugaite
Benchmarking Large Language Models with Integer Sequence Generation Tasks
Daniel O'Malley, Manish Bhattarai, Javier Santos
Measuring short-form factuality in large language models
Jason Wei, Nguyen Karina, Hyung Won Chung, Yunxin Joy Jiao, Spencer Papay, Amelia Glaese, John Schulman, William Fedus
Robust and Efficient Fine-tuning of LLMs with Bayesian Reparameterization of Low-Rank Adaptation
Vaibhav Seth, Arinjay Pathak, Ayan Sengupta, Natraj Raman, Sriram Gopalakrishnan, Tanmoy Chakraborty
CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models
Jierui Li, Hung Le, Yingbo Zhou, Caiming Xiong, Silvio Savarese, Doyen Sahoo
A Multilingual Sentiment Lexicon for Low-Resource Language Translation using Large Language Models and Explainable AI
Melusi Malinga, Isaac Lupanda, Mike Wa Nkongolo, Phil van Deventer
Unfair Alignment: Examining Safety Alignment Across Vision Encoder Layers in Vision-Language Models
Saketh Bachu, Erfan Shayegani, Trishna Chakraborty, Rohit Lal, Arindam Dutta, Chengyu Song, Yue Dong, Nael Abu-Ghazaleh, Amit K. Roy-Chowdhury
Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding
Haolin Chen, Yihao Feng, Zuxin Liu, Weiran Yao, Akshara Prabhakar, Shelby Heinecke, Ricky Ho, Phil Mui, Silvio Savarese, Caiming Xiong, Huan Wang
LSHBloom: Memory-efficient, Extreme-scale Document Deduplication
Arham Khan, Robert Underwood, Carlo Siebenschuh, Yadu Babuji, Aswathy Ajith, Kyle Hippe, Ozan Gokdemir, Alexander Brace, Kyle Chard, Ian Foster
Multimodal Structure-Aware Quantum Data Processing
Hala Hawashin, Mehrnoosh Sadrzadeh