Large Language Model
Large language models (LLMs) are AI systems designed to process and generate human-like text across a wide range of natural language processing tasks. Current research focuses on enhancing LLM safety, efficiency (through techniques such as quantization and optimized decoding), and fairness, and on improving their ability to perform complex reasoning and follow diverse instructions. These advances address critical limitations of current LLMs and pave the way for broader applications in fields such as healthcare, legal technology, and autonomous systems.
Papers
EUREKHA: Enhancing User Representation for Key Hackers Identification in Underground Forums
Abdoul Nasser Hassane Amadou, Anas Motii, Saida Elouardi, El Houcine Bergou
Enhancing Robustness in Language-Driven Robotics: A Modular Approach to Failure Reduction
Émiland Garrabé, Pierre Teixeira, Mahdi Khoramshahi, Stéphane Doncieux
Learning the rules of peptide self-assembly through data mining with large language models
Zhenze Yang, Sarah K. Yorke, Tuomas P. J. Knowles, Markus J. Buehler
Gap-Filling Prompting Enhances Code-Assisted Mathematical Reasoning
Mohammad Ghiasvand Mohammadkhani
Reasoning Robustness of LLMs to Adversarial Typographical Errors
Esther Gan, Yiran Zhao, Liying Cheng, Yancan Mao, Anirudh Goyal, Kenji Kawaguchi, Min-Yen Kan, Michael Shieh
Exploring the Alignment Landscape: LLMs and Geometric Deep Models in Protein Representation
Dong Shu, Bingbing Duan, Kai Guo, Kaixiong Zhou, Jiliang Tang, Mengnan Du
Generative Adapter: Contextualizing Language Models in Parameters with A Single Forward Pass
Tong Chen, Hao Fang, Patrick Xia, Xiaodong Liu, Benjamin Van Durme, Luke Zettlemoyer, Jianfeng Gao, Hao Cheng
Abstract2Appendix: Academic Reviews Enhance LLM Long-Context Capabilities
Shengzhi Li, Kittipat Kampa, Rongyu Lin, Bohang Li, Shichao Pei
Alopex: A Computational Framework for Enabling On-Device Function Calls with LLMs
Yide Ran, Zhaozhuo Xu, Yuhang Yao, Zijian Hu, Shanshan Han, Han Jin, Alay Dilipbhai Shah, Jipeng Zhang, Dimitris Stripelis, Tong Zhang, Salman Avestimehr, Chaoyang He
Toward Cultural Interpretability: A Linguistic Anthropological Framework for Describing and Evaluating Large Language Models (LLMs)
Graham M. Jones, Shai Satran, Arvind Satyanarayan
Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks?
Jonathan Roberts, Kai Han, Samuel Albanie
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
Weixin Liang, Lili Yu, Liang Luo, Srinivasan Iyer, Ning Dong, Chunting Zhou, Gargi Ghosh, Mike Lewis, Wen-tau Yih, Luke Zettlemoyer, Xi Victoria Lin
Rethinking Bradley-Terry Models in Preference-Based Reward Modeling: Foundations, Theory, and Alternatives
Hao Sun, Yunyi Shen, Jean-Francois Ton
BitNet a4.8: 4-bit Activations for 1-bit LLMs
Hongyu Wang, Shuming Ma, Furu Wei
Position Paper On Diagnostic Uncertainty Estimation from Large Language Models: Next-Word Probability Is Not Pre-test Probability
Yanjun Gao, Skatje Myers, Shan Chen, Dmitriy Dligach, Timothy A Miller, Danielle Bitterman, Guanhua Chen, Anoop Mayampurath, Matthew Churpek, Majid Afshar
FineTuneBench: How well do commercial fine-tuning APIs infuse knowledge into LLMs?
Eric Wu, Kevin Wu, James Zou
GPTKB: Building Very Large Knowledge Bases from Language Models
Yujia Hu, Shrestha Ghosh, Tuan-Phong Nguyen, Simon Razniewski
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
Siming Huang, Tianhao Cheng, Jason Klein Liu, Jiaran Hao, Liuyihan Song, Yang Xu, J. Yang, J.H. Liu, Chenchen Zhang, Linzheng Chai, Ruifeng Yuan, Zhaoxiang Zhang, Jie Fu, Qian Liu, Ge Zhang, Zili Wang, Yuan Qi, Yinghui Xu, Wei Chu
VTechAGP: An Academic-to-General-Audience Text Paraphrase Dataset and Benchmark Models
Ming Cheng, Jiaying Gong, Chenhan Yuan, William A. Ingram, Edward Fox, Hoda Eldardiry
LuxBank: The First Universal Dependency Treebank for Luxembourgish
Alistair Plum, Caroline Döhmer, Emilia Milano, Anne-Marie Lutgen, Christoph Purschke