Large Language Model
Large language models (LLMs) are AI systems trained to process and generate human-like text across a wide range of natural language processing tasks. Current research focuses on improving LLM safety, efficiency (through techniques such as quantization and optimized decoding), and fairness, as well as strengthening their ability to perform complex reasoning and follow varied instructions. These advances matter because they address critical limitations of today's LLMs and pave the way for broader applications across fields such as healthcare, legal technology, and autonomous systems.
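To make the quantization idea above concrete, here is a minimal sketch of symmetric per-tensor int8 post-training weight quantization. This is an illustrative assumption of one common scheme, not the method of any paper listed below; the function names are ours.

    # Minimal sketch of symmetric per-tensor int8 weight quantization
    # (illustrative only; not the method of any paper listed below).
    import numpy as np

    def quantize_int8(w: np.ndarray):
        """Map a float tensor to int8 plus a single float scale."""
        scale = max(np.abs(w).max() / 127.0, 1e-12)  # guard against all-zero w
        q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
        """Recover an approximate float tensor from the int8 weights."""
        return q.astype(np.float32) * scale

    w = np.random.randn(256, 256).astype(np.float32)
    q, s = quantize_int8(w)
    print("max abs error:", np.abs(w - dequantize(q, s)).max())

Storing q (one byte per weight) plus a single scale cuts weight memory roughly 4x versus float32, at the cost of a small reconstruction error; more elaborate schemes (per-channel scales, lower bit widths) trade accuracy against further compression.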
Papers
Scaling laws for post-training quantized large language models
Zifei Xu, Alexander Lan, Wanzin Yazar, Tristan Webb, Sayeh Sharify, Xin Wang
Planning Anything with Rigor: General-Purpose Zero-Shot Planning with LLM-based Formalized Programming
Yilun Hao, Yang Zhang, Chuchu Fan
Bridging Large Language Models and Graph Structure Learning Models for Robust Representation Learning
Guangxin Su, Yifan Zhu, Wenjie Zhang, Hanchen Wang, Ying Zhang
Generative AI's aggregated knowledge versus web-based curated knowledge
Ted Selker, Yunzi Wu
Beyond the Comfort Zone: Emerging Solutions to Overcome Challenges in Integrating LLMs into Software Products
Nadia Nahar, Christian Kästner, Jenna Butler, Chris Parnin, Thomas Zimmermann, Christian Bird
MoE-Pruner: Pruning Mixture-of-Experts Large Language Model using the Hints from Its Router
Yanyue Xie, Zhi Zhang, Ding Zhou, Cong Xie, Ziang Song, Xin Liu, Yanzhi Wang, Xue Lin, An Xu
Impacts of Continued Legal Pre-Training and IFT on LLMs' Latent Representations of Human-Defined Legal Concepts
Shaun Ho
SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing
Zhiyuan Zhang, Dongdong Chen, Jing Liao
NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models
Han Han, Tong Zhu, Xiang Zhang, Mengsong Wu, Hao Xiong, Wenliang Chen
FoundTS: Comprehensive and Unified Benchmarking of Foundation Models for Time Series Forecasting
Zhe Li, Xiangfei Qiu, Peng Chen, Yihang Wang, Hanyin Cheng, Yang Shu, Jilin Hu, Chenjuan Guo, Aoying Zhou, Qingsong Wen, Christian S. Jensen, Bin Yang
Language Models Encode Numbers Using Digit Representations in Base 10
Amit Arnold Levy, Mor Geva
Personas with Attitudes: Controlling LLMs for Diverse Data Annotation
Leon Fröhling, Gianluca Demartini, Dennis Assenmacher
DySpec: Faster Speculative Decoding with Dynamic Token Tree Structure
Yunfan Xiong, Ruoyu Zhang, Yanzeng Li, Tianhao Wu, Lei Zou
Converging to a Lingua Franca: Evolution of Linguistic Regions and Semantics Alignment in Multilingual Large Language Models
Hongchuan Zeng, Senyu Han, Lu Chen, Kai Yu
IntGrad MT: Eliciting LLMs' Machine Translation Capabilities with Sentence Interpolation and Gradual MT
Seung-Woo Choi, Ga-Hyun Yoo, Jay-Yoon Lee
Retrieval Augmented Spelling Correction for E-Commerce Applications
Xuan Guo, Rohit Patki, Dante Everaert, Christopher Potts
Transformer Layer Injection: A Novel Approach for Efficient Upscaling of Large Language Models
James Vo
Measuring Spiritual Values and Bias of Large Language Models
Songyuan Liu, Ziyang Zhang, Runze Yan, Wei Wu, Carl Yang, Jiaying Lu
Causal Reasoning in Large Language Models: A Knowledge Graph Approach
Yejin Kim, Eojin Kang, Juae Kim, H. Howie Huang
Y-Mol: A Multiscale Biomedical Knowledge-Guided Large Language Model for Drug Development
Tengfei Ma, Xuan Lin, Tianle Li, Chaoyi Li, Long Chen, Peng Zhou, Xibao Cai, Xinyu Yang, Daojian Zeng, Dongsheng Cao, Xiangxiang Zeng