Large Language Model
Large language models (LLMs) are AI systems that process and generate human-like text across a wide range of natural language processing tasks. Current research focuses on improving LLM safety, efficiency (through techniques such as quantization and optimized decoding), and fairness, as well as strengthening their ability to perform complex reasoning and follow diverse instructions. These advances matter because they address critical limitations of current LLMs and open the way to broader applications in fields such as healthcare, legal tech, and autonomous systems.
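To make the efficiency angle concrete: quantization replaces a model's float weights with low-bit integers plus a scale factor, shrinking memory and speeding up inference at a small accuracy cost. Below is a minimal NumPy sketch of symmetric per-tensor int8 weight quantization; it is an illustrative toy, not the method of any paper listed here, and the function names quantize_int8/dequantize_int8 are hypothetical.

import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization: map float weights
    to 8-bit integers plus a single float scale factor."""
    scale = np.abs(weights).max() / 127.0  # largest magnitude maps to +/-127
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

# Toy example: a small random matrix stands in for an LLM weight layer.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize_int8(q, s)
print("max reconstruction error:", np.abs(w - w_hat).max())

Real systems typically refine this basic scheme, for example with per-channel scales or calibration data, but the core idea of trading precision for memory and throughput is the same.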
Papers
Improving the Language Understanding Capabilities of Large Language Models Using Reinforcement Learning
Bokai Hu, Sai Ashish Somayajula, Xin Pan, Zihan Huang, Pengtao Xie
Effective Self-Mining of In-Context Examples for Unsupervised Machine Translation with LLMs
Abdellah El Mekki, Muhammad Abdul-Mageed
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
Guangxuan Xiao, Jiaming Tang, Jingwei Zuo, Junxian Guo, Shang Yang, Haotian Tang, Yao Fu, Song Han
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free
Ziyue Li, Tianyi Zhou
Focused ReAct: Improving ReAct through Reiterate and Early Stop
Shuoqiu Li, Han Xu, Haipeng Chen
AFlow: Automating Agentic Workflow Generation
Jiayi Zhang, Jinyu Xiang, Zhaoyang Yu, Fengwei Teng, Xionghui Chen, Jiaqi Chen, Mingchen Zhuge, Xin Cheng, Sirui Hong, Jinlin Wang, Bingnan Zheng, Bang Liu, Yuyu Luo, Chenglin Wu
SplitLLM: Collaborative Inference of LLMs for Model Placement and Throughput Optimization
Akrit Mudvari, Yuang Jiang, Leandros Tassiulas
Balancing Continuous Pre-Training and Instruction Fine-Tuning: Optimizing Instruction-Following in LLMs
Ishan Jindal, Chandana Badrinath, Pranjal Bharti, Lakkidi Vinay, Sachin Dev Sharma
Towards LLM-guided Efficient and Interpretable Multi-linear Tensor Network Rank Selection
Giorgos Iacovides, Wuyang Zhou, Danilo Mandic
Large Language Models Are Active Critics in NLG Evaluation
Shuying Xu, Junjie Hu, Ming Jiang
Beyond Right and Wrong: Mitigating Cold Start in Knowledge Tracing Using Large Language Model and Option Weight
JongWoo Kim, SeongYeub Chu, Bryan Wong, Mun Yi
Large Language Model Evaluation via Matrix Nuclear-Norm
Yahan Li, Tingyu Xia, Yi Chang, Yuan Wu
Double Jeopardy and Climate Impact in the Use of Large Language Models: Socio-economic Disparities and Reduced Utility for Non-English Speakers
Aivin V. Solatorio, Gabriel Stefanini Vicente, Holly Krambeck, Olivier Dupriez
Federated Data-Efficient Instruction Tuning for Large Language Models
Zhen Qin, Zhaomin Wu, Bingsheng He, Shuiguang Deng
Cultural Fidelity in Large-Language Models: An Evaluation of Online Language Resources as a Driver of Model Performance in Value Representation
Sharif Kazemi, Gloria Gerhardt, Jonty Katz, Caroline Ida Kuria, Estelle Pan, Umang Prabhakar
Model-Based Differentially Private Knowledge Transfer for Large Language Models
Zhaomin Wu, Jizhou Guo, Junyi Hou, Bingsheng He, Lixin Fan, Qiang Yang
TMGBench: A Systematic Game Benchmark for Evaluating Strategic Reasoning Abilities of LLMs
Haochuan Wang, Xiachong Feng, Lei Li, Zhanyue Qin, Dianbo Sui, Lingpeng Kong
Skill Learning Using Process Mining for Large Language Model Plan Generation
Andrei Cosmin Redis, Mohammadreza Fani Sani, Bahram Zarrin, Andrea Burattin
On Calibration of LLM-based Guard Models for Reliable Content Moderation
Hongfu Liu, Hengguan Huang, Hao Wang, Xiangming Gu, Ye Wang
A Unified Approach to Routing and Cascading for LLMs
Jasper Dekoninck, Maximilian Baader, Martin Vechev