Large Language Model
Large language models (LLMs) are AI systems trained to process and generate human-like text, with the goal of advancing a broad range of natural language processing tasks. Current research focuses on enhancing LLM safety, efficiency (through techniques such as quantization and optimized decoding), and fairness, as well as improving the models' ability to perform complex reasoning and follow diverse instructions. These advances matter because they address critical limitations of today's LLMs and pave the way for broader applications across fields such as healthcare, legal tech, and autonomous systems.
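To make the efficiency angle concrete, the sketch below shows the basic idea behind one of the techniques mentioned above: symmetric per-tensor int8 weight quantization, which shrinks a float weight matrix to 8-bit integers plus a single scale factor. This is a minimal illustrative example in plain NumPy, not the method of any paper listed here; the function names and toy data are our own.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization: map floats to [-127, 127]."""
    scale = np.abs(weights).max() / 127.0  # one scale for the whole tensor
    if scale == 0.0:
        scale = 1.0  # guard for an all-zero tensor
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

# Toy weight matrix standing in for one LLM layer (hypothetical data).
w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
print("max abs error:", np.abs(w - dequantize(q, scale)).max())
```

Real deployments typically refine this with per-channel scales, calibration data, or quantization-aware training, but the round-then-rescale core stays the same.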
Papers
QueEn: A Large Language Model for Quechua-English Translation
Junhao Chen, Peng Shu, Yiwei Li, Huaqin Zhao, Hanqi Jiang, Yi Pan, Yifan Zhou, Zhengliang Liu, Lewis C. Howe, Tianming Liu
A Practical Examination of AI-Generated Text Detectors for Large Language Models
Brian Tufts, Xuandong Zhao, Lei Li
The Prompt Canvas: A Literature-Based Practitioner Guide for Creating Effective Prompts in Large Language Models
Michael Hewing, Vincent Leinhos
Flash Communication: Reducing Tensor Parallelization Bottleneck for Fast Large Language Model Inference
Qingyuan Li, Bo Zhang, Liang Ye, Yifan Zhang, Wei Wu, Yerui Sun, Lin Ma, Yuchen Xie
C²LEVA: Toward Comprehensive and Contamination-Free Language Model Evaluation
Yanyang Li, Tin Long Wong, Cheung To Hung, Jianqiao Zhao, Duo Zheng, Ka Wai Liu, Michael R. Lyu, Liwei Wang
Rethinking Time Series Forecasting with LLMs via Nearest Neighbor Contrastive Learning
Jayanie Bogahawatte, Sachith Seneviratne, Maneesha Perera, Saman Halgamuge
GUIDE: A Global Unified Inference Engine for Deploying Large Language Models in Heterogeneous Environments
Yanyu Chen, Ganhong Huang
Foundation Models for Low-Resource Language Education (Vision Paper)
Zhaojun Ding, Zhengliang Liu, Hanqi Jiang, Yizhu Gao, Xiaoming Zhai, Tianming Liu, Ninghao Liu
Ltri-LLM: Streaming Long Context Inference for LLMs with Training-Free Dynamic Triangular Attention Pattern
Hongyin Tang, Di Xiu, Lanrui Wang, Xiurui Geng, Jingang Wang, Xunliang Cai
BESSTIE: A Benchmark for Sentiment and Sarcasm Classification for Varieties of English
Dipankar Srirag, Aditya Joshi, Jordan Painter, Diptesh Kanojia
Transformers Struggle to Learn to Search
Abulhair Saparov, Srushti Pawar, Shreyas Pimpalgaonkar, Nitish Joshi, Richard Yuanzhe Pang, Vishakh Padmakumar, Seyed Mehran Kazemi, Najoung Kim, He He
Privacy-Preserving Retrieval Augmented Generation with Differential Privacy
Tatsuki Koga, Ruihan Wu, Kamalika Chaudhuri
Smoothie: Label Free Language Model Routing
Neel Guha, Mayee F. Chen, Trevor Chow, Ishan S. Khare, Christopher Ré
Improving LLM Group Fairness on Tabular Data via In-Context Learning
Valeriia Cherepanova, Chia-Jung Lee, Nil-Jana Akpinar, Riccardo Fogliato, Martin Andres Bertran, Michael Kearns, James Zou
Give me Some Hard Questions: Synthetic Data Generation for Clinical QA
Fan Bai, Keith Harrigian, Joel Stremmel, Hamid Hassanzadeh, Ardavan Saeedi, Mark Dredze
EgoPlan-Bench2: A Benchmark for Multimodal Large Language Model Planning in Real-World Scenarios
Lu Qiu, Yuying Ge, Yi Chen, Yixiao Ge, Ying Shan, Xihui Liu
Understanding Hidden Computations in Chain-of-Thought Reasoning
Aryasomayajula Ram Bharadwaj
Retrieval-Augmented Machine Translation with Unstructured Knowledge
Jiaan Wang, Fandong Meng, Yingxue Zhang, Jie Zhou
Liquid: Language Models are Scalable Multi-modal Generators
Junfeng Wu, Yi Jiang, Chuofan Ma, Yuliang Liu, Hengshuang Zhao, Zehuan Yuan, Song Bai, Xiang Bai
The Hyperfitting Phenomenon: Sharpening and Stabilizing LLMs for Open-Ended Text Generation
Fredrik Carlsson, Fangyu Liu, Daniel Ward, Murathan Kurfali, Joakim Nivre