Large Language Model
Large language models (LLMs) are AI systems trained to process and generate human-like text across a wide range of natural language processing tasks. Current research focuses on making LLMs safer, fairer, and more efficient (through techniques such as quantization and optimized decoding), and on improving their ability to reason through complex problems and follow diverse instructions. This work matters because it addresses known limitations of current LLMs and supports broader use in fields such as healthcare, legal tech, and autonomous systems.
Papers
SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe
Yuxin Xiao, Shujian Zhang, Wenxuan Zhou, Marzyeh Ghassemi, Sanqiang Zhao
LLMs Are In-Context Reinforcement Learners
Giovanni Monea, Antoine Bosselut, Kianté Brantley, Yoav Artzi
ChatVis: Automating Scientific Visualization with a Large Language Model
Tanwi Mallick, Orcun Yildiz, David Lenz, Tom Peterka
Precise Model Benchmarking with Only a Few Observations
Riccardo Fogliato, Pratik Patil, Nil-Jana Akpinar, Mathew Monfort
Density estimation with LLMs: a geometric investigation of in-context learning trajectories
Toni J.B. Liu, Nicolas Boullé, Raphaël Sarfati, Christopher J. Earls
Enhancing Equity in Large Language Models for Medical Applications
Yuelyu Ji, Wenhe Ma, Sonish Sivarajkumar, Hang Zhang, Eugene Mathew Sadhu, Zhuochun Li, Xizhi Wu, Shyam Visweswaran, Yanshan Wang
Efficient Inference for Large Language Model-based Generative Recommendation
Xinyu Lin, Chaoqun Yang, Wenjie Wang, Yongqi Li, Cunxiao Du, Fuli Feng, See-Kiong Ng, Tat-Seng Chua
Falcon Mamba: The First Competitive Attention-free 7B Language Model
Jingwei Zuo, Maksim Velikanov, Dhia Eddine Rhaiem, Ilyas Chahed, Younes Belkada, Guillaume Kunsch, Hakim Hacid
Investigating large language models for their competence in extracting grammatically sound sentences from transcribed noisy utterances
Alina Wróblewska
Explanation sensitivity to the randomness of large language models: the case of journalistic text classification
Jeremie Bogaert, Marie-Catherine de Marneffe, Antonin Descampe, Louis Escouflaire, Cedrick Fairon, Francois-Xavier Standaert
TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention
Lijie Yang, Zhihao Zhang, Zhuofu Chen, Zikun Li, Zhihao Jia
Initialization of Large Language Models via Reparameterization to Mitigate Loss Spikes
Kosuke Nishida, Kyosuke Nishida, Kuniko Saito
In-the-loop Hyper-Parameter Optimization for LLM-Based Automated Design of Heuristics
Niki van Stein, Diederick Vermetten, Thomas Bäck
Can LLMs plan paths with extra hints from solvers?
Erik Wu, Sayan Mitra
FAME: Towards Factual Multi-Task Model Editing
Li Zeng, Yingyu Shan, Zeming Liu, Jiashu Yao, Yuhang Guo
Leverage Knowledge Graph and Large Language Model for Law Article Recommendation: A Case Study of Chinese Criminal Law
Yongming Chen, Miner Chen, Ye Zhu, Juan Pei, Siyu Chen, Yu Zhou, Yi Wang, Yifan Zhou, Hao Li, Songan Zhang
Intent Classification for Bank Chatbots through LLM Fine-Tuning
Bibiána Lajčinová, Patrik Valábek, Michal Spišiak
SoK: Towards Security and Safety of Edge AI
Tatjana Wingarz, Anne Lauscher, Janick Edinger, Dominik Kaaser, Stefan Schulte, Mathias Fischer
Strong Model Collapse
Elvis Dohmatob, Yunzhen Feng, Julia Kempe
MINER: Mining the Underlying Pattern of Modality-Specific Neurons in Multimodal Large Language Models
Kaichen Huang, Jiahao Huo, Yibo Yan, Kun Wang, Yutao Yue, Xuming Hu