Fine-Tuning
Fine-tuning adapts pre-trained large language models (LLMs) to specific tasks, delivering strong task performance at a fraction of the cost of training from scratch. Current research emphasizes parameter-efficient methods such as low-rank adaptation (LoRA), alongside techniques that address catastrophic forgetting and calibration issues, often through bilevel optimization or adaptive noise allocation to balance performance and privacy. This work matters because it enables powerful LLMs to be deployed across diverse applications, from medical diagnosis to visual editing, while mitigating resource constraints and privacy concerns.
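To make the LoRA idea mentioned above concrete, here is a minimal sketch of a low-rank adapter around a frozen linear layer in plain PyTorch. The class name, rank, and scaling hyperparameters are illustrative assumptions and are not drawn from any of the papers listed below.

```python
# Minimal LoRA sketch: freeze the pre-trained weight and learn a small
# low-rank update y = W x + (alpha / r) * B (A x). Values below are
# illustrative, not taken from any specific paper in this list.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze pre-trained weights
            p.requires_grad = False
        # A is small-random, B is zero, so the adapter starts as a no-op.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)


if __name__ == "__main__":
    layer = LoRALinear(nn.Linear(768, 768), r=8)
    out = layer(torch.randn(4, 768))
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    print(out.shape, trainable)  # torch.Size([4, 768]) 12288
```

Only the low-rank matrices (here 12,288 parameters versus roughly 590,000 in the frozen layer) are updated during fine-tuning, which is what makes approaches in this family memory-efficient.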
Papers
ReFT: Representation Finetuning for Language Models
Zhengxuan Wu, Aryaman Arora, Zheng Wang, Atticus Geiger, Dan Jurafsky, Christopher D. Manning, Christopher Potts
DreamWalk: Style Space Exploration using Diffusion Guidance
Michelle Shu, Charles Herrmann, Richard Strong Bowen, Forrester Cole, Ramin Zabih
BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models
Qijun Luo, Hengxu Yu, Xiao Li
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models
Fanxu Meng, Zhaohui Wang, Muhan Zhang
ANGOFA: Leveraging OFA Embedding Initialization and Synthetic Data for Angolan Language Model
Osvaldo Luamba Quinjica, David Ifeoluwa Adelani
LayerNorm: A key component in parameter-efficient fine-tuning
Taha ValizadehAslani, Hualou Liang
A Systematic Analysis of Subwords and Cross-Lingual Transfer in Multilingual Translation
Francois Meyer, Jan Buys
Are LLMs Effective Backbones for Fine-tuning? An Experimental Investigation of Supervised LLMs on Chinese Short Text Matching
Shulin Liu, Chengcheng Xu, Hao Liu, Tinghao Yu, Tao Yang
Faster Convergence for Transformer Fine-tuning with Line Search Methods
Philip Kenneweg, Leonardo Galli, Tristan Kenneweg, Barbara Hammer
Selective Mixup Fine-Tuning for Optimizing Non-Decomposable Objectives
Shrinivas Ramasubramanian, Harsh Rangwani, Sho Takemori, Kunal Samanta, Yuhei Umeda, Venkatesh Babu Radhakrishnan
Improving Pre-trained Language Model Sensitivity via Mask Specific losses: A case study on Biomedical NER
Micheal Abaho, Danushka Bollegala, Gary Leeming, Dan Joyce, Iain E Buchan
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
Rui Pan, Xiang Liu, Shizhe Diao, Renjie Pi, Jipeng Zhang, Chi Han, Tong Zhang