Fine-Tuning
Fine-tuning adapts pre-trained large language models (LLMs) to specific tasks, achieving strong performance at a fraction of the cost of training from scratch. Current research emphasizes parameter-efficient methods such as low-rank adaptation (LoRA), along with techniques that address catastrophic forgetting and calibration, often using bilevel optimization or adaptive noise allocation to improve performance and privacy. This work matters because it enables powerful LLMs to be deployed across diverse applications, from medical diagnosis to visual editing, while mitigating resource constraints and privacy concerns.
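To make the LoRA idea mentioned above concrete, here is a minimal sketch in plain PyTorch: a frozen pre-trained linear layer is augmented with a trainable low-rank update, so only a small fraction of parameters is optimized during fine-tuning. The class name LoRALinear, the rank r, and the scaling alpha are illustrative assumptions, not the API of any particular library or of the papers listed below.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA-style adapter (illustrative sketch, not a library API).

    The pre-trained weight stays frozen; only the rank-r matrices A and B
    are trained, so the effective weight is W + (alpha / r) * B @ A.
    """
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pre-trained weights
        self.scaling = alpha / r
        # A starts small and random, B starts at zero, so the adapted layer
        # initially reproduces the pre-trained layer exactly.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen base output plus the scaled low-rank correction.
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

if __name__ == "__main__":
    layer = LoRALinear(nn.Linear(512, 512), r=8)
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    total = sum(p.numel() for p in layer.parameters())
    print(f"trainable params: {trainable} / {total}")  # 8,192 of ~271k
```

Because B is initialized to zero, training starts from the pre-trained model's behaviour, and only the low-rank matrices consume optimizer state and gradient memory, which is what makes this family of methods attractive under tight resource constraints.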
Papers
BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models
Qijun Luo, Hengxu Yu, Xiao Li
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models
Fanxu Meng, Zhaohui Wang, Muhan Zhang
ANGOFA: Leveraging OFA Embedding Initialization and Synthetic Data for Angolan Language Model
Osvaldo Luamba Quinjica, David Ifeoluwa Adelani
LayerNorm: A key component in parameter-efficient fine-tuning
Taha ValizadehAslani, Hualou Liang
A Systematic Analysis of Subwords and Cross-Lingual Transfer in Multilingual Translation
Francois Meyer, Jan Buys
Are LLMs Effective Backbones for Fine-tuning? An Experimental Investigation of Supervised LLMs on Chinese Short Text Matching
Shulin Liu, Chengcheng Xu, Hao Liu, Tinghao Yu, Tao Yang
Faster Convergence for Transformer Fine-tuning with Line Search Methods
Philip Kenneweg, Leonardo Galli, Tristan Kenneweg, Barbara Hammer
Selective Mixup Fine-Tuning for Optimizing Non-Decomposable Objectives
Shrinivas Ramasubramanian, Harsh Rangwani, Sho Takemori, Kunal Samanta, Yuhei Umeda, Venkatesh Babu Radhakrishnan
Improving Pre-trained Language Model Sensitivity via Mask Specific losses: A case study on Biomedical NER
Micheal Abaho, Danushka Bollegala, Gary Leeming, Dan Joyce, Iain E Buchan
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
Rui Pan, Xiang Liu, Shizhe Diao, Renjie Pi, Jipeng Zhang, Chi Han, Tong Zhang
What explains the success of cross-modal fine-tuning with ORCA?
Paloma García-de-Herreros, Vagrant Gautam, Philipp Slusallek, Dietrich Klakow, Marius Mosbach
HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models
Wenqiao Zhang, Tianwei Lin, Jiang Liu, Fangxun Shu, Haoyuan Li, Lei Zhang, He Wanggui, Hao Zhou, Zheqi Lv, Hao Jiang, Juncheng Li, Siliang Tang, Yueting Zhuang
FissionFusion: Fast Geometric Generation and Hierarchical Souping for Medical Image Analysis
Santosh Sanjeev, Nuren Zhaksylyk, Ibrahim Almakky, Anees Ur Rehman Hashmi, Mohammad Areeb Qazi, Mohammad Yaqub
Adaptive Ensembles of Fine-Tuned Transformers for LLM-Generated Text Detection
Zhixin Lai, Xuesheng Zhang, Suiyao Chen