Fine-Tuning
Fine-tuning adapts pre-trained large language models (LLMs) to specific tasks, achieving better task performance at far lower cost than training from scratch. Current research emphasizes parameter-efficient methods such as low-rank adaptation (LoRA) and techniques that address catastrophic forgetting and calibration, often using bilevel optimization or adaptive noise allocation to improve performance and preserve privacy. This work matters because it enables powerful LLMs to be deployed across diverse applications, from medical diagnosis to visual editing, while easing resource constraints and privacy concerns.
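Since the summary centers on LoRA, here is a minimal PyTorch sketch of the core idea: freeze the pre-trained weight matrix and train only a low-rank update (B·A) added to its output. This is an illustration of the general technique, not the implementation from any paper listed below; the class name `LoRALinear` and the hyperparameters `r` and `alpha` are illustrative choices.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen linear layer with a trainable low-rank update:
    y = W x + (alpha / r) * B A x, where only A and B are trained."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze pre-trained weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.scaling = alpha / r
        # A starts small and random, B starts at zero, so the
        # low-rank update is initially a no-op and training is stable.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

# Usage: only 2 * r * d parameters are trainable per wrapped layer.
layer = LoRALinear(nn.Linear(768, 768), r=8)
x = torch.randn(4, 768)
print(layer(x).shape)  # torch.Size([4, 768])
```

With rank r much smaller than the hidden dimension, the trainable parameter count drops by orders of magnitude, which is why variants of this scheme (such as AdaMix below) dominate current parameter-efficient fine-tuning work.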
Papers
Where to start? Analyzing the potential value of intermediate models
Leshem Choshen, Elad Venezian, Shachar Don-Yehia, Noam Slonim, Yoav Katz
AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning
Yaqing Wang, Sahaj Agarwal, Subhabrata Mukherjee, Xiaodong Liu, Jing Gao, Ahmed Hassan Awadallah, Jianfeng Gao
Effective Cross-Task Transfer Learning for Explainable Natural Language Inference with T5
Irina Bigoulaeva, Rachneet Sachdeva, Harish Tayyar Madabushi, Aline Villavicencio, Iryna Gurevych
A baseline revisited: Pushing the limits of multi-segment models for context-aware translation
Suvodeep Majumder, Stanislas Lauly, Maria Nadejde, Marcello Federico, Georgiana Dinu
lo-fi: distributed fine-tuning without communication
Mitchell Wortsman, Suchin Gururangan, Shen Li, Ali Farhadi, Ludwig Schmidt, Michael Rabbat, Ari S. Morcos
Improving Stability of Fine-Tuning Pretrained Language Models via Component-Wise Gradient Norm Clipping
Chenghao Yang, Xuezhe Ma
Learning from the Dictionary: Heterogeneous Knowledge Guided Fine-tuning for Chinese Spell Checking
Yinghui Li, Shirong Ma, Qingyu Zhou, Zhongli Li, Yangning Li, Shulin Huang, Ruiyang Liu, Chao Li, Yunbo Cao, Haitao Zheng