Pre-Trained Models
Pre-trained models are large-scale foundation models trained on massive datasets and subsequently adapted to specific downstream tasks through techniques such as full fine-tuning or parameter-efficient fine-tuning (PEFT). Current research focuses on improving the efficiency and effectiveness of these adaptation methods, exploring architectures such as Vision Transformers and diffusion models, and developing algorithms such as LoRA and its nonlinear extensions that minimize resource consumption while preserving performance. This work is crucial for applications ranging from medical image analysis and environmental sound classification to autonomous driving and natural language processing, because it enables high-performing models to be built with limited data and compute.
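To make the idea behind PEFT methods such as LoRA concrete, the sketch below shows a minimal low-rank adapter around a frozen linear layer. It is an illustrative example only, assuming PyTorch; the layer sizes, rank, and scaling are arbitrary and are not drawn from any of the papers listed here.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Wraps a frozen pre-trained linear layer with a trainable low-rank update."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        # Freeze the pre-trained weights; only the low-rank factors are trained.
        for p in self.base.parameters():
            p.requires_grad = False
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen pre-trained projection plus the scaled low-rank correction.
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling


if __name__ == "__main__":
    layer = LoRALinear(nn.Linear(768, 768), rank=8)
    out = layer(torch.randn(4, 768))
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    total = sum(p.numel() for p in layer.parameters())
    print(out.shape, f"trainable params: {trainable}/{total}")
```

With rank 8, only the two small factor matrices are updated during adaptation, which is why such methods cut memory and compute compared with full fine-tuning of the base weights.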
Papers
Multi-modal Mood Reader: Pre-trained Model Empowers Cross-Subject Emotion Recognition
Yihang Dong, Xuhang Chen, Yanyan Shen, Michael Kwok-Po Ng, Tao Qian, Shuqiang Wang
An Empirical Analysis of Forgetting in Pre-trained Models with Incremental Low-Rank Updates
Albin Soutif-Cormerais, Simone Magistri, Joost van de Weijer, Andrew D. Bagdanov
Predicting the Impact of Model Expansion through the Minima Manifold: A Loss Landscape Perspective
Pranshu Malviya, Jerry Huang, Quentin Fournier, Sarath Chandar
Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training
Wenyu Du, Tongxu Luo, Zihan Qiu, Zeyu Huang, Yikang Shen, Reynold Cheng, Yike Guo, Jie Fu
Parameter Efficient Fine Tuning: A Comprehensive Analysis Across Applications
Charith Chandra Sai Balne, Sreyoshi Bhaduri, Tamoghna Roy, Vinija Jain, Aman Chadha
IMO: Greedy Layer-Wise Sparse Representation Learning for Out-of-Distribution Text Classification with Pre-trained Models
Tao Feng, Lizhen Qu, Zhuang Li, Haolan Zhan, Yuncheng Hua, Gholamreza Haffari