Pre-Trained Models
Pre-trained models are foundational large-scale models trained on massive datasets, subsequently adapted for specific downstream tasks using techniques like fine-tuning or parameter-efficient fine-tuning (PEFT). Current research emphasizes improving the efficiency and effectiveness of these adaptation methods, exploring architectures such as Vision Transformers and diffusion models, and developing algorithms like LoRA and its nonlinear extensions to minimize resource consumption while maximizing performance. This field is crucial for advancing various applications, from medical image analysis and environmental sound classification to autonomous driving and natural language processing, by enabling the development of high-performing models with limited data and computational resources.
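To make the parameter-efficient fine-tuning idea mentioned above concrete, below is a minimal sketch of a LoRA-style adapter in PyTorch. It is illustrative only, not the method of any paper listed here: the class name, rank, scaling, and layer dimensions are assumptions chosen for clarity. The key point is that the pre-trained weight stays frozen and only a small low-rank update is trained.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Wraps a frozen pre-trained linear layer with a trainable low-rank update.

    Adapted forward pass: y = W x + (alpha / r) * B A x, where W is the frozen
    pre-trained weight and A, B are small trainable matrices of rank r.
    (Illustrative sketch; names and defaults are assumptions.)
    """

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze pre-trained weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        # Low-rank factors: A projects down to rank r, B projects back up.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen pre-trained path plus scaled low-rank update.
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)


# Hypothetical usage: adapt a single projection of a pre-trained model.
layer = nn.Linear(768, 768)            # stand-in for a pre-trained weight matrix
adapted = LoRALinear(layer, r=8)
x = torch.randn(2, 768)
y = adapted(x)
trainable = sum(p.numel() for p in adapted.parameters() if p.requires_grad)
print(y.shape, trainable)              # only the low-rank factors are trainable
```

In this sketch, a 768x768 layer (about 590K frozen parameters) is adapted with roughly 12K trainable parameters, which is the kind of resource saving the adaptation methods above aim for.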
Papers
Let Me DeCode You: Decoder Conditioning with Tabular Data
Tomasz Szczepański, Michal K. Grzeszczyk, Szymon Płotka, Arleta Adamowicz, Piotr Fudalej, Przemysław Korzeniowski, Tomasz Trzciński, Arkadiusz Sitek
Movie Recommendation with Poster Attention via Multi-modal Transformer Feature Fusion
Linhan Xia, Yicheng Yang, Ziou Chen, Zheng Yang, Shengxin Zhu
Constructing Concept-based Models to Mitigate Spurious Correlations with Minimal Human Effort
Jeeyung Kim, Ze Wang, Qiang Qiu
Self-Evaluation as a Defense Against Adversarial Attacks on LLMs
Hannah Brown, Leon Lin, Kenji Kawaguchi, Michael Shieh
SAFT: Towards Out-of-Distribution Generalization in Fine-Tuning
Bac Nguyen, Stefan Uhlich, Fabien Cardinaux, Lukas Mauch, Marzieh Edraki, Aaron Courville
Knowledge Composition using Task Vectors with Learned Anisotropic Scaling
Frederic Z. Zhang, Paul Albert, Cristian Rodriguez-Opazo, Anton van den Hengel, Ehsan Abbasnejad
Single Parent Family: A Spectrum of Family Members from a Single Pre-Trained Foundation Model
Habib Hajimolahoseini, Mohammad Hassanpour, Foozhan Ataiefard, Boxing Chen, Yang Liu
Structure-aware World Model for Probe Guidance via Large-scale Self-supervised Pre-train
Haojun Jiang, Meng Li, Zhenguo Sun, Ning Jia, Yu Sun, Shaqi Luo, Shiji Song, Gao Huang
DPO: Dual-Perturbation Optimization for Test-time Adaptation in 3D Object Detection
Zhuoxiao Chen, Zixin Wang, Yadan Luo, Sen Wang, Zi Huang
Federating to Grow Transformers with Constrained Resources without Model Sharing
Shikun Shen, Yifei Zou, Yuan Yuan, Yanwei Zheng, Peng Li, Xiuzhen Cheng, Dongxiao Yu