Pre-Training
Pre-training trains large models on massive datasets to learn generalizable features before they are fine-tuned for specific tasks. Current research focuses on improving data efficiency through carefully curated datasets, task-oriented pre-training, and novel data selection methods, often built on transformer architectures and contrastive learning. These advances aim to reduce computational costs and improve model performance across diverse domains, from natural language processing and computer vision to medical imaging and graph analysis. The ultimate goal is more robust, efficient, and adaptable models with reduced environmental impact.
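As a rough illustration of the pre-train-then-fine-tune workflow described above, the sketch below loads a pre-trained transformer and fine-tunes it with a task-specific classification head. It assumes the Hugging Face `transformers` and `torch` libraries; the checkpoint name, toy data, and hyperparameters are illustrative placeholders and are not taken from any of the papers listed below.

```python
# Minimal sketch of the pre-train -> fine-tune workflow (placeholder
# checkpoint, toy data, and hyperparameters; not from the listed papers).
import torch
from torch.optim import AdamW
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Start from generalizable features learned during large-scale pre-training.
checkpoint = "bert-base-uncased"  # hypothetical choice of pre-trained model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Then fine-tune the whole network on a small task-specific dataset.
texts = ["the movie was great", "the movie was terrible"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

optimizer = AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few passes over the toy batch
    outputs = model(**batch, labels=labels)
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

The same pattern applies beyond text classification: the pre-trained backbone supplies general features, and only the task head and fine-tuning data change per downstream task.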
Papers
Parallel Structures in Pre-training Data Yield In-Context Learning
Yanda Chen, Chen Zhao, Zhou Yu, Kathleen McKeown, He He
Amplifying Training Data Exposure through Fine-Tuning with Pseudo-Labeled Memberships
Myung Gyo Oh, Hong Eun Ahn, Leo Hyun Park, Taekyoung Kwon
Endowing Pre-trained Graph Models with Provable Fairness
Zhongjian Zhang, Mengmei Zhang, Yue Yu, Cheng Yang, Jiawei Liu, Chuan Shi
AEROBLADE: Training-Free Detection of Latent Diffusion Images Using Autoencoder Reconstruction Error
Jonas Ricker, Denis Lukovnikov, Asja Fischer
SNP-S3: Shared Network Pre-training and Significant Semantic Strengthening for Various Video-Text Tasks
Xingning Dong, Qingpei Guo, Tian Gan, Qing Wang, Jianlong Wu, Xiangyuan Ren, Yuan Cheng, Wei Chu
A Medical Data-Effective Learning Benchmark for Highly Efficient Pre-training of Foundation Models
Wenxuan Yang, Weimin Tan, Yuqi Sun, Bo Yan
How Useful is Continued Pre-Training for Generative Unsupervised Domain Adaptation?
Rheeya Uppaal, Yixuan Li, Junjie Hu