Pre-Training
Pre-training trains large models on massive datasets so they learn generalizable features before being fine-tuned for specific tasks. Current research focuses on improving data efficiency through carefully curated datasets, task-oriented pre-training, and novel data-selection methods, often built on transformer architectures and contrastive learning. These advances aim to reduce computational cost and improve model performance across diverse domains, from natural language processing and computer vision to medical imaging and graph analysis. The ultimate goal is more robust, efficient, and adaptable models with a smaller environmental footprint.
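To make the pre-train-then-fine-tune workflow described above concrete, here is a minimal sketch. It is not drawn from any of the papers listed below: it uses PyTorch with random tensors as stand-in data, a SimCLR-style contrastive (NT-Xent) loss for the pre-training stage, and a linear head for fine-tuning. All module sizes, hyperparameters, and the synthetic data are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Small encoder standing in for a large pre-trained backbone (assumed sizes).
encoder = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 64))

def nt_xent(z1, z2, temperature=0.1):
    """SimCLR-style contrastive loss: the two views of each example are
    positives; every other embedding in the batch is a negative."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2]), dim=1)            # (2n, d)
    sim = (z @ z.t()) / temperature                         # pairwise similarities
    mask = torch.eye(2 * n, dtype=torch.bool)
    sim = sim.masked_fill(mask, float("-inf"))              # ignore self-similarity
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# --- Stage 1: contrastive pre-training on unlabeled data ---
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
for _ in range(100):
    x = torch.randn(32, 128)                                # unlabeled batch (stand-in)
    v1 = x + 0.1 * torch.randn_like(x)                      # two "augmented" views
    v2 = x + 0.1 * torch.randn_like(x)
    loss = nt_xent(encoder(v1), encoder(v2))
    opt.zero_grad()
    loss.backward()
    opt.step()

# --- Stage 2: supervised fine-tuning on a small labeled task ---
head = nn.Linear(64, 10)
opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-4)
for _ in range(20):
    x = torch.randn(32, 128)                                # labeled batch (stand-in)
    y = torch.randint(0, 10, (32,))
    loss = F.cross_entropy(head(encoder(x)), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The two stages differ only in objective and data: the encoder's weights learned during contrastive pre-training are reused and updated at a lower learning rate during fine-tuning, which is the efficiency argument the papers below build on.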
Papers
Intuitive Multilingual Audio-Visual Speech Recognition with a Single-Trained Model
Joanna Hong, Se Jin Park, Yong Man Ro
ULTRA-DP: Unifying Graph Pre-training with Multi-task Graph Dual Prompt
Mouxiang Chen, Zemin Liu, Chenghao Liu, Jundong Li, Qiheng Mao, Jianling Sun
Pre-Training LiDAR-Based 3D Object Detectors Through Colorization
Tai-Yu Pan, Chenyang Ma, Tianle Chen, Cheng Perng Phoo, Katie Z Luo, Yurong You, Mark Campbell, Kilian Q. Weinberger, Bharath Hariharan, Wei-Lun Chao
Toward a Foundation Model for Time Series Data
Chin-Chia Michael Yeh, Xin Dai, Huiyuan Chen, Yan Zheng, Yujie Fan, Audrey Der, Vivian Lai, Zhongfang Zhuang, Junpeng Wang, Liang Wang, Wei Zhang
Pre-Training and Fine-Tuning Generative Flow Networks
Ling Pan, Moksh Jain, Kanika Madan, Yoshua Bengio
Fragment-based Pretraining and Finetuning on Molecular Graphs
Kha-Dinh Luong, Ambuj Singh
Clinical Text Deduplication Practices for Efficient Pretraining and Improved Clinical Tasks
Isotta Landi, Eugenia Alleva, Alissa A. Valentine, Lauren A. Lepow, Alexander W. Charney
Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks
Hao Chen, Jindong Wang, Ankit Shah, Ran Tao, Hongxin Wei, Xing Xie, Masashi Sugiyama, Bhiksha Raj