Pre-Training Tasks
Pre-training involves training a model on a large dataset before fine-tuning it for a specific task, with the aim of improving efficiency and performance, especially when labeled data is scarce. Current research focuses on optimizing pre-training tasks and architectures, including transformers and graph neural networks, for modalities such as text, images, and time-series data, and explores techniques such as masked autoencoding and contrastive learning. By enabling faster training and better generalization on downstream tasks, this approach has had significant impact across diverse fields, from natural language processing and computer vision to drug discovery and network intrusion detection. However, the effectiveness of pre-training depends heavily on the similarity between the pre-training and downstream tasks, and the optimal approach varies across data types and model architectures.
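A minimal sketch of the pre-train-then-fine-tune recipe described above, using PyTorch and a masked-autoencoding objective: an encoder is first trained to reconstruct randomly masked tokens from unlabeled data, then reused with a small head on a downstream classification task. All model sizes, names, and the toy data here are illustrative assumptions, not any particular paper's setup.

import torch
import torch.nn as nn

VOCAB, D_MODEL, SEQ_LEN, MASK_ID = 100, 64, 16, 0  # hypothetical sizes; id 0 reserved for [MASK]

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL)
        layer = nn.TransformerEncoderLayer(D_MODEL, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, x):                      # x: (batch, seq_len) token ids
        return self.encoder(self.embed(x))     # (batch, seq_len, d_model)

def pretrain_masked_autoencoding(encoder, steps=100):
    """Mask random tokens and train the encoder to reconstruct them (unlabeled data)."""
    head = nn.Linear(D_MODEL, VOCAB)           # reconstruction head, discarded after pre-training
    opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-3)
    for _ in range(steps):
        tokens = torch.randint(1, VOCAB, (32, SEQ_LEN))   # toy unlabeled batch
        mask = torch.rand(tokens.shape) < 0.15            # mask ~15% of positions
        corrupted = tokens.masked_fill(mask, MASK_ID)
        logits = head(encoder(corrupted))
        loss = nn.functional.cross_entropy(logits[mask], tokens[mask])
        opt.zero_grad(); loss.backward(); opt.step()

def finetune_classification(encoder, steps=50, n_classes=3):
    """Reuse the pre-trained encoder; train a small classifier on scarce labeled data."""
    clf = nn.Linear(D_MODEL, n_classes)
    opt = torch.optim.Adam(list(encoder.parameters()) + list(clf.parameters()), lr=1e-4)
    for _ in range(steps):
        tokens = torch.randint(1, VOCAB, (8, SEQ_LEN))    # small labeled toy batch
        labels = torch.randint(0, n_classes, (8,))
        logits = clf(encoder(tokens).mean(dim=1))         # mean-pool sequence features
        loss = nn.functional.cross_entropy(logits, labels)
        opt.zero_grad(); loss.backward(); opt.step()

encoder = Encoder()
pretrain_masked_autoencoding(encoder)   # large unlabeled corpus in practice
finetune_classification(encoder)        # small labeled downstream set

The key design choice is that only the reconstruction head is task-specific; the encoder weights learned during pre-training carry over to the downstream task, which is what yields the faster convergence and better generalization noted above.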
Papers
Improving Information Extraction on Business Documents with Specific Pre-Training Tasks
Thibault Douzon, Stefan Duffner, Christophe Garcia, Jérémy Espinas
Examining the Effect of Pre-training on Time Series Classification
Jiashu Pu, Shiwei Zhao, Ling Cheng, Yongzhu Chang, Runze Wu, Tangjie Lv, Rongsheng Zhang