Pre Trained Transformer
Pre-trained transformer models are foundational neural networks achieving state-of-the-art results across diverse tasks by leveraging massive datasets for initial training, followed by fine-tuning for specific applications. Current research emphasizes improving efficiency, including parameter reduction techniques like low-rank factorization and early exit strategies, and exploring effective transfer learning methods across modalities (e.g., image to video, text to speech). This work is significant because it enables the application of powerful transformer architectures to resource-constrained settings and expands their utility beyond their original training domains, impacting fields from natural language processing and computer vision to medical image analysis and even military strategy.
Papers
Language models are good pathologists: using attention-based sequence reduction and text-pretrained transformers for efficient WSI classification
Juan I. Pisula, Katarzyna Bozek
Finding Skill Neurons in Pre-trained Transformer-based Language Models
Xiaozhi Wang, Kaiyue Wen, Zhengyan Zhang, Lei Hou, Zhiyuan Liu, Juanzi Li
Foreground Guidance and Multi-Layer Feature Fusion for Unsupervised Object Discovery with Transformers
Zhiwei Lin, Zengyu Yang, Yongtao Wang
Self-supervised Rewiring of Pre-trained Speech Encoders: Towards Faster Fine-tuning with Less Labels in Speech Processing
Hao Yang, Jinming Zhao, Gholamreza Haffari, Ehsan Shareghi