Pre-Trained Transformer

Pre-trained transformer models are foundational neural networks that achieve state-of-the-art results across diverse tasks by first training on massive datasets and then fine-tuning for specific applications. Current research emphasizes efficiency, including parameter-reduction techniques such as low-rank factorization and early-exit strategies, as well as effective transfer learning across modalities (e.g., image to video, text to speech). This work is significant because it brings powerful transformer architectures to resource-constrained settings and extends their utility beyond the original training domains, with impact on fields ranging from natural language processing and computer vision to medical image analysis and even military strategy.
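
To make the parameter-reduction idea concrete, below is a minimal PyTorch sketch of low-rank factorization for fine-tuning, in the spirit of LoRA-style adapters: the pre-trained weights are frozen and only a small low-rank update is trained. The class name `LowRankAdapter`, the rank and scaling values, and the 768-dimensional layer are illustrative assumptions, not drawn from any specific paper listed here.

```python
# Illustrative sketch only: a low-rank trainable update added to a frozen
# pre-trained linear layer. Names and hyperparameters are assumptions.
import torch
import torch.nn as nn

class LowRankAdapter(nn.Module):
    """Wraps a frozen pre-trained linear layer with a trainable
    low-rank update: y = W x + scale * (B A) x, with rank r << d."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze the pre-trained weights
            p.requires_grad = False
        d_in, d_out = base.in_features, base.out_features
        # Low-rank factors: only rank * (d_in + d_out) new parameters
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, rank))  # zero init: starts as a no-op
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen base output plus the scaled low-rank correction
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

# Usage: stand-in for one projection inside a pre-trained transformer.
layer = nn.Linear(768, 768)
adapted = LowRankAdapter(layer, rank=8)
x = torch.randn(2, 768)
print(adapted(x).shape)  # torch.Size([2, 768])
```

Because only the factors `A` and `B` are trainable, fine-tuning touches a small fraction of the original parameter count, which is what makes such methods attractive in resource-constrained settings.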

Papers