Transformer-Based
Transformer-based models are reshaping many fields by using self-attention to capture long-range dependencies in sequential data, and they now achieve state-of-the-art results in tasks ranging from natural language processing and image recognition to time series forecasting and robotic control. Current research focuses on improving efficiency (e.g., through quantization and optimized architectures), enhancing generalization, and addressing challenges such as handling long sequences and endogeneity. These advances are yielding more accurate, efficient, and robust models across many scientific communities and practical applications.
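For context, the self-attention mechanism referenced above can be sketched in a few lines: every position in a sequence computes a weighted mixture of all other positions, which is what lets transformers relate distant elements directly. The following minimal single-head example uses NumPy; the function name, shapes, and weight matrices are illustrative assumptions, not code from any of the papers listed below.

```python
# Minimal sketch of scaled dot-product self-attention (illustrative only).
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention over a sequence x of shape (seq_len, d_model)."""
    q = x @ w_q                                   # queries (seq_len, d_k)
    k = x @ w_k                                   # keys    (seq_len, d_k)
    v = x @ w_v                                   # values  (seq_len, d_v)
    scores = q @ k.T / np.sqrt(q.shape[-1])       # similarity between every pair of positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over key positions
    return weights @ v                            # each output mixes information from all positions

# Toy usage: 5 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (5, 8): every position attends to every other, regardless of distance
```

Because the attention weights are computed between all pairs of positions, the cost grows quadratically with sequence length, which is one reason the efficiency and long-sequence challenges mentioned above are active research topics.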
Papers
Evolutionary Neural Architecture Search for Transformer in Knowledge Tracing
Shangshang Yang, Xiaoshan Yu, Ye Tian, Xueming Yan, Haiping Ma, Xingyi Zhang
Linear attention is (maybe) all you need (to understand transformer optimization)
Kwangjun Ahn, Xiang Cheng, Minhak Song, Chulhee Yun, Ali Jadbabaie, Suvrit Sra
RBFormer: Improve Adversarial Robustness of Transformer by Robust Bias
Hao Cheng, Jinhao Duan, Hui Li, Lyutianyang Zhang, Jiahang Cao, Ping Wang, Jize Zhang, Kaidi Xu, Renjing Xu
Spatial-Temporal Knowledge-Embedded Transformer for Video Scene Graph Generation
Tao Pu, Tianshui Chen, Hefeng Wu, Yongyi Lu, Liang Lin
Transformer-based Image Compression with Variable Image Quality Objectives
Chia-Hao Kao, Yi-Hsin Chen, Cheng Chien, Wei-Chen Chiu, Wen-Hsiao Peng
AMPLIFY: Attention-based Mixup for Performance Improvement and Label Smoothing in Transformer
Leixin Yang, Yu Xiang
TrTr: A Versatile Pre-Trained Large Traffic Model based on Transformer for Capturing Trajectory Diversity in Vehicle Population
Ruyi Feng, Zhibin Li, Bowen Liu, Yan Ding
Vision Transformers for Computer Go
Amani Sagri, Tristan Cazenave, Jérôme Arjonilla, Abdallah Saffidine