Transformer Megatron Decepticons
Transformer models are being investigated for a widening range of sequence processing tasks, moving beyond natural language processing into time series forecasting, image recognition, and scientific computing applications such as solving partial differential equations. Current research focuses on improving efficiency (e.g., through mixed-precision quantization and optimized architectures), enhancing generalization to longer sequences, and understanding the mechanisms behind in-context learning. Together, these advances improve the accuracy and efficiency of downstream applications while deepening the theoretical understanding of how these models work.
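For readers new to the architecture these papers modify and accelerate, the sketch below shows the core operation they share: scaled dot-product self-attention. It is a minimal, NumPy-only illustration, not code from any of the listed papers; the function and variable names are assumptions chosen for clarity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention for a sequence x of shape (seq_len, d_model)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v        # project tokens to queries/keys/values
    scores = q @ k.T / np.sqrt(k.shape[-1])    # pairwise similarities, scaled by sqrt(d_k)
    weights = softmax(scores, axis=-1)         # attention distribution per query token
    return weights @ v                         # weighted sum of value vectors

# Toy usage with random projection weights (illustrative shapes only).
rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # -> (4, 8)
```

Efficiency-oriented work in the list below (e.g., accelerator co-design or attention redesign) typically targets the quadratic scores matrix in this computation, since its cost grows with the square of the sequence length.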
Papers
Accelerating Trajectory Generation for Quadrotors Using Transformers
Srinath Tankasala, Mitch Pryor
Core-Periphery Principle Guided Redesign of Self-Attention in Transformers
Xiaowei Yu, Lu Zhang, Haixing Dai, Yanjun Lyu, Lin Zhao, Zihao Wu, David Liu, Tianming Liu, Dajiang Zhu
TransCODE: Co-design of Transformers and Accelerators for Efficient Training and Inference
Shikhar Tuli, Niraj K. Jha
DeepVecFont-v2: Exploiting Transformers to Synthesize Vector Fonts with Higher Quality
Yuqing Wang, Yizhi Wang, Longhui Yu, Yuesheng Zhu, Zhouhui Lian
Prompt-Guided Transformers for End-to-End Open-Vocabulary Object Detection
Hwanjun Song, Jihwan Bang
Supervised Masked Knowledge Distillation for Few-Shot Transformers
Han Lin, Guangxing Han, Jiawei Ma, Shiyuan Huang, Xudong Lin, Shih-Fu Chang
Machine Learning for Brain Disorders: Transformers and Visual Transformers
Robin Courant, Maika Edberg, Nicolas Dufour, Vicky Kalogeiton
Transformers in Speech Processing: A Survey
Siddique Latif, Aun Zaidi, Heriberto Cuayahuitl, Fahad Shamshad, Moazzam Shoukat, Junaid Qadir
ModEFormer: Modality-Preserving Embedding for Audio-Video Synchronization using Transformers
Akash Gupta, Rohun Tripathi, Wondong Jang