Transformer Megatron Decepticons
Transformer models are being extensively investigated for various sequence processing tasks, moving beyond natural language processing to encompass time series forecasting, image recognition, and scientific computing applications like solving partial differential equations. Current research focuses on improving efficiency (e.g., through mixed-precision quantization and optimized architectures), enhancing generalization capabilities (particularly to longer sequences), and understanding the underlying mechanisms of in-context learning. These advancements have significant implications for diverse fields, improving the accuracy and efficiency of numerous applications while simultaneously deepening our theoretical understanding of these powerful models.
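To make the efficiency theme above concrete, here is a minimal, hedged sketch (not taken from any listed paper) of post-training dynamic quantization of a small Transformer encoder in PyTorch: the feed-forward linear layers are converted to int8 while the rest of the model stays in fp32, a simplified stand-in for the resource-aware mixed-precision schemes studied for embedded time-series forecasting.

```python
# Minimal sketch (assumed setup, not from any listed paper): post-training dynamic
# quantization in PyTorch. Only nn.Linear modules are converted to int8; attention
# projections and everything else remain fp32, giving a simple mix of precisions.
import torch
import torch.nn as nn

# A toy encoder standing in for a time-series forecasting backbone.
model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, dim_feedforward=128, batch_first=True),
    num_layers=2,
)
model.eval()

# Replace the feed-forward linear layers with dynamically quantized int8 versions.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

# Both models accept the same (batch, sequence, feature) input.
x = torch.randn(8, 96, 64)
with torch.no_grad():
    y_fp32 = model(x)
    y_int8 = quantized(x)

print("max abs difference after quantization:", (y_fp32 - y_int8).abs().max().item())
```

The quantized model trades a small amount of output fidelity for a smaller memory footprint and faster integer matrix multiplies, which is the basic trade-off the deployment-oriented papers below explore in more refined, hardware-aware forms.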
Papers
Metadata Matters for Time Series: Informative Forecasting with Transformers
Jiaxiang Dong, Haixu Wu, Yuxuan Wang, Li Zhang, Jianmin Wang, Mingsheng Long

Resource-aware Mixed-precision Quantization for Enhancing Deployability of Transformers for Time-series Forecasting on Embedded FPGAs
Tianheng Ling, Chao Qian, Gregor Schiele

Towards Understanding the Universality of Transformers for Next-Token Prediction
Michael E. Sander, Gabriel Peyré

Can Transformers Learn n-gram Language Models?
Anej Svete, Nadav Borenstein, Mike Zhou, Isabelle Augenstein, Ryan Cotterell

Deconstructing Recurrence, Attention, and Gating: Investigating the transferability of Transformers and Gated Recurrent Neural Networks in forecasting of dynamical systems
Hunter Heidenreich, Pantelis R. Vlachas, Petros Koumoutsakos

HATFormer: Historic Handwritten Arabic Text Recognition with Transformers
Adrian Chan, Anupam Mijar, Mehreen Saeed, Chau-Wai Wong, Akram Khater

A Formal Framework for Understanding Length Generalization in Transformers
Xinting Huang, Andy Yang, Satwik Bhattamishra, Yash Sarrof, Andreas Krebs, Hattie Zhou, Preetum Nakkiran, Michael Hahn

Mamba Neural Operator: Who Wins? Transformers vs. State-Space Models for PDEs
Chun-Wun Cheng, Jiahao Huang, Yi Zhang, Guang Yang, Carola-Bibiane Schönlieb, Angelica I Aviles-Rivero

End-to-end Piano Performance-MIDI to Score Conversion with Transformers
Tim Beyer, Angela Dai

AI Foundation Model for Heliophysics: Applications, Design, and Implementation
Sujit Roy, Talwinder Singh, Marcus Freitag, Johannes Schmude, Rohit Lal, Dinesha Hegde, Soumya Ranjan, Amy Lin, Vishal Gaur, Etienne Eben Vos, +7 more