Transformer Megatron Decepticons
Transformer models are being extensively investigated for sequence processing tasks beyond natural language processing, including time series forecasting, image recognition, and scientific computing applications such as solving partial differential equations. Current research focuses on improving efficiency (e.g., through mixed-precision quantization and optimized architectures), enhancing generalization (particularly to longer sequences), and understanding the mechanisms underlying in-context learning. These advances improve the accuracy and efficiency of applications across many fields while deepening the theoretical understanding of these models.
Papers
Emotion Recognition Using Transformers with Masked Learning
Seongjae Min, Junseok Yang, Sangjun Lim, Junyong Lee, Sangwon Lee, Sejoon Lim
Simple Hack for Transformers against Heavy Long-Text Classification on a Time- and Memory-Limited GPU Service
Mirza Alim Mutasodirin, Radityo Eko Prasojo, Achmad F. Abka, Hanif Rasyidi
Simulating Weighted Automata over Sequences and Trees with Transformers
Michael Rizvi, Maude Lizaire, Clara Lacroce, Guillaume Rabusseau
LaB-GATr: geometric algebra transformers for large biomedical surface and volume meshes
Julian Suk, Baris Imre, Jelmer M. Wolterink
LookupFFN: Making Transformers Compute-lite for CPU inference
Zhanpeng Zeng, Michael Davies, Pranav Pulijala, Karthikeyan Sankaralingam, Vikas Singh