Transformer Megatron Decepticons
Transformer models are being investigated for a widening range of sequence-processing tasks, moving beyond natural language processing to time-series forecasting, image recognition, and scientific computing applications such as solving partial differential equations. Current research focuses on improving efficiency (e.g., through mixed-precision quantization and optimized architectures), strengthening generalization to longer sequences, and understanding the mechanisms underlying in-context learning. These advances improve the accuracy and efficiency of applications across these fields while deepening the theoretical understanding of the models themselves.
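As a rough illustration of the efficiency direction mentioned above, the sketch below applies dynamic int8 quantization to the feed-forward layers of a toy Transformer-style block while leaving attention in fp32, one simple form of mixed-precision inference. The block, its dimensions, and the use of PyTorch's `torch.ao.quantization.quantize_dynamic` are illustrative assumptions, not the method of any paper listed here.

```python
import torch
import torch.nn as nn

class ToyBlock(nn.Module):
    """A toy Transformer-style encoder block (illustrative dimensions only)."""
    def __init__(self, d_model=128, d_ff=512, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        attn_out, _ = self.attn(x, x, x, need_weights=False)
        x = self.norm1(x + attn_out)
        return self.norm2(x + self.ff(x))

model = ToyBlock().eval()

# Quantize only the nn.Linear layers (the feed-forward sublayer) to int8;
# attention and layer norms stay in fp32 -- a simple mixed-precision setup.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(2, 16, 128)  # (batch, sequence length, d_model)
with torch.no_grad():
    out = quantized(x)
print(out.shape)  # torch.Size([2, 16, 128])
```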
Papers
End-to-end Piano Performance-MIDI to Score Conversion with Transformers
Tim Beyer, Angela Dai
AI Foundation Model for Heliophysics: Applications, Design, and Implementation
Sujit Roy, Talwinder Singh, Marcus Freitag, Johannes Schmude, Rohit Lal, Dinesha Hegde, Soumya Ranjan, Amy Lin, Vishal Gaur, Etienne Eben Vos, Rinki Ghosal, Badri Narayana Patro, Berkay Aydin, Nikolai Pogorelov, Juan Bernabe Moreno, Manil Maskey, Rahul Ramachandran
KnowFormer: Revisiting Transformers for Knowledge Graph Reasoning
Junnan Liu, Qianren Mao, Weifeng Jiang, Jianxin Li
Introducing the Large Medical Model: State of the art healthcare cost and risk prediction with transformers trained on patient event sequences
Ricky Sahu, Eric Marriott, Ethan Siegel, David Wagner, Flore Uzan, Troy Yang, Asim Javed