Transformer Megatron Decepticons
Transformer models are being extensively investigated for various sequence processing tasks, moving beyond natural language processing to encompass time series forecasting, image recognition, and scientific computing applications like solving partial differential equations. Current research focuses on improving efficiency (e.g., through mixed-precision quantization and optimized architectures), enhancing generalization capabilities (particularly to longer sequences), and understanding the underlying mechanisms of in-context learning. These advancements have significant implications for diverse fields, improving the accuracy and efficiency of numerous applications while simultaneously deepening our theoretical understanding of these powerful models.
Papers
Tabular data generation with tensor contraction layers and transformers
Aníbal Silva, André Restivo, Moisés Santos, Carlos Soares
Transformers Can Navigate Mazes With Multi-Step Prediction
Niklas Nolte, Ouail Kitouni, Adina Williams, Mike Rabbat, Mark Ibrahim
Megatron: Evasive Clean-Label Backdoor Attacks against Vision Transformer
Xueluan Gong, Bowei Tian, Meng Xue, Shuike Li, Yanjiao Chen, Qian Wang
Transformers Struggle to Learn to Search
Abulhair Saparov, Srushti Pawar, Shreyas Pimpalgaonkar, Nitish Joshi, Richard Yuanzhe Pang, Vishakh Padmakumar, Seyed Mehran Kazemi, Najoung Kim, He He
Multiclass Post-Earthquake Building Assessment Integrating Optical and SAR Satellite Imagery, Ground Motion, and Soil Data with Transformers
Deepank Singh, Vedhus Hoskere, Pietro Milillo
MetaFormer: High-fidelity Metalens Imaging via Aberration Correcting Transformers
Byeonghyeon Lee, Youbin Kim, Yongjae Jo, Hyunsu Kim, Hyemi Park, Yangkyu Kim, Debabrata Mandal, Praneeth Chakravarthula, Inki Kim, Eunbyung Park
Monet: Mixture of Monosemantic Experts for Transformers
Jungwoo Park, Young Jin Ahn, Kee-Eung Kim, Jaewoo Kang
Mapping using Transformers for Volumes -- Network for Super-Resolution with Long-Range Interactions
August Leander Høeg, Sophia W. Bardenfleth, Hans Martin Kjer, Tim B. Dyrby, Vedrana Andersen Dahl, Anders Dahl
MaterialPicker: Multi-Modal Material Generation with Diffusion Transformers
Xiaohe Ma, Valentin Deschaintre, Miloš Hašan, Fujun Luan, Kun Zhou, Hongzhi Wu, Yiwei Hu