Transformer Megatron Decepticons
Transformer models are being extensively investigated for a wide range of sequence-processing tasks, moving beyond natural language processing to time-series forecasting, image recognition, and scientific computing applications such as solving partial differential equations. Current research focuses on improving efficiency (e.g., through mixed-precision quantization and optimized architectures), enhancing generalization (particularly length generalization to longer sequences), and understanding the mechanisms underlying in-context learning. These advances improve the accuracy and efficiency of applications across diverse fields while deepening our theoretical understanding of the models themselves.
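To make the efficiency thread above concrete, here is a minimal sketch of post-training quantization in PyTorch, applied to the position-wise feed-forward block of a transformer layer. True mixed-precision quantization assigns different bit-widths to different layers; this sketch uses PyTorch's uniform int8 dynamic quantization as a simpler stand-in, and the module, dimensions, and names are illustrative rather than drawn from any paper listed below.

```python
import torch
import torch.nn as nn

class FeedForward(nn.Module):
    """Position-wise feed-forward block of a transformer layer
    (sizes are illustrative, not from any listed paper)."""
    def __init__(self, d_model=256, d_ff=1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.ReLU(),
            nn.Linear(d_ff, d_model),
        )

    def forward(self, x):
        return self.net(x)

model = FeedForward().eval()

# Post-training dynamic quantization: weights of the nn.Linear modules
# are stored as int8 (~4x smaller than float32); activations are
# quantized on the fly at inference time.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 32, 256)       # (batch, sequence, d_model)
with torch.no_grad():
    print(quantized(x).shape)     # torch.Size([1, 32, 256])
```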
Papers
VISIT: Visualizing and Interpreting the Semantic Information Flow of Transformers
Shahar Katz, Yonatan Belinkov
Investigating the Role of Feed-Forward Networks in Transformers Using Parallel Attention and Feed-Forward Net Design
Shashank Sonkar, Richard G. Baraniuk
Teaching Probabilistic Logical Reasoning to Transformers
Aliakbar Nafar, Kristen Brent Venable, Parisa Kordjamshidi
TADA: Efficient Task-Agnostic Domain Adaptation for Transformers
Chia-Chien Hung, Lukas Lange, Jannik Strötgen
On Dataset Transferability in Active Learning for Transformers
Fran Jelenić, Josip Jukić, Nina Drobac, Jan Šnajder
Advising OpenMP Parallelization via a Graph-Based Approach with Transformers
Tal Kadosh, Nadav Schneider, Niranjan Hasabnis, Timothy Mattson, Yuval Pinter, Gal Oren
MPI-rical: Data-Driven MPI Distributed Parallelism Assistance with Transformers
Nadav Schneider, Tal Kadosh, Niranjan Hasabnis, Timothy Mattson, Yuval Pinter, Gal Oren
Transformers for CT Reconstruction From Monoplanar and Biplanar Radiographs
Firas Khader, Gustav Müller-Franzes, Tianyu Han, Sven Nebelung, Christiane Kuhl, Johannes Stegmaier, Daniel Truhn
Cascaded Cross-Attention Networks for Data-Efficient Whole-Slide Image Classification Using Transformers
Firas Khader, Jakob Nikolas Kather, Tianyu Han, Sven Nebelung, Christiane Kuhl, Johannes Stegmaier, Daniel Truhn
IUST_NLP at SemEval-2023 Task 10: Explainable Detecting Sexism with Transformers and Task-adaptive Pretraining
Hadiseh Mahmoudi
Poses as Queries: Image-to-LiDAR Map Localization with Transformers
Jinyu Miao, Kun Jiang, Yunlong Wang, Tuopu Wen, Zhongyang Xiao, Zheng Fu, Mengmeng Yang, Maolin Liu, Diange Yang
Vcc: Scaling Transformers to 128K Tokens or More by Prioritizing Important Tokens
Zhanpeng Zeng, Cole Hawkins, Mingyi Hong, Aston Zhang, Nikolaos Pappas, Vikas Singh, Shuai Zheng