Transformer Models
Transformer models are being applied to an increasingly broad range of sequence processing tasks, extending beyond natural language processing to time series forecasting, image recognition, and scientific computing applications such as solving partial differential equations. Current research focuses on improving efficiency (e.g., through mixed-precision quantization and optimized architectures), enhancing generalization, particularly to sequences longer than those seen during training, and understanding the mechanisms behind in-context learning. These advances improve the accuracy and efficiency of applications across diverse fields while deepening our theoretical understanding of the architecture.
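As a concrete illustration of the efficiency theme above, the sketch below applies post-training dynamic quantization to a small Transformer encoder using PyTorch's built-in torch.ao.quantization.quantize_dynamic. This is a minimal, generic example rather than the method of any paper listed here; the model dimensions and layer choices are illustrative assumptions, and published mixed-precision schemes are typically more elaborate than quantizing only the linear layers to int8.

```python
import torch
import torch.nn as nn

# A small Transformer encoder as a stand-in for a real model
# (dimensions are arbitrary, chosen only for the example).
model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    num_layers=2,
)
model.eval()

# Quantize only the nn.Linear modules to int8; everything else
# (attention softmax, layer norms) stays in float32, yielding a
# simple mixed-precision model.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 16, 64)  # (batch, sequence, features)
with torch.no_grad():
    out = quantized(x)
print(out.shape)  # torch.Size([1, 16, 64])
```

Quantizing the large linear projections while leaving the rest in float32 is a common first step, since those projections dominate both parameter count and compute in a transformer layer.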
Papers
Shifted-Windows Transformers for the Detection of Cerebral Aneurysms in Microsurgery
Jinfan Zhou, William Muirhead, Simon C. Williams, Danail Stoyanov, Hani J. Marcus, Evangelos B. Mazomenos
Jump to Conclusions: Short-Cutting Transformers With Linear Transformations
Alexander Yom Din, Taelin Karidi, Leshem Choshen, Mor Geva
Unifying Top-down and Bottom-up Scanpath Prediction Using Transformers
Zhibo Yang, Sounak Mondal, Seoyoung Ahn, Ruoyu Xue, Gregory Zelinsky, Minh Hoai, Dimitris Samaras
Multimodal Feature Extraction and Fusion for Emotional Reaction Intensity Estimation and Expression Classification in Videos with Transformers
Jia Li, Yin Chen, Xuesong Zhang, Jiantao Nie, Ziqiang Li, Yangchen Yu, Yan Zhang, Richang Hong, Meng Wang
SeqCo-DETR: Sequence Consistency Training for Self-Supervised Object Detection with Transformers
Guoqiang Jin, Fan Yang, Mingshan Sun, Ruyi Zhao, Yakun Liu, Wei Li, Tianpeng Bao, Liwei Wu, Xingyu Zeng, Rui Zhao
Attention-likelihood relationship in transformers
Valeria Ruscio, Valentino Maiorca, Fabrizio Silvestri