Transformer-Based
Transformer-based models use self-attention to capture long-range dependencies in sequential data, and they now achieve state-of-the-art results in tasks ranging from natural language processing and image recognition to time-series forecasting and robotic control. Current research focuses on improving efficiency (e.g., through quantization and optimized architectures), strengthening generalization, and addressing challenges such as long input sequences and endogeneity. These advances are shaping work across scientific communities and practical applications, yielding more accurate, efficient, and robust models in numerous domains.
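For reference, the following is a minimal sketch of the scaled dot-product self-attention computation that the summary above refers to. It is illustrative only: the single-head setup, variable names, and dimensions are assumptions and do not correspond to any specific paper listed below.

```python
# Minimal single-head self-attention sketch (illustrative; not taken from any paper below).
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model); w_q / w_k / w_v: (d_model, d_head)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v             # project tokens to queries, keys, values
    scores = q @ k.T / np.sqrt(k.shape[-1])          # pairwise similarities, scaled by sqrt(d_head)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the sequence dimension
    return weights @ v                               # each output position mixes information from all positions

rng = np.random.default_rng(0)
seq_len, d_model, d_head = 8, 16, 4                  # assumed toy dimensions
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (8, 4)
```

Because every position attends to every other position, the attention weights can link distant tokens directly, which is the long-range dependency modeling mentioned above; the quadratic cost of the `scores` matrix in sequence length is also why efficiency and long-sequence handling remain active research topics.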
Papers
Towards Opening the Black Box of Neural Machine Translation: Source and Target Interpretations of the Transformer
Javier Ferrando, Gerard I. Gállego, Belen Alastruey, Carlos Escolano, Marta R. Costa-jussà
Simple Recurrence Improves Masked Language Models
Tao Lei, Ran Tian, Jasmijn Bastings, Ankur P. Parikh
The Diminishing Returns of Masked Language Models to Science
Zhi Hong, Aswathy Ajith, Gregory Pauloski, Eamon Duede, Kyle Chard, Ian Foster
SelfReformer: Self-Refined Network with Transformer for Salient Object Detection
Yi Ke Yun, Weisi Lin
Downstream Transformer Generation of Question-Answer Pairs with Preprocessing and Postprocessing Pipelines
Cheng Zhang, Hao Zhang, Jie Wang
Transkimmer: Transformer Learns to Layer-wise Skim
Yue Guan, Zhengyi Li, Jingwen Leng, Zhouhan Lin, Minyi Guo
Video Frame Interpolation with Transformer
Liying Lu, Ruizheng Wu, Huaijia Lin, Jiangbo Lu, Jiaya Jia