Transformer-Based
Transformer-based models are reshaping many fields by using self-attention to capture long-range dependencies in sequential data, achieving state-of-the-art results in tasks ranging from natural language processing and image recognition to time-series forecasting and robotic control. Current research focuses on improving efficiency (for example, through quantization and optimized architectures), strengthening generalization, and addressing challenges such as long-sequence handling and endogeneity. These advances are yielding more accurate, efficient, and robust models across numerous scientific and practical domains.
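To make the self-attention mechanism mentioned above concrete, here is a minimal single-head scaled dot-product attention sketch in PyTorch. It is purely illustrative: the function name, weight shapes, and dimensions are assumptions for the example and are not drawn from any of the papers listed below.

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """Minimal single-head scaled dot-product self-attention.

    x: (batch, seq_len, d_model) input sequence
    w_q, w_k, w_v: (d_model, d_k) projection matrices (illustrative)
    """
    q = x @ w_q  # queries: (batch, seq_len, d_k)
    k = x @ w_k  # keys:    (batch, seq_len, d_k)
    v = x @ w_v  # values:  (batch, seq_len, d_k)
    d_k = q.size(-1)
    # Every position attends to every other position in one step,
    # which is how transformers capture long-range dependencies.
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5  # (batch, seq, seq)
    weights = F.softmax(scores, dim=-1)            # attention weights
    return weights @ v                             # (batch, seq_len, d_k)

# Usage: a batch of 2 sequences of length 16 with model width 32.
x = torch.randn(2, 16, 32)
w = [torch.randn(32, 32) for _ in range(3)]
out = self_attention(x, *w)  # shape: (2, 16, 32)
```

On the efficiency side referenced in the summary, one widely used post-training step is dynamic quantization of a model's linear layers, e.g. via PyTorch's torch.ao.quantization.quantize_dynamic; the specific methods studied in the papers below may differ.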
Papers
Enhancing Transformer-Based Segmentation for Breast Cancer Diagnosis using Auto-Augmentation and Search Optimisation Techniques
Leon Hamnett, Mary Adewunmi, Modinat Abayomi, Kayode Raheem, Fahad Ahmed
Partially Randomizing Transformer Weights for Dialogue Response Diversity
Jing Yang Lee, Kong Aik Lee, Woon-Seng Gan
ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy
Kirill Vishniakov, Zhiqiang Shen, Zhuang Liu
Improved Dense Nested Attention Network Based on Transformer for Infrared Small Target Detection
Chun Bao, Jie Cao, Yaqian Ning, Tianhua Zhao, Zhijun Li, Zechen Wang, Li Zhang, Qun Hao
Comparing Generalization in Learning with Limited Numbers of Exemplars: Transformer vs. RNN in Attractor Dynamics
Rui Fukushima, Jun Tani
High-resolution power equipment recognition based on improved self-attention
Siyi Zhang, Cheng Liu, Xiang Li, Xin Zhai, Zhen Wei, Sizhe Li, Xun Ma
GQKVA: Efficient Pre-training of Transformers by Grouping Queries, Keys, and Values
Farnoosh Javadi, Walid Ahmed, Habib Hajimolahoseini, Foozhan Ataiefard, Mohammad Hassanpour, Saina Asani, Austin Wen, Omar Mohamed Awad, Kangling Liu, Yang Liu