Transformer-Based
Transformer-based models leverage self-attention to capture long-range dependencies in sequential data, and they now achieve state-of-the-art results in tasks ranging from natural language processing and image recognition to time series forecasting and robotic control. Current research focuses on improving efficiency (e.g., through quantization and optimized architectures), strengthening generalization, and addressing challenges such as handling long sequences and endogeneity. These advances are shaping diverse scientific communities and practical applications, yielding more accurate, efficient, and robust models across numerous domains.
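As a rough illustration of the self-attention mechanism referred to above, the sketch below computes scaled dot-product self-attention for a single toy sequence. The sequence length, dimensions, and variable names are illustrative assumptions, not drawn from any of the papers listed below; real transformer layers add multiple heads, masking, residual connections, and learned parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the chosen axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention for one sequence (illustrative sketch).

    X:           (seq_len, d_model) input token embeddings
    Wq, Wk, Wv:  (d_model, d_head) projection matrices
    Returns:     (seq_len, d_head) context vectors.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # pairwise similarity between every pair of positions
    weights = softmax(scores, axis=-1)        # each position attends over the whole sequence
    return weights @ V                        # weighted sum: long-range mixing in a single step

# Toy usage: 5 tokens, model width 8, head width 4 (sizes chosen only for the example).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)   # -> (5, 4)
```

Because every position attends directly to every other position, information can flow between distant tokens in one layer, which is the property the overview describes as capturing long-range dependencies.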
Papers
TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement
Kohei Saijo, Gordon Wichern, François G. Germain, Zexu Pan, Jonathan Le Roux
Learning to Learn without Forgetting using Attention
Anna Vettoruzzo, Joaquin Vanschoren, Mohamed-Rafik Bouguelia, Thorsteinn Rögnvaldsson
Transformer-based Capacity Prediction for Lithium-ion Batteries with Data Augmentation
Gift Modekwe, Saif Al-Wahaibi, Qiugang Lu
Estimating Probability Densities with Transformer and Denoising Diffusion
Henry W. Leung, Jo Bovy, Joshua S. Speagle
Dissecting Multiplication in Transformers: Insights into LLMs
Luyu Qiu, Jianing Li, Chi Su, Chen Jason Zhang, Lei Chen