Transformer-Based Models
Transformer-based models are a class of neural networks that achieve state-of-the-art results across diverse fields by using self-attention to capture long-range dependencies in sequential data. Current research focuses on their limitations, most notably the quadratic computational cost of self-attention on long sequences, which has motivated alternative architectures such as Mamba and parameter-efficient adaptation methods such as LoRA. These advances are improving both accuracy and efficiency across applications ranging from speech recognition and natural language processing to computer vision and time-series forecasting, including deployment on resource-constrained devices.
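To make the self-attention mechanism and its quadratic cost concrete, here is a minimal sketch of single-head scaled dot-product self-attention in PyTorch. The module name and dimensions are illustrative assumptions, not drawn from any of the papers listed below.

```python
import math
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """Minimal single-head scaled dot-product self-attention.

    Materializing the (seq_len x seq_len) attention matrix is what
    gives transformers their quadratic cost in sequence length.
    """

    def __init__(self, d_model: int):
        super().__init__()
        self.d_model = d_model
        # Learned projections for queries, keys, and values.
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        # Pairwise similarity between every position pair: O(seq_len^2).
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_model)
        weights = torch.softmax(scores, dim=-1)
        return weights @ v  # (batch, seq_len, d_model)

x = torch.randn(2, 16, 64)   # batch of 2 sequences, length 16
out = SelfAttention(64)(x)
print(out.shape)             # torch.Size([2, 16, 64])
```

The `scores` matrix grows with the square of sequence length, which is the bottleneck the research summarized above targets: architectures like Mamba replace attention with sequence operations that scale closer to linearly, while LoRA leaves the pretrained weights frozen and adapts models by learning small low-rank update matrices.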
Papers
ferret: a Framework for Benchmarking Explainers on Transformers
Giuseppe Attanasio, Eliana Pastor, Chiara Di Bonaventura, Debora Nozza
A Comparative Study on COVID-19 Fake News Detection Using Different Transformer Based Models
Sajib Kumar Saha Joy, Dibyo Fabian Dofadar, Riyo Hayat Khan, Md. Sabbir Ahmed, Rafeed Rahman
Sparse Mixture-of-Experts are Domain Generalizable Learners
Bo Li, Yifei Shen, Jingkang Yang, Yezhen Wang, Jiawei Ren, Tong Che, Jun Zhang, Ziwei Liu
Stabilizing Voltage in Power Distribution Networks via Multi-Agent Reinforcement Learning with Transformer
Minrui Wang, Mingxiao Feng, Wengang Zhou, Houqiang Li