Transformer-Based Models
Transformer-based models are a class of neural networks that achieve state-of-the-art results across diverse fields by using self-attention to capture long-range dependencies in sequential data. Current research focuses on limitations such as the quadratic computational cost of attention on long sequences, motivating alternative architectures like Mamba and techniques such as LoRA for efficient adaptation and inference. These advances improve both accuracy and efficiency, including on resource-constrained devices, across applications ranging from speech recognition and natural language processing to computer vision and time-series forecasting.
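The two mechanisms named above can be illustrated in a few lines of code. The sketch below, in plain NumPy, shows scaled dot-product self-attention, whose n-by-n score matrix is the source of the quadratic cost in sequence length, and a LoRA-style low-rank update to a frozen weight matrix. All names, shapes, and initializations here are illustrative assumptions, not taken from any of the papers listed below.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence x of shape (n, d).

    The (n, n) score matrix built here is what makes vanilla attention
    quadratic in sequence length n, the limitation discussed above.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v              # project tokens to queries/keys/values
    scores = q @ k.T / np.sqrt(k.shape[-1])          # (n, n) pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over all key positions
    return weights @ v                               # each output mixes every position

def lora_linear(x, w_frozen, lora_a, lora_b, scale=1.0):
    """LoRA-style adaptation sketch: w_frozen stays fixed; only the low-rank
    factors lora_a (r, d) and lora_b (d, r) would be trained."""
    return x @ (w_frozen + scale * (lora_b @ lora_a))

# Toy usage with made-up dimensions.
rng = np.random.default_rng(0)
n, d, r = 8, 16, 2                                   # sequence length, model dim, LoRA rank
x = rng.normal(size=(n, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)        # (8, 16): one context-mixed vector per token

w_frozen = rng.normal(size=(d, d))
lora_a = rng.normal(size=(r, d)) * 0.01
lora_b = np.zeros((d, r))                            # zero init so the update starts as a no-op
print(lora_linear(x, w_frozen, lora_a, lora_b).shape)  # (8, 16)
```

In this toy setup the attention step allocates an n-by-n matrix, which is why long-sequence work (e.g. the forecasting papers below) looks for cheaper alternatives, while the LoRA sketch trains only 2·d·r parameters instead of d·d for the adapted layer.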
Papers
Advancing Italian Biomedical Information Extraction with Transformers-based Models: Methodological Insights and Multicenter Practical Application
Claudio Crema, Tommaso Mario Buonocore, Silvia Fostinelli, Enea Parimbelli, Federico Verde, Cira Fundarò, Marina Manera, Matteo Cotta Ramusino, Marco Capelli, Alfredo Costa, Giuliano Binetti, Riccardo Bellazzi, Alberto Redolfi
Extensive Evaluation of Transformer-based Architectures for Adverse Drug Events Extraction
Simone Scaboro, Beatrice Portelli, Emmanuele Chersoni, Enrico Santus, Giuseppe Serra
Does Long-Term Series Forecasting Need Complex Attention and Extra Long Inputs?
Daojun Liang, Haixia Zhang, Dongfeng Yuan, Xiaoyan Ma, Dongyang Li, Minggao Zhang
Calibration of Transformer-based Models for Identifying Stress and Depression in Social Media
Loukas Ilias, Spiros Mouzakitis, Dimitris Askounis
Improving Position Encoding of Transformers for Multivariate Time Series Classification
Navid Mohammadi Foumani, Chang Wei Tan, Geoffrey I. Webb, Mahsa Salehi