Transformer Network
Transformer networks are a class of deep learning models that process sequential data with self-attention mechanisms, allowing them to capture long-range dependencies. Current research focuses on making transformer architectures more efficient and better at generalizing, for example through sparse connections, pruning techniques, and specialized hardware acceleration, and on adapting them to applications beyond natural language processing such as image analysis, time-series prediction, and signal processing. This versatility makes transformers a powerful tool across many scientific and practical domains, driving advances in areas ranging from medical image analysis to autonomous driving and energy management.
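To make the self-attention mechanism mentioned above concrete, the following is a minimal NumPy sketch of scaled dot-product attention, the core operation of a transformer layer. It is an illustrative toy example, not code from any of the papers listed below; the function name, dimensions, and random projections are assumptions chosen for clarity.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.

    Q, K, V have shape (sequence_length, d_k). Each output position is a
    weighted sum over all input positions, which is how a single attention
    layer can relate tokens that are far apart in the sequence.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                     # pairwise similarity between positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over key positions
    return weights @ V                                  # mix value vectors by attention weight

# Toy usage: a sequence of 4 tokens with 8-dimensional embeddings (hypothetical sizes).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = scaled_dot_product_attention(x @ W_q, x @ W_k, x @ W_v)
print(out.shape)  # (4, 8): one contextualized vector per token
```

A full transformer stacks many such attention heads with feed-forward layers, residual connections, and normalization, but the dependency-capturing behavior comes from this attention step.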
Papers
Enhancing Crop Segmentation in Satellite Image Time Series with Transformer Networks
Ignazio Gallo, Mattia Gatti, Nicola Landro, Christian Loschiavo, Mirco Boschetti, Riccardo La Grassa
Command-line Risk Classification using Transformer-based Neural Architectures
Paolo Notaro, Soroush Haeri, Jorge Cardoso, Michael Gerndt
Anomaly Resilient Temporal QoS Prediction using Hypergraph Convoluted Transformer Network
Suraj Kumar, Soumi Chattopadhyay, Chandranath Adak
Mechanisms of Symbol Processing for In-Context Learning in Transformer Networks
Paul Smolensky, Roland Fernandez, Zhenghao Herbert Zhou, Mattia Opper, Jianfeng Gao