Paper ID: 2405.12462

Boosting X-formers with Structured Matrix for Long Sequence Time Series Forecasting

Zhicheng Zhang, Yong Wang, Shaoqi Tan, Bowei Xia, Yujie Luo

Transformer-based models have gained significant attention for long sequence time series forecasting due to their exceptional forecasting precision. However, the self-attention mechanism poses a computational-efficiency challenge because its time complexity is quadratic in the sequence length. To address this issue, we propose a novel architectural framework that enhances Transformer models through the integration of Surrogate Attention Blocks (SAB) and Surrogate Feed-Forward Neural Network Blocks (SFB). These blocks replace the self-attention and feed-forward layers with structured matrices that reduce both time and space complexity while preserving the expressive power of the original self-attention mechanism and feed-forward network. We formally demonstrate the equivalence of this substitution. Extensive experiments with nine Transformer variants across five distinct time series tasks show an average performance improvement of 9.45% alongside a 46% reduction in model size. These results confirm the efficacy of our surrogate-based approach in maintaining prediction accuracy while significantly improving computational efficiency.
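The abstract does not specify which structured-matrix parameterization the surrogate blocks use, so the sketch below is only a rough illustration of the general idea: a dense feed-forward layer is swapped for a low-rank factorization, one common structured-matrix choice. The class names (LowRankLinear, SurrogateFeedForward), the rank value, and the residual layout are hypothetical and not taken from the paper.

```python
# Minimal sketch (not the authors' code): replace a dense FFN weight with a
# structured (here, low-rank) factorization to cut parameters and FLOPs.
import torch
import torch.nn as nn


class LowRankLinear(nn.Module):
    """Dense d_in x d_out weight approximated by a rank-r factorization V(U(x))."""

    def __init__(self, d_in: int, d_out: int, rank: int):
        super().__init__()
        self.U = nn.Linear(d_in, rank, bias=False)   # d_in -> r
        self.V = nn.Linear(rank, d_out, bias=True)   # r -> d_out

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Cost per token is O(r * (d_in + d_out)) instead of O(d_in * d_out).
        return self.V(self.U(x))


class SurrogateFeedForward(nn.Module):
    """Hypothetical stand-in for a Transformer FFN block built from structured layers."""

    def __init__(self, d_model: int = 512, d_ff: int = 2048, rank: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            LowRankLinear(d_model, d_ff, rank),
            nn.GELU(),
            LowRankLinear(d_ff, d_model, rank),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection, as in a standard Transformer block.
        return x + self.net(x)


if __name__ == "__main__":
    x = torch.randn(8, 96, 512)          # (batch, sequence length, d_model)
    block = SurrogateFeedForward()
    print(block(x).shape)                # torch.Size([8, 96, 512])
```

Under these assumed sizes, a dense FFN would hold roughly 2 * d_model * d_ff ≈ 2.1M weights, while the low-rank version holds about 2 * rank * (d_model + d_ff) ≈ 0.33M, which illustrates how a structured substitute can shrink the model; the paper's actual SAB/SFB construction and its claimed equivalence guarantees are defined in the full text.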

Submitted: May 21, 2024