Paper ID: 2311.06916
TSViT: A Time Series Vision Transformer for Fault Diagnosis
Shouhua Zhang, Jiehan Zhou, Xue Ma, Susanna Pirttikangas, Chunsheng Yang
Traditional fault diagnosis methods using Convolutional Neural Networks (CNNs) often struggle with capturing the temporal dynamics of vibration signals. To overcome this, the application of Transformer-based Vision Transformer (ViT) methods to fault diagnosis is gaining attraction. Nonetheless, these methods typically require extensive preprocessing, which increases computational complexity, potentially reducing the efficiency of the diagnosis process. Addressing this gap, this paper presents the Time Series Vision Transformer (TSViT), tailored for effective fault diagnosis. TSViT incorporates a convolutional layer to extract local features from vibration signals, alongside a transformer encoder to discern long-term temporal patterns. A thorough experimental comparison on three diverse datasets demonstrates TSViT's effectiveness and adaptability. Moreover, the paper delves into the influence of hyperparameter tuning on the model's performance, computational demand, and parameter count. Remarkably, TSViT achieves an unprecedented 100% average accuracy on two test sets and 99.99% on another, showcasing its exceptional diagnostic capabilities.
Submitted: Nov 12, 2023