Paper ID: 2404.13270

StrideNET: Swin Transformer for Terrain Recognition with Dynamic Roughness Extraction

Maitreya Shelare, Neha Shigvan, Atharva Satam, Poonam Sonar

The field of remote-sensing image classification has seen immense progress with the rise of convolutional neural networks, and more recently, through vision transformers. These models, with their self-attention mechanism, can effectively capture global relationships and long-range dependencies between the image patches, in contrast with traditional convolutional models. This paper introduces StrideNET, a dual-branch transformer-based model developed for terrain recognition and surface roughness extraction. The terrain recognition branch employs the Swin Transformer to classify varied terrains by leveraging its capability to capture both local and global features. Complementing this, the roughness extraction branch utilizes a statistical texture-feature analysis technique to dynamically extract important land surface properties such as roughness and slipperiness. The model was trained on a custom dataset consisting of four terrain classes - grassy, marshy, sandy, and rocky, and it outperforms benchmark CNN and transformer based models, by achieving an average test accuracy of over 99 % across all classes. The applications of this work extend to different domains such as environmental monitoring, land use and cover classification, disaster response and precision agriculture.

Submitted: Apr 20, 2024