Vision Mamba
Vision Mamba, a family of models based on state space models (SSMs), aims to overcome the limitations of convolutional neural networks (CNNs) and transformers in computer vision tasks. Current research focuses on enhancing Vision Mamba architectures through techniques such as cross-layer token fusion, sparse connections, and stochastic regularization to improve training efficiency and scalability across applications including image classification, segmentation, and object detection. The linear computational complexity of Vision Mamba offers a significant advantage over the quadratic cost of transformer attention, particularly for high-resolution images and long sequences, making it a promising alternative for resource-constrained environments and large-scale datasets. Its success across diverse applications, from medical imaging to remote sensing, highlights its potential impact on a range of scientific fields and practical deployments.
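The linear-complexity claim above comes from the SSM recurrence at the core of Mamba-style layers: the output at each step depends only on a fixed-size hidden state, so a full sequence is processed in a single O(L) scan rather than the O(L²) pairwise comparisons of attention. The sketch below illustrates that recurrence with a minimal, non-selective linear SSM; the matrices and function name are illustrative, not the actual Mamba selective-scan implementation.

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Linear-time scan of a (simplified, non-selective) state space model:
        h_t = A @ h_{t-1} + B @ x_t
        y_t = C @ h_t
    One pass over the sequence costs O(L) in sequence length L,
    versus O(L^2) for transformer self-attention.
    """
    d_state = A.shape[0]
    h = np.zeros(d_state)          # fixed-size hidden state
    ys = []
    for x_t in x:                  # single left-to-right pass: O(L)
        h = A @ h + B @ x_t        # state update
        ys.append(C @ h)           # readout
    return np.array(ys)

# Toy example: scalar inputs, 2-dimensional hidden state (illustrative values).
A = np.array([[0.9, 0.0],
              [0.0, 0.5]])        # state transition
B = np.array([[1.0],
              [1.0]])             # input projection
C = np.array([[1.0, 1.0]])        # output projection
x = np.ones((4, 1))               # sequence of length 4
y = ssm_scan(x, A, B, C)          # shape (4, 1), computed in one O(L) scan
```

In the real architecture, A, B, and C are input-dependent ("selective") and the scan is parallelized on hardware, but the per-step cost stays constant, which is what makes long sequences and high-resolution image patches tractable.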
Papers
SliceMamba with Neural Architecture Search for Medical Image Segmentation
Chao Fan, Hongyuan Yu, Yan Huang, Liang Wang, Zhenghan Yang, Xibin Jia
GraphMamba: An Efficient Graph Structure Learning Vision Mamba for Hyperspectral Image Classification
Aitao Yang, Min Li, Yao Ding, Leyuan Fang, Yaoming Cai, Yujie He