Mamba in Mamba
Mamba, a recently proposed state-space model, is being explored as an efficient alternative to Transformers for sequence modeling: its selective state-space recurrence scales linearly with sequence length, whereas self-attention scales quadratically. Current research focuses on adapting Mamba architectures to diverse applications, including computer vision, natural language processing, and signal processing, often comparing their performance and efficiency against established methods such as Transformers and CNNs. This work aims to improve the speed and scalability of deep learning models while matching or exceeding the performance of existing approaches, with implications for resource-constrained settings and large-scale deployments. The potential impact spans numerous fields, from medical image analysis and autonomous driving to personalized recommendation and drug discovery.
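To make the linear-scaling claim concrete, the sketch below shows a generic discretized state-space recurrence processed in a single pass over the sequence. It is only an illustration: Mamba's actual selective scan makes the A, B, C parameters input-dependent and uses a hardware-aware parallel scan, and all names and dimensions here are hypothetical.

import numpy as np

def ssm_scan(x, A, B, C):
    """Minimal discretized state-space recurrence (illustrative, not Mamba's implementation).

    x: (T, d_in) input sequence
    A: (d_state, d_state) state transition matrix
    B: (d_state, d_in) input projection
    C: (d_out, d_state) output projection
    Returns y: (T, d_out)
    """
    h = np.zeros(A.shape[0])          # hidden state carried across time steps
    ys = []
    for t in range(x.shape[0]):       # one pass over the sequence: O(T) in sequence length
        h = A @ h + B @ x[t]          # update state from previous state and current input
        ys.append(C @ h)              # read out the current output
    return np.stack(ys)

# toy usage with hypothetical dimensions
rng = np.random.default_rng(0)
T, d_in, d_state, d_out = 8, 4, 16, 4
y = ssm_scan(rng.normal(size=(T, d_in)),
             0.9 * np.eye(d_state),                 # stable toy transition matrix
             rng.normal(size=(d_state, d_in)) * 0.1,
             rng.normal(size=(d_out, d_state)) * 0.1)
print(y.shape)  # (8, 4)

Because each step touches only the fixed-size hidden state h, both compute and memory grow linearly with sequence length T, in contrast to the quadratic cost of full self-attention.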
Papers
WILT: A Multi-Turn, Memorization-Robust Inductive Logic Benchmark for LLMs
Eryk Banatt, Jonathan Cheng, Skanda Vaidyanath, Tiffany Hwu
V2M: Visual 2-Dimensional Mamba for Image Representation Learning
Chengkun Wang, Wenzhao Zheng, Yuanhui Huang, Jie Zhou, Jiwen Lu
Hi-Mamba: Hierarchical Mamba for Efficient Image Super-Resolution
Junbo Qiao, Jincheng Liao, Wei Li, Yulun Zhang, Yong Guo, Yi Wen, Zhangxizi Qiu, Jiao Xie, Jie Hu, Shaohui Lin
Crafting Narrative Closures: Zero-Shot Learning with SSM Mamba for Short Story Ending Generation
Divyam Sharma, Divya Santhanam
Can Mamba Always Enjoy the "Free Lunch"?
Ruifeng Ren, Zhicong Li, Yong Liu
Mamba in Vision: A Comprehensive Survey of Techniques and Applications
Md Maklachur Rahman, Abdullah Aman Tutul, Ankur Nath, Lamyanba Laishram, Soon Ki Jung, Tracy Hammond