Promptable Segmentation
Promptable segmentation segments images and videos in response to prompts such as points, boxes, or text descriptions, which tell a model which object or region of interest to isolate. Current research relies heavily on transformer-based architectures, often adapting and extending the Segment Anything Model (SAM) and its successor, SAM 2, for applications ranging from remote sensing and medical imaging to 3D point cloud segmentation and video object tracking. The approach offers advantages in efficiency and generalization, particularly where labeled data is limited, and enables faster and more accurate image analysis across these fields.
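To make the idea of a prompt concrete, here is a minimal toy sketch of the input/output contract: an image plus a point and/or box prompt goes in, a binary mask comes out. It is not SAM or SAM 2. The `Prompt` class and `segment` function are hypothetical names introduced only for illustration, and simple intensity-based region growing stands in for the learned transformer model.

```python
from collections import deque
from dataclasses import dataclass
from typing import List, Optional, Tuple

Image = List[List[int]]   # grayscale image, row-major
Mask = List[List[bool]]   # True where the pixel belongs to the object


@dataclass
class Prompt:
    """Hypothetical prompt container (not part of any SAM API)."""
    point: Optional[Tuple[int, int]] = None           # foreground click (row, col)
    box: Optional[Tuple[int, int, int, int]] = None   # (r0, c0, r1, c1), inclusive


def segment(image: Image, prompt: Prompt, tol: int = 10) -> Mask:
    """Toy stand-in for a promptable model.

    Grows a region from the clicked pixel over 4-connected neighbours whose
    intensity is within `tol` of the click, optionally restricted to the box.
    Assumes the box, if given, lies inside the image.
    """
    h, w = len(image), len(image[0])
    r0, c0, r1, c1 = prompt.box if prompt.box else (0, 0, h - 1, w - 1)
    mask = [[False] * w for _ in range(h)]

    if prompt.point is None:
        # Box-only prompt: this toy simply selects everything inside the box.
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                mask[r][c] = True
        return mask

    sr, sc = prompt.point
    if not (r0 <= sr <= r1 and c0 <= sc <= c1):
        return mask  # click falls outside the box: return an empty mask

    seed = image[sr][sc]
    mask[sr][sc] = True
    queue = deque([(sr, sc)])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = r0 <= nr <= r1 and c0 <= nc <= c1
            if inside and not mask[nr][nc] and abs(image[nr][nc] - seed) <= tol:
                mask[nr][nc] = True
                queue.append((nr, nc))
    return mask


if __name__ == "__main__":
    image = [
        [0, 0, 0, 0],
        [0, 200, 200, 0],
        [0, 200, 200, 0],
        [0, 0, 0, 0],
    ]
    # The same image yields different masks depending on the prompt.
    for row in segment(image, Prompt(point=(1, 1))):
        print("".join("#" if v else "." for v in row))
```

The point of the sketch is only that the prompt, not retraining, selects which region is returned: clicking the bright square and clicking the background produce different masks from the same image, and adding a box narrows the result further.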
Papers
Medical SAM 2: Segment medical images as video via Segment Anything Model 2
Jiayuan Zhu, Abdullah Hamdi, Yunli Qi, Yueming Jin, Junde Wu
SAM 2: Segment Anything in Images and Videos
Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu, Ronghang Hu, Chaitanya Ryali, Tengyu Ma, Haitham Khedr, Roman Rädle, Chloe Rolland, Laura Gustafson, Eric Mintun, Junting Pan, Kalyan Vasudev Alwala, Nicolas Carion, Chao-Yuan Wu, Ross Girshick, Piotr Dollár, Christoph Feichtenhofer