Versatile Image Segmentation

Versatile image segmentation aims to build models that can segment objects in images across diverse domains and conditions, without being limited to a pre-defined set of classes. Current research focuses on extending existing models such as the Segment Anything Model (SAM) through techniques that include composable prompts, integration with vision-language models such as CLIP for zero-shot capabilities, and the use of SAM as a strong encoder within U-Net architectures. These advances improve robustness to image perturbations and enable semantic, instance, and panoptic segmentation, benefiting fields that depend on accurate image analysis, such as medical imaging and autonomous systems.
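
To make the zero-shot idea concrete, the following is a minimal sketch of one common way to pair class-agnostic masks with text labels: each mask region is embedded, and it receives the label whose text embedding is most similar under cosine similarity. The function names (`cosine`, `label_masks`) and the three-dimensional toy vectors are hypothetical stand-ins. In a real pipeline the masks would come from a model such as SAM and the embeddings from a model such as CLIP, and this sketch does not call either library.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def label_masks(mask_embeddings, text_embeddings):
    """Assign each class-agnostic mask the label with the most similar text embedding.

    mask_embeddings: list of vectors, one per mask (assumed to come from an image encoder).
    text_embeddings: dict mapping label name -> vector (assumed to come from a text encoder).
    """
    labels = []
    for emb in mask_embeddings:
        best = max(text_embeddings, key=lambda name: cosine(emb, text_embeddings[name]))
        labels.append(best)
    return labels


# Toy stand-ins for real embeddings (hypothetical values, for illustration only).
text_embeddings = {
    "cat": [1.0, 0.0, 0.0],
    "car": [0.0, 1.0, 0.0],
    "tree": [0.0, 0.0, 1.0],
}
mask_embeddings = [
    [0.9, 0.1, 0.0],   # closest to "cat"
    [0.1, 0.2, 0.95],  # closest to "tree"
]

print(label_masks(mask_embeddings, text_embeddings))  # ['cat', 'tree']
```

Because the label set is just a dictionary of text embeddings, new classes can be added at inference time without retraining the mask generator, which is what makes this style of labeling zero-shot.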

Papers