Paper ID: 2403.16578
SegICL: A Multimodal In-context Learning Framework for Enhanced Segmentation in Medical Imaging
Lingdong Shen, Fangxin Shang, Xiaoshuang Huang, Yehui Yang, Haifeng Huang, Shiming Xiang
In medical image segmentation, tackling Out-of-Distribution (OOD) segmentation tasks in a cost-effective manner remains a significant challenge. Universal segmentation models, which aim to generalize across diverse medical imaging modalities, are one solution, yet their effectiveness often diminishes when applied to OOD modalities and tasks, and they require intricate fine-tuning for optimal performance. Few-shot segmentation methods are typically designed for a specific data modality and cannot be directly transferred to another. Therefore, we introduce SegICL, a novel approach leveraging In-Context Learning (ICL) for image segmentation. Unlike existing methods, SegICL can perform text-guided segmentation and conduct in-context learning with a small set of image-mask pairs, eliminating the need to train the model from scratch or fine-tune it for OOD tasks (including OOD modalities and datasets). Extensive experiments demonstrate a positive correlation between the number of shots and segmentation performance on OOD tasks. Segmentation performance with three shots is approximately 1.5 times better than in the zero-shot setting. This indicates that SegICL effectively addresses new segmentation tasks based on contextual information. Additionally, SegICL exhibits performance comparable to mainstream models on both OOD and in-distribution tasks. Our code will be released after paper review.
Submitted: Mar 25, 2024