Semantic Mask
Semantic masks assign every pixel of an image to a semantic category, and they are central to many computer vision tasks in image understanding and generation. Current research uses semantic masks within architectures such as diffusion models, masked autoencoders, and vision transformers, often combined with contrastive learning and attention mechanisms, to improve open-vocabulary segmentation, few-shot learning, and multi-concept image generation. Applications range from medical image analysis and remote sensing to text-to-image synthesis and robust object detection, particularly where labeled data is limited.
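To make the definition concrete, here is a minimal sketch of a semantic mask as a data structure: a 2D grid holding one class id per pixel. The class names, the 4x5 grid, and the helper functions `binary_mask` and `class_pixel_counts` are hypothetical, chosen only for illustration. They are not taken from any of the papers listed here.

```python
from collections import Counter

# Hypothetical label set for illustration only (an assumption, not from the source).
CLASS_NAMES = {0: "background", 1: "tissue", 2: "tumor"}

# A semantic mask stores exactly one class id per pixel (here a 4x5 image).
mask = [
    [0, 0, 1, 1, 1],
    [0, 1, 1, 2, 2],
    [0, 1, 2, 2, 2],
    [0, 0, 1, 1, 0],
]


def binary_mask(mask, class_id):
    """Return a 0/1 mask of the same shape marking pixels of one class."""
    return [[1 if pixel == class_id else 0 for pixel in row] for row in mask]


def class_pixel_counts(mask):
    """Count how many pixels are assigned to each class id."""
    return Counter(pixel for row in mask for pixel in row)


counts = class_pixel_counts(mask)
tumor = binary_mask(mask, 2)

for class_id, name in CLASS_NAMES.items():
    print(f"{name}: {counts[class_id]} pixels")
```

Splitting the mask into per-class binary masks, as `binary_mask` does, is a common way to feed a single category into a loss function or a conditioning input. Real pipelines typically hold the mask in an array library rather than nested lists.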
Papers
PriorPath: Coarse-To-Fine Approach for Controlled De-Novo Pathology Semantic Masks Generation
Nati Daniel, May Nathan, Eden Azeroual, Yael Fisher, Yonatan Savir
CutS3D: Cutting Semantics in 3D for 2D Unsupervised Instance Segmentation
Leon Sick, Dominik Engel, Sebastian Hartwig, Pedro Hermosilla, Timo Ropinski