Paper ID: 2410.19748

C^2DA: Contrastive and Context-aware Domain Adaptive Semantic Segmentation

Md. Al-Masrur Khan, Zheng Chen, Lantao Liu

Unsupervised domain adaptive semantic segmentation (UDA-SS) aims to train a model on the source domain data (e.g., synthetic) and adapt the model to predict target domain data (e.g., real-world) without accessing target annotation data. Most existing UDA-SS methods only focus on inter-domain knowledge to mitigate the data-shift problem. However, learning the inherent structure of the images and exploring the intrinsic pixel distribution of both domains are ignored, which prevents the UDA-SS methods from producing satisfactory performance like supervised learning. Moreover, incorporating contextual knowledge is also often overlooked. Considering these issues, in this work, we propose a UDA-SS framework that learns both intra-domain and context-aware knowledge. To learn the intra-domain knowledge, we incorporate contrastive loss in both domains, which pulls pixels of similar classes together and pushes the rest away, facilitating intra-image-pixel-wise correlations. To learn context-aware knowledge, we modify the mixing technique by leveraging contextual dependency among the classes. Moreover, we adapt the Mask Image Modeling (MIM) technique to properly use context clues for robust visual recognition, using limited information about the masked images. Comprehensive experiments validate that our proposed method improves the state-of-the-art UDA-SS methods by a margin of 0.51% mIoU and 0.54% mIoU in the adaptation of GTA-V->Cityscapes and Synthia->Cityscapes, respectively. We open-source our C2DA code. Code link: github.com/Masrur02/C-Squared-DA

Submitted: Oct 10, 2024