Paper ID: 2402.14976

Unsupervised Domain Adaptation within Deep Foundation Latent Spaces

Dmitry Kangin, Plamen Angelov

Vision transformer-based foundation models, such as ViT or Dino-V2, aim to solve problems with little or no finetuning of features. Using a prototypical-network setting, we analyse to what extent such foundation models can solve unsupervised domain adaptation without finetuning on either the source or the target domain. Through quantitative analysis, as well as qualitative interpretation of the decision making, we demonstrate that the suggested method can improve upon existing baselines, while also exposing the limitations of such an approach that are yet to be solved.
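As a rough illustration of the setting described above (not the paper's exact pipeline), the sketch below builds class prototypes from frozen foundation-model features of labelled source images and assigns unlabelled target images to the nearest prototype; the `backbone` callable, the feature dimension, and the Euclidean nearest-prototype rule are illustrative assumptions.

```python
import torch

@torch.no_grad()
def class_prototypes(backbone, source_loader, num_classes, dim):
    """Average frozen-backbone features of labelled source samples per class."""
    sums = torch.zeros(num_classes, dim)
    counts = torch.zeros(num_classes)
    for images, labels in source_loader:
        feats = backbone(images)          # frozen ViT / DINOv2-style features, shape (B, dim)
        sums.index_add_(0, labels, feats)
        counts.index_add_(0, labels, torch.ones_like(labels, dtype=torch.float))
    return sums / counts.clamp(min=1).unsqueeze(1)

@torch.no_grad()
def classify(backbone, images, prototypes):
    """Assign each unlabelled target image to its nearest class prototype."""
    feats = backbone(images)
    dists = torch.cdist(feats, prototypes)  # distances in the frozen latent space
    return dists.argmin(dim=1)
```

Because the backbone is never updated, any adaptation in this sketch comes purely from how the frozen latent space separates classes across domains, which is the question the paper investigates.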

Submitted: Feb 22, 2024