Paper ID: 2302.01497

Gradient Estimation for Unseen Domain Risk Minimization with Pre-Trained Models

Byounggyu Lew, Donghyun Son, Buru Chang

Domain generalization aims to build generalized models that perform well on unseen domains when only source domains are available for model optimization. Recent studies have shown that large-scale pre-trained models can enhance domain generalization by leveraging their generalization power. However, these pre-trained models lack target task-specific knowledge yet due to discrepancies between the pre-training objectives and the target task. Although the task-specific knowledge could be learned from source domains by fine-tuning, this hurts the generalization power of pre-trained models due to gradient bias toward the source domains. To alleviate this problem, we propose a new domain generalization method that estimates unobservable gradients that reduce potential risks in unseen domains using a large-scale pre-trained model. These estimated unobservable gradients allow the pre-trained model to learn task-specific knowledge further while preserving its generalization ability by relieving the gradient bias. Our experimental results show that our method outperforms baseline methods on DomainBed, a standard benchmark in domain generalization. We also provide extensive analyses to demonstrate that the pre-trained model can learn task-specific knowledge without sacrificing its generalization power.

Submitted: Feb 3, 2023