Compositional Zero-Shot Learning
Compositional zero-shot learning (CZSL) aims to enable models to recognize novel combinations of known visual primitives (e.g., the attribute "striped" with the object "shirt") using knowledge learned from seen combinations, without requiring training examples for each new composition. Current research heavily leverages large pre-trained vision-language models, often incorporating techniques such as soft prompting, attention mechanisms, and disentangled feature learning to improve generalization to unseen compositions. The field is significant because it addresses a key limitation of traditional zero-shot learning, paving the way for more robust and adaptable AI systems capable of handling real-world visual complexity in applications such as robotics and human-computer interaction.
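The core idea — embedding attribute and object primitives separately, composing them into label embeddings, and matching an image embedding against all compositions (including unseen ones) — can be illustrated with a minimal toy sketch. This is an assumption-laden illustration, not any particular paper's method: real systems would use a pre-trained vision-language model (e.g., CLIP) for the embeddings, whereas here random vectors and a simple additive composition stand in.

```python
import numpy as np

rng = np.random.default_rng(0)

attrs = ["striped", "plain"]
objs = ["shirt", "cup"]

# Toy primitive embeddings; a real CZSL system would obtain these from a
# pre-trained vision-language model rather than random vectors.
attr_emb = {a: rng.normal(size=64) for a in attrs}
obj_emb = {o: rng.normal(size=64) for o in objs}

def compose(attr, obj):
    # Simplest possible composition: sum of primitives, L2-normalized.
    v = attr_emb[attr] + obj_emb[obj]
    return v / np.linalg.norm(v)

# The candidate label space covers ALL attribute-object pairs,
# including compositions never observed during training.
pairs = [(a, o) for a in attrs for o in objs]
label_emb = np.stack([compose(a, o) for a, o in pairs])

def classify(image_emb):
    image_emb = image_emb / np.linalg.norm(image_emb)
    scores = label_emb @ image_emb  # cosine similarity to each composition
    return pairs[int(np.argmax(scores))]

# Simulate an image whose true label is an unseen composition.
img = compose("striped", "cup")
print(classify(img))
```

Because compositions sharing only one primitive have markedly lower cosine similarity than the matching pair, the classifier recovers "striped cup" even though that pair was never among the seen compositions; the papers below refine this basic recipe with prompting, attention, and disentanglement.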
Papers
Compositional Zero-shot Learning via Progressive Language-based Observations
Lin Li, Guikun Chen, Jun Xiao, Long Chen
HOMOE: A Memory-Based and Composition-Aware Framework for Zero-Shot Learning with Hopfield Network and Soft Mixture of Experts
Do Huu Dat, Po Yuan Mao, Tien Hoang Nguyen, Wray Buntine, Mohammed Bennamoun