Unseen Composition

Unseen composition research focuses on enabling artificial intelligence systems to recognize and understand novel combinations of visual concepts (objects and their states) not explicitly seen during training. Current efforts concentrate on leveraging large pre-trained vision-language models, employing techniques like prompt tuning, attention mechanisms, and contrastive learning to improve generalization to unseen compositions. These advancements are significant because they address a key limitation in AI's ability to reason about the world, paving the way for more robust and adaptable computer vision systems with applications in areas like image captioning and object recognition.

Papers