Scene Retrieval

Scene retrieval focuses on efficiently finding images or scenes matching a given query, whether that query is an image, text description, or a combination of both. Current research emphasizes bridging the semantic gap between low-level visual features and high-level semantic understanding, often employing techniques like pre-trained vision models, large language models, and joint embedding spaces to align visual and textual representations. This field is crucial for applications ranging from medical image analysis and autonomous driving to multimedia search and retrieval, improving access to and understanding of large visual datasets.

Papers