Cross-Lingual Cross-Modal Retrieval
Cross-lingual cross-modal retrieval (CCR) aims to retrieve images or videos that match text queries written in multiple languages, a key step towards multilingual information access. Current research focuses on better aligning visual and textual representations, often using large language models (LLMs) and contrastive learning to cope with noisy translations and the semantic gap between modalities. This work is driven by the need for robust, efficient multilingual search and retrieval systems, with applications in web search, multimedia indexing, and cross-cultural communication.
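To make the idea of contrastive alignment concrete, here is a minimal sketch of one common formulation: a symmetric InfoNCE-style loss over a batch of matched text and image embeddings, followed by cosine-similarity retrieval. It is not the method of any particular paper listed on this page. The function names, the temperature value of 0.07, and the toy three-dimensional embeddings (standing in for the outputs of hypothetical multilingual text and image encoders) are all assumptions made for illustration.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def similarity_matrix(text_embs, image_embs):
    """Cosine similarity between every text embedding and every image embedding."""
    texts = [normalize(v) for v in text_embs]
    images = [normalize(v) for v in image_embs]
    return [[sum(a * b for a, b in zip(t, i)) for i in images] for t in texts]


def _mean_diagonal_cross_entropy(logits):
    """Mean of -log softmax(row)[r] over rows r, treating entry (r, r) as the match."""
    total = 0.0
    for r, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_z - row[r]
    return total / len(logits)


def symmetric_contrastive_loss(text_embs, image_embs, temperature=0.07):
    """Average of the text-to-image and image-to-text losses.

    Assumes text_embs[k] and image_embs[k] describe the same item.
    """
    sims = similarity_matrix(text_embs, image_embs)
    logits = [[s / temperature for s in row] for row in sims]
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (
        _mean_diagonal_cross_entropy(logits)
        + _mean_diagonal_cross_entropy(transposed)
    )


def retrieve(query_emb, image_embs, k=1):
    """Indices of the k images most similar to the query, best first."""
    scores = similarity_matrix([query_emb], image_embs)[0]
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]


# Toy, hand-made embeddings (assumed, not produced by a real model).
images = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
captions_en = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]
captions_de = [[0.8, 0.0, 0.2], [0.0, 0.8, 0.2], [0.2, 0.0, 0.8]]

if __name__ == "__main__":
    print("aligned loss:", symmetric_contrastive_loss(captions_en, images))
    print("top image for German query 2:", retrieve(captions_de[2], images))
```

In this sketch, queries in different languages are simply different vectors in the same embedding space, so retrieval works the same way for any language once the text encoder places translations of a caption near the matching image. Noisy translations would show up as caption vectors that drift away from their image, which raises the loss.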
Papers
September 30, 2024
June 26, 2024
December 14, 2023
October 13, 2023
September 11, 2023