Visual Word Sense Disambiguation
Visual Word Sense Disambiguation (VWSD) aims to identify the intended meaning of an ambiguous word using both its textual context and accompanying images. Current research relies heavily on multimodal approaches that combine large language models (LLMs) with transformer-based models such as CLIP, often enhanced by prompt engineering, knowledge-base integration (e.g., glossaries or Wikipedia), and multimodal retrieval. These techniques improve the accuracy of the image-text matching used to resolve word ambiguity, with applications such as image search and natural language understanding in multimedia contexts.
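The image-text matching step described above can be illustrated with a minimal sketch. It assumes that a text prompt and each candidate image have already been embedded into a shared vector space by a model such as CLIP; the vectors below are hand-written toy values, not real model outputs. The helper names (`build_prompt`, `rank_images`), the file names, and the gloss text are hypothetical and chosen only for illustration.

```python
import math


def build_prompt(context, gloss=None):
    """Join the short context with an optional gloss (hypothetical prompt format)."""
    return f"{context}: {gloss}" if gloss else context


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_images(text_vec, image_vecs):
    """Return candidate image names sorted from most to least similar."""
    scores = {name: cosine(text_vec, vec) for name, vec in image_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Prompt that would be passed to a text encoder in a real pipeline.
prompt = build_prompt("bat animal", gloss="a nocturnal flying mammal")

# Toy stand-ins for encoder outputs (assumed values, not produced by CLIP).
text_vec = [0.9, 0.1, 0.0]
image_vecs = {
    "baseball_bat.jpg": [0.1, 0.9, 0.1],
    "flying_bat.jpg": [0.8, 0.2, 0.1],
    "cricket_bat.jpg": [0.0, 0.7, 0.6],
}

ranking = rank_images(text_vec, image_vecs)
print(prompt)
print(ranking[0])
```

In a real system the toy vectors would be replaced by the outputs of a pretrained text encoder and image encoder, and the gloss would typically come from a knowledge base or be generated by an LLM.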
Papers
August 12, 2024
November 30, 2023
October 21, 2023
October 3, 2023
July 9, 2023
June 24, 2023
May 2, 2023