Multimodal Entity
Multimodal entity research studies how to understand and process entities that are represented across several data modalities, such as text and images, with the main goal of improving entity linking, entity alignment, and entity recognition. Current work leans on large language models (LLMs) and on techniques such as optimal transport and graph neural networks to fuse and reason over multimodal information, often while coping with missing or ambiguous data. The field matters for knowledge graph construction, multimodal information retrieval, and applications that need a robust understanding of entities in complex, real-world scenarios.