Image Text
Image-text research develops models that understand and generate content linking visual and textual information, with the aim of bridging the two modalities. Current work emphasizes improving the robustness and efficiency of vision-language models (VLMs) such as CLIP, often through prompt engineering, contrastive learning, and specialized datasets for domains like medicine and agriculture. These advances support applications including medical image analysis, agricultural monitoring, and stronger multimodal large language models (MLLMs).
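To make the contrastive learning mentioned above concrete, here is a minimal sketch of a CLIP-style symmetric contrastive loss in plain Python. It is an illustration under stated assumptions, not the implementation of any particular paper: the function names, the toy 2-dimensional embeddings, and the temperature of 0.07 are all chosen for this example. The idea is that each image embedding should score highest against its own caption and lower against every other caption in the batch, and vice versa.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def clip_style_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    image_embs[i] and text_embs[i] are assumed to describe the same item.
    The temperature value is an assumption made for this sketch.
    """
    images = [l2_normalize(v) for v in image_embs]
    texts = [l2_normalize(v) for v in text_embs]
    # logits[i][j] = scaled cosine similarity between image i and text j
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]
    logits_transposed = [list(col) for col in zip(*logits)]
    image_to_text = cross_entropy_diag(logits)
    text_to_image = cross_entropy_diag(logits_transposed)
    return 0.5 * (image_to_text + text_to_image)

# Toy batch (hypothetical values): two images and their two captions.
aligned_images = [[1.0, 0.0], [0.0, 1.0]]
aligned_texts = [[0.9, 0.1], [0.1, 0.9]]
shuffled_texts = [aligned_texts[1], aligned_texts[0]]  # captions swapped

loss_matched = clip_style_loss(aligned_images, aligned_texts)
loss_mismatched = clip_style_loss(aligned_images, shuffled_texts)
```

With correctly paired captions the loss is close to zero, while swapping the captions makes it much larger, which is the signal that pulls matching image and text embeddings together during training.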