Multimodal Prompt
Multimodal prompting combines inputs from several data modalities, such as text and images, to instruct AI models, particularly large language models and vision-language models. Current research focuses on efficient prompt-tuning methods, on handling missing modalities, and on building models that remain robust across diverse tasks, including image generation, segmentation, and robot control. The approach is reported to improve model performance and generalization, especially when instructions are complex or the input is incomplete, which benefits both fundamental AI research and practical applications in fields such as healthcare and robotics.