Vision Assistant
Vision assistants are AI systems that pair large language models (LLMs) with visual processing capabilities, so that they can handle a range of tasks over visual input and give users helpful, informative responses. Current research focuses on improving model efficiency, mitigating hallucinations and biases, and developing robust architectures, such as LLaVA-style models and other multimodal LLMs, for applications including medical diagnosis, activity assistance, and industrial inspection. These advances could improve accessibility, automate complex tasks, and enhance human-computer interaction across many domains.