MLLM Training
Multimodal large language model (MLLM) training aims to build AI systems that understand and generate content across modalities such as text, images, and video. Current research emphasizes efficiency, through techniques such as knowledge distillation and model compression, and stronger performance on specific tasks such as visual question answering and embodied agent control, often via instruction tuning and preference learning. The field matters because MLLMs could enable more human-like interaction with complex data in applications ranging from healthcare diagnostics to robotics.