Large Scale Text to Image
Large-scale text-to-image generation focuses on creating high-quality, realistic images from textual descriptions, primarily using diffusion models and masked image models. Current research emphasizes improving controllability, addressing adversarial attacks, and keeping generated images consistent across different views or edits. Common techniques include prompt tuning, multi-modal input (such as audio or existing images), and novel attention mechanisms. The field is significant for its potential applications in creative industries and scientific visualization, and it also raises important ethical concerns about the generation of harmful content.