Visual Text Generation
Visual text generation aims to produce images that contain realistic, legible text, a task that requires generative models to align visual and textual modalities. Current research focuses on improving the accuracy and coherence of the rendered text through techniques such as incorporating glyph-level information, using multimodal large language models to guide text placement and content, and applying diffusion models with refined control mechanisms. The field matters for multimodal AI more broadly, with applications ranging from image editing and generation to accessibility features in media and document creation.