Text Image
Text-image research focuses on understanding and generating images that contain text, with the aim of improving the accuracy, realism, and diversity of such images. Current work relies heavily on diffusion models, often enhanced with techniques such as glyph-aware training and dual translation learning, to address challenges including legible text generation, multi-concept synthesis, and cross-lingual capability. The field matters for combating misinformation by detecting text-image inconsistencies, for improving scene text recognition, and for enabling new image editing and generation tasks, advancing both computer vision and natural language processing.
Papers
Content and Style Aware Generation of Text-line Images for Handwriting Recognition
Lei Kang, Pau Riba, Marçal Rusiñol, Alicia Fornés, Mauricio Villegas
How does fake news use a thumbnail? CLIP-based Multimodal Detection on the Unrepresentative News Image
Hyewon Choi, Yejun Yoon, Seunghyun Yoon, Kunwoo Park