Visual Text
Visual text processing covers understanding and generating text within images, bridging computer vision and natural language processing. Current research concentrates on three fronts: improving the accuracy and legibility of text rendered by diffusion models, where misspelling remains a common failure; studying how novel jailbreak techniques can lead to the generation of unsafe content; and developing robust evaluation metrics that test generalizability. The field underpins applications such as document understanding, image captioning, and text-to-image synthesis, with impact ranging from accessibility to content moderation.