Text Recognition
Text recognition, the automated extraction of text from images, converts visual content into machine-readable text. Current research focuses on improving accuracy and efficiency across diverse domains, including handwritten text, scene text (e.g., from signs or photographs), and documents. Typical models are transformers and convolutional neural networks, often enhanced by language models and self-supervised learning techniques. These advances support applications such as document digitization, automated data entry, and accessibility tools, and also drive work in related fields such as visual document understanding and multimodal AI.
Papers
VIPTR: A Vision Permutable Extractor for Fast and Efficient Scene Text Recognition
Xianfu Cheng, Weixiao Zhou, Xiang Li, Xiaoming Chen, Jian Yang, Tongliang Li, Zhoujun Li
CMFN: Cross-Modal Fusion Network for Irregular Scene Text Recognition
Jinzhi Zheng, Ruyi Ji, Libo Zhang, Yanjun Wu, Chen Zhao
Text Region Multiple Information Perception Network for Scene Text Detection
Jinzhi Zheng, Libo Zhang, Yanjun Wu, Chen Zhao