Optical Character Recognition
Optical Character Recognition (OCR) converts images of text into machine-readable text, enabling efficient document processing and information extraction. Current research focuses on improving accuracy in difficult settings such as historical documents, low-resolution images, and complex layouts, often employing transformer-based language models and convolutional neural networks for both character recognition and post-processing error correction. These advances are important for digitizing historical archives, making information more accessible, and automating tasks in fields ranging from document management to scientific literature analysis.
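As a rough illustration of how OCR accuracy is commonly quantified, the sketch below computes a character error rate (CER): the edit distance between a reference transcription and the OCR output, divided by the reference length. The function names and sample strings are illustrative assumptions and are not taken from any of the papers listed here; individual papers may normalize or tokenize differently.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / reference length (hypothetical helper)."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical example: the OCR output confuses 'i' with 'l'.
    print(character_error_rate("archive", "archlve"))
```

A lower CER means the output is closer to the reference. Post-processing error correction, as described above, would aim to reduce this value by rewriting the raw OCR output before it is scored.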
Papers
A Permuted Autoregressive Approach to Word-Level Recognition for Urdu Digital Text
Ahmed Mustafa, Muhammad Tahir Rafique, Muhammad Ijlal Baig, Hasan Sajid, Muhammad Jawad Khan, Karam Dad Kallu
Enhancing License Plate Super-Resolution: A Layout-Aware and Character-Driven Approach
Valfride Nascimento, Rayson Laroca, Rafael O. Ribeiro, William Robson Schwartz, David Menotti
DocLayLLM: An Efficient and Effective Multi-modal Extension of Large Language Models for Text-rich Document Understanding
Wenhui Liao, Jiapeng Wang, Hongliang Li, Chengyu Wang, Jun Huang, Lianwen Jin