Optical Character Recognition
Optical Character Recognition (OCR) converts images of text into machine-readable text, enabling efficient document processing and information extraction. Current research focuses on improving accuracy in challenging settings such as historical documents, low-resolution images, and complex layouts, often using convolutional neural networks and transformer-based language models for both character recognition and post-OCR error correction. These advances matter for digitizing historical archives, making information more accessible, and automating tasks in fields ranging from document management to scientific literature analysis.
Papers
A Permuted Autoregressive Approach to Word-Level Recognition for Urdu Digital Text
Ahmed Mustafa, Muhammad Tahir Rafique, Muhammad Ijlal Baig, Hasan Sajid, Muhammad Jawad Khan, Karam Dad Kallu
Enhancing License Plate Super-Resolution: A Layout-Aware and Character-Driven Approach
Valfride Nascimento, Rayson Laroca, Rafael O. Ribeiro, William Robson Schwartz, David Menotti
DocLayLLM: An Efficient and Effective Multi-modal Extension of Large Language Models for Text-rich Document Understanding
Wenhui Liao, Jiapeng Wang, Hongliang Li, Chengyu Wang, Jun Huang, Lianwen Jin
Classification of Non-native Handwritten Characters Using Convolutional Neural Network
F. A. Mamun, S. A. H. Chowdhury, J. E. Giti, H. Sarker
CORU: Comprehensive Post-OCR Parsing and Receipt Understanding Dataset
Abdelrahman Abdallah, Mahmoud Abdalla, Mahmoud SalahEldin Kasem, Mohamed Mahmoud, Ibrahim Abdelhalim, Mohamed Elkasaby, Yasser ElBendary, Adam Jatowt