Transformer-Based Optical Character Recognition
Transformer-based optical character recognition (OCR) uses transformer neural networks to transcribe text from images, aiming to improve on the accuracy and efficiency of traditional methods. Current research focuses on adapting these models to diverse languages, handling damaged or incomplete text, and mitigating vulnerabilities to adversarial attacks; decoder-only architectures and improved initial embedding strategies show promise. These advances matter for fields such as historical document analysis, automated data entry, and the accessibility of visual information.