Document Layout Analysis

Document layout analysis (DLA) automatically recovers the structure of a document by locating its regions, classifying each one (e.g., text, image, table), and determining how the regions relate to one another. Current research focuses on improving model accuracy and robustness with architectures such as transformers, graph neural networks, and object detection models like Mask R-CNN and YOLOv5. These are often combined with techniques like knowledge distillation and self-supervised learning to address data scarcity. Advances in DLA are crucial for efficient information extraction, document understanding, and accessibility, with applications ranging from digital humanities to automated document processing in industry.
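To make the idea of "regions and their relationships" concrete, here is a minimal Python sketch of how the output of a layout model might be represented and compared. Everything in it is an illustrative assumption rather than the interface of any particular DLA system: the `Region` class, the coordinate convention (pixels, origin at the top-left), the `iou` helper (intersection over union, a common overlap measure for detection-style models), and the naive top-to-bottom `reading_order` heuristic, which only makes sense for single-column pages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """One labeled layout region (hypothetical structure for illustration)."""
    label: str   # e.g. "text", "image", "table"
    x0: float    # left edge
    y0: float    # top edge
    x1: float    # right edge
    y1: float    # bottom edge

    @property
    def area(self) -> float:
        return max(0.0, self.x1 - self.x0) * max(0.0, self.y1 - self.y0)


def iou(a: Region, b: Region) -> float:
    """Intersection over union of two axis-aligned boxes, in [0, 1]."""
    inter_w = min(a.x1, b.x1) - max(a.x0, b.x0)
    inter_h = min(a.y1, b.y1) - max(a.y0, b.y0)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # boxes do not overlap
    inter = inter_w * inter_h
    return inter / (a.area + b.area - inter)


def reading_order(regions):
    """Naive single-column heuristic: sort top-to-bottom, then left-to-right."""
    return sorted(regions, key=lambda r: (r.y0, r.x0))


# A made-up page with three regions, listed out of order.
page = [
    Region("table", 50, 400, 550, 700),
    Region("title", 50, 40, 550, 90),
    Region("text", 50, 110, 550, 380),
]

ordered_labels = [r.label for r in reading_order(page)]
```

In practice, an overlap score like `iou` is typically used to match predicted regions against annotated ones when scoring a detector, while real reading-order and relationship prediction needs far more than a coordinate sort, especially for multi-column or mixed layouts.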

Papers