Document Image Classification

Document image classification automatically categorizes scanned documents by their visual content and layout, which makes large document archives easier to process. Current research focuses on improving model accuracy and efficiency, on multimodal approaches that combine visual and textual information, and on challenges such as zero-shot learning and out-of-distribution generalization, using techniques such as graph neural networks and large language models. The field supports applications including digital library organization, legal document processing, and automated data extraction, and it drives advances in both computer vision and natural language processing.

Papers