Document Intelligence
Document intelligence focuses on automatically extracting information from documents and understanding their content, particularly visually rich documents such as forms and invoices. Current research emphasizes multimodal approaches that integrate textual, visual, and layout information using architectures such as transformers and graph neural networks, often pre-trained on large datasets to improve performance on tasks such as information extraction and question answering. The field is crucial for automating document processing in sectors such as business, law, and medicine, where it boosts efficiency and enables new applications. Developing standardized evaluation metrics and publicly available datasets is also a significant area of ongoing work.
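To make the idea of combining textual and layout information concrete, here is a minimal sketch of one common preprocessing step: pairing each recognized word with its bounding box, rescaled to a fixed coordinate grid so that pages of different sizes become comparable. The names `Word`, `normalize_box`, and `build_layout_inputs`, and the grid size of 1000, are assumptions made for illustration and are not taken from any paper listed here. The sketch covers text and layout only; visual features are omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

GRID = 1000  # assumed size of the normalized coordinate grid

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


@dataclass
class Word:
    """Hypothetical OCR output: a word and its box in page pixels."""
    text: str
    box: Box


def normalize_box(box: Box, page_width: float, page_height: float,
                  grid: int = GRID) -> Tuple[int, int, int, int]:
    """Rescale a pixel-space box to integer coordinates in [0, grid]."""
    def scale(value: float, size: float) -> int:
        # Clamp so boxes that slightly overshoot the page stay in range.
        return max(0, min(grid, int(round(grid * value / size))))

    x0, y0, x1, y1 = box
    return (scale(x0, page_width), scale(y0, page_height),
            scale(x1, page_width), scale(y1, page_height))


def build_layout_inputs(words: List[Word], page_width: float,
                        page_height: float) -> List[Dict[str, object]]:
    """Pair each word's text with its normalized box."""
    return [
        {"text": w.text, "bbox": normalize_box(w.box, page_width, page_height)}
        for w in words
    ]


if __name__ == "__main__":
    page = [Word("Invoice", (50, 40, 150, 60)),
            Word("Total", (400, 700, 460, 720))]
    for item in build_layout_inputs(page, page_width=500, page_height=800):
        print(item)
```

A layout-aware model would typically embed the text and the box coordinates separately and combine them before the encoder; that step depends on the specific architecture and is left out of this sketch.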
Papers
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
Zhiyuan Zhao, Hengrui Kang, Bin Wang, Conghui He
Evaluation of Attribution Bias in Retrieval-Augmented Large Language Models
Amin Abolghasemi, Leif Azzopardi, Seyyed Hadi Hashemi, Maarten de Rijke, Suzan Verberne