Visually Rich Document
Visually rich documents (VRDs) combine diverse elements such as text, images, tables, and charts, which makes automated information extraction difficult. Current research focuses on robust multimodal models, often built on transformer architectures and graph neural networks, that integrate visual and textual information. These models target problems such as layout understanding and reading order prediction in order to improve the accuracy and efficiency of information extraction. The field is central to document understanding across many domains, with applications ranging from scientific literature analysis to business process automation.