Paper ID: 2203.03819
Table Structure Recognition with Conditional Attention
Bin Xiao, Murat Simsek, Burak Kantarci, Ala Abu Alkheir
Tabular data in digital documents is widely used to express compact and important information for readers. However, it is challenging to parse tables from unstructured digital documents, such as PDFs and images, into machine-readable format because of the complexity of table structures and the missing of meta-information. Table Structure Recognition (TSR) problem aims to recognize the structure of a table and transform the unstructured tables into a structured and machine-readable format so that the tabular data can be further analysed by the down-stream tasks, such as semantic modeling and information retrieval. In this study, we hypothesize that a complicated table structure can be represented by a graph whose vertices and edges represent the cells and association between cells, respectively. Then we define the table structure recognition problem as a cell association classification problem and propose a conditional attention network (CATT-Net). The experimental results demonstrate the superiority of our proposed method over the state-of-the-art methods on various datasets. Besides, we investigate whether the alignment of a cell bounding box or a text-focused approach has more impact on the model performance. Due to the lack of public dataset annotations based on these two approaches, we further annotate the ICDAR2013 dataset providing both types of bounding boxes, which can be a new benchmark dataset for evaluating the methods in this field. Experimental results show that the alignment of a cell bounding box can help improve the Micro-averaged F1 score from 0.915 to 0.963, and the Macro-average F1 score from 0.787 to 0.923.
Submitted: Mar 8, 2022