Paper ID: 2112.06736

Roof-Transformer: Divided and Joined Understanding with Knowledge Enhancement

Wei-Lin Liao, Cheng-En Su, Wei-Yun Ma

Recent work on enhancing BERT-based language representation models with knowledge graphs (KGs) and knowledge bases (KBs) has yielded promising results on multiple NLP tasks. State-of-the-art approaches typically integrate the original input sentences with KG triples and feed the combined representation into a BERT model. However, as the sequence length of a BERT model is limited, such a framework supports little knowledge other than the original input sentences and is thus forced to discard some knowledge. This problem is especially severe for downstream tasks for which the input is a long paragraph or even a document, such as QA or reading comprehension tasks. We address this problem with Roof-Transformer, a model with two underlying BERTs and a fusion layer on top. One underlying BERT encodes the knowledge resources and the other one encodes the original input sentences, and the fusion layer integrates the two resultant encodings. Experimental results on a QA task and the GLUE benchmark attest the effectiveness of the proposed model.

Submitted: Dec 13, 2021