Paper ID: 2009.12695

Techniques to Improve Q&A Accuracy with Transformer-based models on Large Complex Documents

Chejui Liao, Tabish Maniar, Sravanajyothi N, Anantha Sharma

This paper discusses the effectiveness of various text processing techniques, their combinations, and encodings in reducing the complexity and size of a given text corpus. The simplified corpus is then passed to BERT (or a similar transformer-based model) for question answering, yielding more relevant responses to user queries. The paper takes a scientific approach to measuring the benefits and effectiveness of these techniques and identifies a best-fit combination that produces a statistically significant improvement in accuracy.
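
The abstract describes a two-stage workflow: simplify a large corpus, then query a transformer QA model over the reduced text. As a minimal sketch of that workflow (assuming the Hugging Face transformers library, a generic SQuAD-tuned BERT-style model, and a placeholder simplification step, none of which are specified in the abstract):

```python
# Illustrative sketch only: the simplification step and model choice below are
# assumptions, not the paper's actual pipeline.
from transformers import pipeline


def simplify_corpus(corpus: str, max_chars: int = 3000) -> str:
    """Placeholder for the paper's text-processing/encoding step.
    Here we merely strip blank lines and truncate, purely for illustration."""
    cleaned = " ".join(line.strip() for line in corpus.splitlines() if line.strip())
    return cleaned[:max_chars]


# Extractive question-answering pipeline with a BERT-style model (hypothetical choice).
qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

corpus = open("large_document.txt").read()   # hypothetical input document
context = simplify_corpus(corpus)            # reduced-complexity corpus
result = qa(question="What does the document conclude?", context=context)
print(result["answer"], result["score"])
```

In this sketch, improving answer relevance would come from making `simplify_corpus` smarter (the techniques the paper compares), since the QA model itself is used off the shelf.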

Submitted: Sep 26, 2020