Paper ID: 2210.12662

Improving Chinese Named Entity Recognition by Search Engine Augmentation

Qinghua Mao, Jiatong Li, Kui Meng

Compared with English, Chinese suffers from more grammatical ambiguities, like fuzzy word boundaries and polysemous words. In this case, contextual information is not sufficient to support Chinese named entity recognition (NER), especially for rare and emerging named entities. Semantic augmentation using external knowledge is a potential way to alleviate this problem, while how to obtain and leverage external knowledge for the NER task remains a challenge. In this paper, we propose a neural-based approach to perform semantic augmentation using external knowledge from search engine for Chinese NER. In particular, a multi-channel semantic fusion model is adopted to generate the augmented input representations, which aggregates external related texts retrieved from the search engine. Experiments have shown the superiority of our model across 4 NER datasets, including formal and social media language contexts, which further prove the effectiveness of our approach.

Submitted: Oct 23, 2022