Paper ID: 2307.01216

Discovering Patterns of Definitions and Methods from Scientific Documents

Yutian Sun, Hai Zhuge

The difficulties of automatic extraction of definitions and methods from scientific documents lie in two aspects: (1) the complexity and diversity of natural language texts, which requests an analysis method to support the discovery of pattern; and, (2) a complete definition or method represented by a scientific paper is usually distributed within text, therefore an effective approach should not only extract single sentence definitions and methods but also integrate the sentences to obtain a complete definition or method. This paper proposes an analysis method for discovering patterns of definition and method and uses the method to discover patterns of definition and method. Completeness of the patterns at the semantic level is guaranteed by a complete set of semantic relations that identify definitions and methods respectively. The completeness of the patterns at the syntactic and lexical levels is guaranteed by syntactic and lexical constraints. Experiments on the self-built dataset and two public definition datasets show that the discovered patterns are effective. The patterns can be used to extract definitions and methods from scientific documents and can be tailored or extended to suit other applications.

Submitted: Jul 1, 2023