Paper ID: 2207.05008
A description of Turkish Discourse Bank 1.2 and an examination of common dependencies in Turkish discourse
Deniz Zeyrek, Mustafa Erolcan Er
We describe Turkish Discourse Bank 1.2, the latest version of a discourse corpus annotated for explicitly or implicitly conveyed discourse relations, their constitutive units, and senses in the Penn Discourse Treebank style. We present an evaluation of the recently added tokens and examine three commonly occurring dependency patterns that hold among the constitutive units of a pair of adjacent discourse relations, namely, shared arguments, full embedding and partial containment of a discourse relation. We present three major findings: (a) implicitly conveyed relations occur more often than explicitly conveyed relations in the data; (b) it is much more common for two adjacent implicit discourse relations to share an argument than for two adjacent explicit relations to do so; (c) both full embedding and partial containment of discourse relations are pervasive in the corpus, which can be partly due to subordinator connectives whose preposed subordinate clause tends to be selected together with the matrix clause rather than being selected alone. Finally, we briefly discuss the implications of our findings for Turkish discourse parsing.
Submitted: Jul 11, 2022