Paper ID: 2307.14850

Turkish Native Language Identification

Ahmet Yavuz Uluslu, Gerold Schneider

In this paper, we present the first application of Native Language Identification (NLI) to Turkish. NLI is the task of predicting a writer's first language (L1) from their writing in another language (L2). While most NLI research has focused on English, our study extends its scope to Turkish. Using the recently constructed Turkish Learner Corpus, we combine three syntactic features (CFG production rules, part-of-speech n-grams, and function words) extracted from L2 texts and demonstrate their effectiveness in this task.
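
As a rough illustration of two of the three feature families named above, the following minimal sketch counts part-of-speech n-grams and function words for one pre-tagged sentence. It is not the authors' implementation: the function-word list, the tag labels, the helper names (`pos_ngrams`, `function_word_counts`, `extract_features`), and the example sentence are all assumptions made for illustration. CFG production rules are left out because they require a constituency parser.

```python
from collections import Counter

# Hypothetical, tiny function-word list; a real study would use a fuller inventory.
FUNCTION_WORDS = {"ve", "bir", "bu", "için", "ile"}


def pos_ngrams(tags, n):
    """Return a Counter of POS n-grams (as tuples) over a tag sequence."""
    return Counter(tuple(tags[i:i + n]) for i in range(len(tags) - n + 1))


def function_word_counts(tokens):
    """Count listed function words.

    Uses plain str.lower(); Turkish dotted/dotless I would need
    locale-aware case folding in a real pipeline.
    """
    return Counter(t.lower() for t in tokens if t.lower() in FUNCTION_WORDS)


def extract_features(tagged_sentence, n=2):
    """Merge both feature families into one sparse feature dict."""
    tokens = [tok for tok, _ in tagged_sentence]
    tags = [tag for _, tag in tagged_sentence]
    features = {}
    for gram, count in pos_ngrams(tags, n).items():
        features["pos:" + "_".join(gram)] = count
    for word, count in function_word_counts(tokens).items():
        features["fw:" + word] = count
    return features


# Illustrative, hand-tagged input (tags are assumed, not produced by a tagger).
sentence = [
    ("Bu", "DET"),
    ("bir", "DET"),
    ("kitap", "NOUN"),
    ("ve", "CCONJ"),
    ("kalem", "NOUN"),
]
features = extract_features(sentence)
```

Feature dicts of this shape could then be vectorised and passed to any standard classifier, with one dict per learner text and the writer's L1 as the label.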

Submitted: Jul 27, 2023

Topics

Language Identification
Turkish Text
Native Language Identification
Learner Corpus
