Paper ID: 2404.08488

Thematic Analysis with Large Language Models: does it work with languages other than English? A targeted test in Italian

Stefano De Paoli

This paper proposes a test to perform Thematic Analysis (TA) with Large Language Model (LLM) on data which is in a different language than English. While there has been initial promising work on using pre-trained LLMs for TA on data in English, we lack any tests on whether these models can reasonably perform the same analysis with good quality in other language. In this paper a test will be proposed using an open access dataset of semi-structured interviews in Italian. The test shows that a pre-trained model can perform such a TA on the data, also using prompts in Italian. A comparative test shows the model capacity to produce themes which have a good resemblance with those produced independently by human researchers. The main implication of this study is that pre-trained LLMs may thus be suitable to support analysis in multilingual situations, so long as the language is supported by the model used.

Submitted: Apr 12, 2024