Paper ID: 2306.03978
B\"{u}y\"{u}k dil modellerinin T\"{u}rk\c{c}e verisetleri ile e\u{g}itilmesi ve ince ayarlanmas\i
A. Taha Arslan
Large language models have advanced enormously, gained vast attraction and are having a phase of intensed research. Some of the developed models and training datasets have been made open-accessible. Hence these may be further fine-tuned with some techniques to obtain specialized models for specific tasks. When it comes to Turkish language, open-access models do not provide satisfactory coverage. This is also observed over published datasets. In this work, we propose some ideas to mitigate this issue: creating large Turkish datasets, training LLMs with these and fine-tuning pre-trained models with Turkish inputs. We report our findings on Turkish-based trainings with the problems encountered along the way. We conclude with outcomes of these experiments and propose ideas for further works. -- B\"uy\"uk dil modelleri inan{\i}lmaz \"ol\c{c}\"ude geli\c{s}mekte, b\"uy\"uk ilgi toplayarak ve \"uzerlerinde yo\u{g}un ara\c{s}tirmalarin yapildi\u{g}i bir d\"onemdedirler. Geli\c{s}tirilen modeller ve e\u{g}itimde kullanilan verisetlerinden bazilari a\c{c}ik eri\c{s}imli olarak sunulmaktadir. B\"oylece ince ayarlama teknikleri uygulayarak \"ozelle\c{s}mi\c{s} g\"orevler i\c{c}in \c{c}ali\c{s}abilir modeller elde edilmektedir. T\"urk\c{c}e s\"oz konusu oldu\u{g}unda bu modellerinin kapsayicili\u{g}i yeterli d\"uzeyde de\u{g}ildir. Bu durum, yayimlanan verisetlerinde de g\"ozlemlenebilir. Bunu a\c{s}manin yollari T\"urk\c{c}e i\c{c}erikli b\"uy\"uk verisetlerinin olu\c{s}turulmasi, b\"uy\"uk dil modellerinin bunlarla e\u{g}itilmesi ve \"onceden e\u{g}itilmi\c{s} modellerin T\"urk\c{c}e girdilerle ince ayarlanmalari olabilir. Bu \c{c}ali\c{s}mada a\c{c}ik eri\c{s}imli dil modelleri ve verisetleri \"uzerinde durulmakta ve T\"urk\c{c}e temelli bazi deneyler, kar\c{s}ila\c{s}ilan sorunlar ve sonu\c{c}lar irdelenmektedir.
Submitted: Jun 6, 2023