Korean Language

Korean language processing is a rapidly evolving field focused on computational models that understand and generate Korean text. Current research emphasizes building and improving large language models (LLMs) tailored to Korean, addressing the challenges posed by its agglutinative morphology and the Hangul writing system through techniques such as efficient continual pretraining, instruction tuning, and linguistically informed subword tokenization. These advances improve performance on tasks such as machine translation (including between North and South Korean dialects), named entity recognition, and question answering, and support the development of more effective Korean language learning tools.
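
One common form of linguistically informed preprocessing for Korean subword tokenization is decomposing precomposed Hangul syllables into their constituent jamo so that a BPE or unigram model can learn merges below the syllable level. The sketch below is a minimal, self-contained illustration of that step using Unicode arithmetic on the Hangul syllable block (U+AC00 to U+D7A3); the function name and example string are illustrative and not taken from any specific paper listed here.

```python
# Minimal sketch: jamo-level decomposition of Hangul syllables, a preprocessing
# step sometimes applied before training a subword tokenizer on Korean text.

CHOSEONG = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")            # 19 initial consonants
JUNGSEONG = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")       # 21 medial vowels
JONGSEONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 27 finals + "no final"

def decompose(text: str) -> str:
    """Replace each precomposed Hangul syllable with its jamo; leave other characters untouched."""
    out = []
    for ch in text:
        code = ord(ch)
        if 0xAC00 <= code <= 0xD7A3:                 # precomposed Hangul syllable block
            idx = code - 0xAC00
            out.append(CHOSEONG[idx // 588])         # 588 = 21 vowels * 28 finals
            out.append(JUNGSEONG[(idx % 588) // 28])
            out.append(JONGSEONG[idx % 28])          # empty string when there is no final consonant
        else:
            out.append(ch)
    return "".join(out)

if __name__ == "__main__":
    print(decompose("한국어"))  # -> ㅎㅏㄴㄱㅜㄱㅇㅓ
```

A subword model trained on such decomposed text can recompose syllables in a post-processing step; the papers below explore this and related tokenization choices in more detail.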

Papers