Paper ID: 2205.12335
K-12BERT: BERT for K-12 education
Vasu Goel, Dhruv Sahnan, Venktesh V, Gaurav Sharma, Deep Dwivedi, Mukesh Mohania
Online education platforms are powered by various NLP pipelines that use models like BERT to aid in content curation. Since the introduction of pre-trained language models like BERT, there have been many efforts to adapt them to specific domains. However, to the best of our knowledge, no model has been adapted specifically for the education domain (particularly K-12) across subjects. In this work, we train a language model on a corpus we curated across multiple subjects from various sources for K-12 education. We also evaluate our model, K-12BERT, on downstream tasks such as hierarchical taxonomy tagging.
Submitted: May 24, 2022
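
The recipe the abstract describes is continued pretraining of a BERT checkpoint on domain text with the masked language modeling objective. Below is a minimal sketch of that recipe using the Hugging Face `transformers` and `datasets` APIs; the base checkpoint, corpus file, and hyperparameters are illustrative assumptions, not the authors' actual setup.

```python
# Sketch: domain-adaptive pretraining (masked language modeling) on a
# K-12 text corpus. "k12_corpus.txt" and all hyperparameters are
# hypothetical placeholders, not the paper's data or settings.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# One document per line in a plain-text file (hypothetical path).
corpus = load_dataset("text", data_files={"train": "k12_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])

# Randomly mask 15% of tokens, as in the original BERT pretraining objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="k12bert",
        num_train_epochs=1,
        per_device_train_batch_size=16,
    ),
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
trainer.save_model("k12bert")
```

The resulting checkpoint can then be fine-tuned on a downstream task such as hierarchical taxonomy tagging in the usual way (e.g., via a sequence-classification head).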