Sinhala Text Classification

Sinhala text classification research develops and evaluates computational methods for automatically categorizing Sinhala-language text, working around the limited resources available for the language and its linguistic complexity. Current efforts concentrate on improving sentiment analysis accuracy with deep learning models such as LSTMs, and on applying pre-trained language models (e.g., RoBERTa, XLM-R) to other classification tasks, including hate speech and misinformation detection. This work supports applications such as social media monitoring and better accessibility for Sinhala speakers, and it contributes to natural language processing for low-resource languages more broadly.

Papers