Marathi Corpus
Marathi corpus research develops and expands linguistic resources for Marathi, a low-resource language with few existing NLP tools. Current efforts center on building large, diverse datasets for tasks such as text classification, question answering, and sentiment analysis, and on training Marathi language models. These models are primarily BERT-based, with knowledge distillation and pruning used to improve efficiency. The work supports practical Marathi NLP applications and contributes to the broader field of low-resource language processing.