Indian Languages
Research on Indian languages focuses on developing and evaluating natural language processing (NLP) models for India's diverse linguistic landscape, where many languages are low-resource and dialectal variation is significant. Current efforts concentrate on adapting and fine-tuning multilingual transformer models, such as BERT and its variants, for tasks like machine translation, question answering, and sentiment analysis, and on building new benchmarks and datasets to support robust evaluation. This work is crucial for bridging the digital divide, widening access to technology and information in India, and advancing the broader field of multilingual NLP.
Papers
Is Word Error Rate a good evaluation metric for Speech Recognition in Indic Languages?
Priyanshi Shah, Harveen Singh Chadha, Anirudh Gupta, Ankur Dhuriya, Neeraj Chhimwal, Rishabh Gaur, Vivek Raghavan
Improving Speech Recognition for Indic Languages using Language Model
Ankur Dhuriya, Harveen Singh Chadha, Anirudh Gupta, Priyanshi Shah, Neeraj Chhimwal, Rishabh Gaur, Vivek Raghavan
Code Switched and Code Mixed Speech Recognition for Indic languages
Harveen Singh Chadha, Priyanshi Shah, Ankur Dhuriya, Neeraj Chhimwal, Anirudh Gupta, Vivek Raghavan
An Overview of Indian Language Datasets used for Text Summarization
Shagun Sinha, Girish Nath Jha