Indigenous Language
Research on Indigenous languages increasingly uses artificial intelligence, particularly large language models and neural machine translation, to support language preservation and revitalization. Current efforts concentrate on machine translation systems, automatic speech recognition tools, and language learning resources for these typically low-resource languages, often through transfer learning and fine-tuning of pre-trained multilingual models. The work also stresses the ethical importance of community engagement in data collection and model development, with the aim of empowering Indigenous communities and keeping the resulting technology culturally sensitive. These tools could significantly benefit language documentation, education, and cultural transmission.
Papers
Enhancing Translation for Indigenous Languages: Experiments with Multilingual Models
Atnafu Lambebo Tonja, Hellina Hailu Nigatu, Olga Kolesnikova, Grigori Sidorov, Alexander Gelbukh, Jugal Kalita
Parallel Corpus for Indigenous Language Translation: Spanish-Mazatec and Spanish-Mixtec
Atnafu Lambebo Tonja, Christian Maldonado-Sifuentes, David Alejandro Mendoza Castillo, Olga Kolesnikova, Noé Castro-Sánchez, Grigori Sidorov, Alexander Gelbukh