Language-Level Performance Disparity

Language-level performance disparity in natural language processing (NLP) refers to the uneven performance of NLP models across languages and dialects, typically favoring high-resource languages such as English. Current research aims to mitigate this disparity through techniques such as cross-lingual knowledge aggregation, teacher language selection, and self-distillation in multilingual pretrained language models (mPLMs), as well as through more inclusive benchmarks and evaluation metrics that account for dialectal variation (see the sketches below). Addressing this disparity is crucial for building equitable and universally accessible NLP systems; in high-stakes domains such as healthcare, biased models can perpetuate existing inequalities.
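
As a concrete illustration of how such a disparity can be quantified, the sketch below computes per-language accuracy and a simple gap metric relative to a dominant reference language. The evaluation data and the `disparity_gap` helper are hypothetical, not drawn from any specific benchmark or paper.

```python
from collections import defaultdict

def per_language_accuracy(examples):
    """Group (language, correct) pairs and compute accuracy per language."""
    totals, hits = defaultdict(int), defaultdict(int)
    for lang, correct in examples:
        totals[lang] += 1
        hits[lang] += int(correct)
    return {lang: hits[lang] / totals[lang] for lang in totals}

def disparity_gap(acc, reference="en"):
    """Accuracy gap between a reference (dominant) language and each other
    language; positive values indicate underperformance relative to it."""
    return {lang: acc[reference] - a for lang, a in acc.items() if lang != reference}

# Hypothetical evaluation results: (language code, prediction was correct)
results = [("en", True), ("en", True), ("en", False),
           ("sw", True), ("sw", False), ("sw", False),
           ("yo", False), ("yo", False), ("yo", True)]

acc = per_language_accuracy(results)
print(acc)                 # e.g. {'en': 0.67, 'sw': 0.33, 'yo': 0.33}
print(disparity_gap(acc))  # e.g. {'sw': 0.33, 'yo': 0.33}
```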

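The distillation side can be sketched in a similarly minimal way. The snippet below pairs a standard Hinton-style soft-label distillation loss with a naive teacher-language selection heuristic (pick the language the model currently handles best). The random logits stand in for an mPLM's outputs on parallel inputs, and the selection rule is one simple assumption; it does not reproduce any particular paper's method.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label KD loss: KL divergence between temperature-scaled
    teacher and student distributions (standard Hinton-style recipe)."""
    t = temperature
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    # batchmean gives the mathematically correct KL over the batch; the
    # t**2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * t**2

def select_teacher_language(dev_accuracy):
    """Naive heuristic: treat the model's strongest language as the teacher."""
    return max(dev_accuracy, key=dev_accuracy.get)

# Hypothetical per-language dev accuracies for an mPLM
teacher_lang = select_teacher_language({"en": 0.82, "de": 0.79, "sw": 0.55})
print(f"teacher language: {teacher_lang}")

# Dummy logits standing in for the same mPLM run twice on parallel inputs:
# a teacher pass on the teacher language, a student pass on a low-resource one.
teacher_logits = torch.randn(4, 10)
student_logits = torch.randn(4, 10, requires_grad=True)

loss = distillation_loss(student_logits, teacher_logits.detach())
loss.backward()  # gradients flow only into the student-side pass
```

Detaching the teacher logits is what makes this "self"-distillation trainable in one pass: the model is updated only through its predictions on the low-resource input, while its own higher-resource predictions serve as fixed soft targets.
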
Papers