Zero-Shot Cross-Lingual
Zero-shot cross-lingual natural language processing (NLP) aims to build models that understand and generate text in many languages without requiring labeled training data in each target language. Current research focuses on leveraging multilingual pre-trained language models (such as mT5 and XLM-R) and on techniques like in-context learning, prompt engineering, and data augmentation (including code-switching and pseudo-semantic data generation) to improve cross-lingual transfer. This work is crucial for bridging the language gap in NLP applications, enabling broader access to information and technology across diverse linguistic communities.
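The transfer recipe common to the papers below is to fine-tune a multilingual encoder on labeled data in a single source language (usually English) and then apply it unchanged to other languages. The following is a minimal sketch of that setup, assuming the Hugging Face transformers library with xlm-roberta-base; the texts, labels, and hyperparameters are toy placeholders, not from any specific paper.

```python
# Minimal sketch of zero-shot cross-lingual transfer with XLM-R:
# fine-tune a classifier head on English examples only, then apply the
# same model, unchanged, to a target-language input.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=2
)

# Source-language (English) training data -- the only labeled data used.
train_texts = ["I loved this film.", "The product broke after one day."]
train_labels = torch.tensor([1, 0])  # 1 = positive, 0 = negative

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few toy epochs on the tiny English set
    batch = tokenizer(train_texts, padding=True, return_tensors="pt")
    loss = model(**batch, labels=train_labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Zero-shot evaluation on a target language (German here) with no German
# training data: the shared multilingual encoder carries the transfer.
model.eval()
with torch.no_grad():
    test = tokenizer("Dieser Film war großartig.", return_tensors="pt")
    pred = model(**test).logits.argmax(dim=-1).item()
print("Predicted label for the German input:", pred)
```

The same pattern underlies the tagging and parsing variants in the papers listed next; only the task head and the label format change.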
Papers
WebIE: Faithful and Robust Information Extraction on the Web
Chenxi Whitehouse, Clara Vania, Alham Fikri Aji, Christos Christodoulopoulos, Andrea Pierleoni
mPLM-Sim: Better Cross-Lingual Similarity and Transfer in Multilingual Pretrained Language Models
Peiqin Lin, Chengzhi Hu, Zheyu Zhang, André F. T. Martins, Hinrich Schütze
Translation and Fusion Improves Zero-shot Cross-lingual Information Extraction
Yang Chen, Vedaant Shah, Alan Ritter
CLASP: Few-Shot Cross-Lingual Data Augmentation for Semantic Parsing
Andy Rosenbaum, Saleh Soltan, Wael Hamza, Amir Saffari, Marco Damonte, Isabel Groves
CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation
Jian Yang, Shaohan Huang, Shuming Ma, Yuwei Yin, Li Dong, Dongdong Zhang, Hongcheng Guo, Zhoujun Li, Furu Wei