Large Scale Chinese

Research on large-scale Chinese language processing develops and evaluates models for NLP tasks using extensive Chinese datasets. Current efforts concentrate on improving compositional generalization, applying large language models (LLMs) to tasks such as automatic speech recognition and event extraction, and building high-quality benchmarks that evaluate model performance across domains including text-to-SQL, passage retrieval, and scientific literature analysis. This work advances the state of the art in Chinese NLP and supports applications in information retrieval, financial analysis, and machine translation.

Papers