Chinese Character
Chinese characters are the foundation of written Chinese, and ongoing research in computational linguistics aims to improve how they are represented, processed, and understood. Current work applies large language models (LLMs) and other neural network architectures, including transformers and LSTMs, to tasks such as automatic speech recognition (ASR), machine translation, and text generation across diverse Chinese dialects and historical scripts. These advances improve access to Chinese language resources and support cross-cultural communication.
Papers
MAVD: The First Open Large-Scale Mandarin Audio-Visual Dataset with Depth Information
Jianrong Wang, Yuchen Huo, Li Liu, Tianyi Xu, Qi Li, Sen Li
Effects of Tonal Coarticulation and Prosodic Positions on Tonal Contours of Low Rising Tones: In the Case of Xiamen Dialect
Yiying Hu, Hui Feng, Qinghua Zhao, Aijun Li
Towards Better Instruction Following Language Models for Chinese: Investigating the Impact of Training Data and Evaluation
Yunjie Ji, Yan Gong, Yong Deng, Yiping Peng, Qiang Niu, Baochang Ma, Xiangang Li
SikuGPT: A Generative Pre-trained Model for Intelligent Information Processing of Ancient Texts from the Perspective of Digital Humanities
Liu Chang, Wang Dongbo, Zhao Zhixiao, Hu Die, Wu Mengcheng, Lin Litao, Shen Si, Li Bin, Liu Jiangfeng, Zhang Hai, Zhao Lianzheng