Natural Language Processing
Natural Language Processing (NLP) focuses on enabling computers to understand, interpret, and generate human language. Current research centers on large language models (LLMs), examining their capabilities in tasks including question answering, text classification, and translation, while also addressing challenges such as bias, efficiency, and the need for better evaluation metrics. The field matters because of its range of potential applications, from improving healthcare and education to broadening information access and supporting more effective human-computer interaction.
Papers
Transformer-Based Contextualized Language Models Joint with Neural Networks for Natural Language Inference in Vietnamese
Dat Van-Thanh Nguyen, Tin Van Huynh, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen
NLP Cluster Analysis of Common Core State Standards and NAEP Item Specifications
Gregory Camilli, Larry Suter
Can Artificial Intelligence Generate Quality Research Topics Reflecting Patient Concerns?
Jiyeong Kim, Michael L. Chen, Shawheen J. Rezaei, Mariana Ramirez-Posada, Jennifer L. Caswell-Jin, Allison W. Kurian, Fauzia Riaz, Kavita Y. Sarin, Jean Y. Tang, Steven M. Asch, Eleni Linos
Unveiling Topological Structures in Text: A Comprehensive Survey of Topological Data Analysis Applications in NLP
Adaku Uchendu, Thai Le
Multidimensional Byte Pair Encoding: Shortened Sequences for Improved Visual Data Generation
Tim Elsner, Paula Usinger, Julius Nehring-Wirxel, Gregor Kobsik, Victor Czech, Yanjiang He, Isaak Lim, Leif Kobbelt
Information Extraction from Clinical Notes: Are We Ready to Switch to Large Language Models?
Yan Hu, Xu Zuo, Yujia Zhou, Xueqing Peng, Jimin Huang, Vipina K. Keloth, Vincent J. Zhang, Ruey-Ling Weng, Qingyu Chen, Xiaoqian Jiang, Kirk E. Roberts, Hua Xu
Initial Nugget Evaluation Results for the TREC 2024 RAG Track with the AutoNuggetizer Framework
Ronak Pradeep, Nandan Thakur, Shivani Upadhyay, Daniel Campos, Nick Craswell, Jimmy Lin
A Practical Guide to Fine-tuning Language Models with Limited Data
Márton Szép, Daniel Rueckert, Rüdiger von Eisenhart-Rothe, Florian Hinterwimmer
DriveThru: a Document Extraction Platform and Benchmark Datasets for Indonesian Local Language Archives
Mohammad Rifqi Farhansyah, Muhammad Zuhdi Fikri Johari, Afinzaki Amiral, Ayu Purwarianti, Kumara Ari Yuana, Derry Tanti Wijaya
P-MMEval: A Parallel Multilingual Multitask Benchmark for Consistent Evaluation of LLMs
Yidan Zhang, Boyi Deng, Yu Wan, Baosong Yang, Haoran Wei, Fei Huang, Bowen Yu, Junyang Lin, Fei Huang, Jingren Zhou