Text Classification
Text classification aims to automatically assign text to predefined categories, driven by the need for efficient and accurate information processing across diverse domains. Current research focuses on leveraging pretrained language models such as BERT and large language models (LLMs) such as Llama 2, often enhanced with techniques like fine-tuning, data augmentation, and active learning, alongside traditional machine learning methods such as SVMs and XGBoost. These advances are improving both the accuracy and the efficiency of text classification, with significant implications for applications ranging from medical diagnosis and financial analysis to social media monitoring and legal research. A key challenge remains ensuring model robustness, interpretability, and fairness, particularly when dealing with imbalanced datasets or noisy labels.
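As a concrete illustration of the traditional-baseline side mentioned above, the sketch below trains a linear SVM on TF-IDF features with scikit-learn. The texts, labels, and category names are invented for illustration; real pipelines would add preprocessing, cross-validation, and handling for class imbalance.

```python
# Minimal text-classification sketch: TF-IDF features + linear SVM.
# The tiny "medical" vs. "financial" dataset below is purely illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = [
    "the patient reports chest pain and shortness of breath",
    "quarterly revenue rose on strong loan growth",
    "mri shows no evidence of acute infarction",
    "the bank lowered its interest rate forecast",
]
labels = ["medical", "financial", "medical", "financial"]

# Vectorizer and classifier are fit together in one pipeline,
# so the same transformation is applied at train and predict time.
clf = make_pipeline(TfidfVectorizer(), LinearSVC())
clf.fit(texts, labels)

print(clf.predict(["patient complains of chest pain"]))
```

Fine-tuning a pretrained model such as BERT typically outperforms this baseline on larger datasets, but a TF-IDF + SVM pipeline remains a strong, cheap reference point, especially with limited labeled data.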
Papers
Estimating the Influence of Sequentially Correlated Literary Properties in Textual Classification: A Data-Centric Hypothesis-Testing Approach
Gideon Yoffe, Nachum Dershowitz, Ariel Vishne, Barak Sober
DISCO: DISCovering Overfittings as Causal Rules for Text Classification Models
Zijian Zhang, Vinay Setty, Yumeng Wang, Avishek Anand
Selecting Between BERT and GPT for Text Classification in Political Science Research
Yu Wang, Wen Qu, Xin Ye
Improve Meta-learning for Few-Shot Text Classification with All You Can Acquire from the Tasks
Xinyue Liu, Yunlong Gao, Linlin Zong, Bo Xu
A Multi-Task Text Classification Pipeline with Natural Language Explanations: A User-Centric Evaluation in Sentiment Analysis and Offensive Language Identification in Greek Tweets
Nikolaos Mylonas, Nikolaos Stylianou, Theodora Tsikrika, Stefanos Vrochidis, Ioannis Kompatsiaris