Text Classification
Text classification automatically assigns text to predefined categories, meeting the need for efficient and accurate information processing across diverse domains. Current research focuses on pretrained language models such as BERT and large language models (LLMs) such as Llama 2, often combined with fine-tuning, data augmentation, and active learning, alongside traditional machine learning methods such as SVMs and XGBoost. These advances are improving the accuracy and efficiency of text classification, with implications for applications ranging from medical diagnosis and financial analysis to social media monitoring and legal research. Key challenges remain in ensuring model robustness, interpretability, and fairness, particularly with imbalanced datasets or noisy labels.
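To make the task concrete, here is a minimal sketch of assigning text to predefined categories. It uses a bag-of-words multinomial Naive Bayes classifier, which is a simple traditional baseline swapped in for illustration and is not one of the methods named above. The function names, the whitespace tokenizer, and the four-sentence toy dataset are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace splitting is enough for a toy example.
    return text.lower().split()


def train_naive_bayes(examples):
    """Count documents per label and word occurrences per label.

    `examples` is a list of (text, label) pairs.
    """
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior score."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        # Log prior: fraction of training documents with this label.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            count = word_counts[label][tok] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, made up for this sketch.
TOY_DATA = [
    ("stock prices fell sharply today", "finance"),
    ("the bank raised interest rates", "finance"),
    ("patient shows symptoms of fever", "medical"),
    ("the doctor prescribed a new drug", "medical"),
]

model = train_naive_bayes(TOY_DATA)
print(predict(model, "interest rates and stock prices"))
print(predict(model, "the patient has a fever"))
```

The log prior term is where class imbalance enters a model like this: a label with few training documents starts with a lower score, which is one reason imbalanced datasets need extra care.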
Papers
Out-of-Distribution Generalization in Text Classification: Past, Present, and Future
Linyi Yang, Yaoxiao Song, Xuan Ren, Chenyang Lyu, Yidong Wang, Lingqiao Liu, Jindong Wang, Jennifer Foster, Yue Zhang
Enhancing Black-Box Few-Shot Text Classification with Prompt-Based Data Augmentation
Danqing Luo, Chen Zhang, Jiahui Xu, Bin Wang, Yiming Chen, Yan Zhang, Haizhou Li
PIEClass: Weakly-Supervised Text Classification with Prompting and Noise-Robust Iterative Ensemble Training
Yunyi Zhang, Minhao Jiang, Yu Meng, Yu Zhang, Jiawei Han
Understanding and Mitigating Spurious Correlations in Text Classification with Neighborhood Analysis
Oscar Chew, Hsuan-Tien Lin, Kai-Wei Chang, Kuan-Hao Huang
Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot Text Classification Tasks
Haoqi Zheng, Qihuang Zhong, Liang Ding, Zhiliang Tian, Xin Niu, Dongsheng Li, Dacheng Tao
A Benchmark on Extremely Weakly Supervised Text Classification: Reconcile Seed Matching and Prompting Approaches
Zihan Wang, Tianle Wang, Dheeraj Mekala, Jingbo Shang
A Comprehensive Survey of Sentence Representations: From the BERT Epoch to the ChatGPT Era and Beyond
Abhinav Ramesh Kashyap, Thanh-Tung Nguyen, Viktor Schlegel, Stefan Winkler, See-Kiong Ng, Soujanya Poria
Enhancing Pashto Text Classification using Language Processing Techniques for Single And Multi-Label Analysis
Mursal Dawodi, Jawid Ahmad Baktash
Employing Hybrid Deep Neural Networks on Dari Speech
Jawid Ahmad Baktash, Mursal Dawodi
Tuning Traditional Language Processing Approaches for Pashto Text Classification
Jawid Ahmad Baktash, Mursal Dawodi, Mohammad Zarif Joya, Nematullah Hassanzada