Text Classification
Text classification is the task of automatically assigning text to predefined categories, motivated by the need for efficient and accurate information processing across diverse domains. Current research focuses on pretrained transformer models, from encoders such as BERT to large language models (LLMs) such as Llama 2, often enhanced with techniques such as fine-tuning, data augmentation, and active learning, alongside traditional machine learning methods such as SVMs and XGBoost. These advances are improving the accuracy and efficiency of text classification, with significant implications for applications ranging from medical diagnosis and financial analysis to social media monitoring and legal research. Ensuring model robustness, interpretability, and fairness remains a key challenge, particularly when dealing with imbalanced datasets or noisy labels.
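As a rough illustration of the active learning idea mentioned above, the sketch below shows entropy-based uncertainty sampling, one common query strategy: the texts whose predicted class distribution is least certain are sent to a human annotator first. It is a minimal sketch and not the method of any paper listed here; the function names (entropy, select_for_labeling) and the probabilities in pool_probs are hypothetical and stand in for the outputs of whatever classifier is being trained.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(pool_probs, k):
    """Return the indices of the k pool items with the highest predictive entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical classifier outputs for four unlabeled texts (two classes each).
pool_probs = [
    [0.98, 0.02],  # confident prediction, little to gain from labeling
    [0.55, 0.45],  # uncertain
    [0.70, 0.30],  # moderately uncertain
    [0.51, 0.49],  # most uncertain
]

queried = select_for_labeling(pool_probs, k=2)
print(queried)  # [3, 1]
```

In a full loop, the queried texts would be labeled, added to the training set, and the classifier retrained before the next round of selection. Other query strategies, such as margin sampling, can be swapped in by changing the scoring function.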
Papers
Active Learning for Identifying Disaster-Related Tweets: A Comparison with Keyword Filtering and Generic Fine-Tuning
David Hanny, Sebastian Schmidt, Bernd Resch
AutoML-guided Fusion of Entity and LLM-based representations
Boshko Koloski, Senja Pollak, Roberto Navigli, Blaž Škrlj
A Strategy to Combine 1stGen Transformers and Open LLMs for Automatic Text Classification
Claudio M. V. de Andrade, Washington Cunha, Davi Reis, Adriana Silvina Pagano, Leonardo Rocha, Marcos André Gonçalves