Weakly Supervised Text Classification

Weakly supervised text classification aims to train accurate text classifiers with little or no manually labeled data, relying instead on readily available weak signals such as class names or seed words. Current research focuses on leveraging large language models (LLMs) for pseudo-label generation and refinement, often incorporating techniques like prompting, rule-based labeling, and retrieval-augmented training to improve classification accuracy. This line of work matters because it reduces the substantial cost and effort of manual annotation, enabling efficient text classification in domains where labeled data is scarce, such as healthcare and scientific literature analysis.
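The seed-word approach described above can be sketched minimally: match each document against a per-class keyword set and emit a pseudo-label only when evidence exists, leaving uncertain documents unlabeled for later refinement. The class names and seed words below are illustrative assumptions, not from any specific paper.

```python
from collections import Counter

# Hypothetical seed words per class; in practice these come from the class
# names themselves or a small keyword list supplied by a domain expert.
SEED_WORDS = {
    "sports": {"game", "team", "score"},
    "health": {"patient", "treatment", "disease"},
}

def pseudo_label(text):
    """Assign a pseudo-label by counting seed-word hits per class.

    Returns the best-matching class, or None when no seed word occurs,
    so ambiguous documents stay unlabeled rather than mislabeled.
    """
    tokens = text.lower().split()
    hits = Counter()
    for cls, seeds in SEED_WORDS.items():
        hits[cls] = sum(tok in seeds for tok in tokens)
    best, count = max(hits.items(), key=lambda kv: kv[1])
    return best if count > 0 else None

docs = [
    "The team won the game with a late score",
    "The patient responded well to the new treatment",
    "Stock markets rallied today",
]
labels = [pseudo_label(d) for d in docs]
# → ["sports", "health", None]
```

In the full pipelines surveyed here, these noisy pseudo-labels would then train a neural classifier, with LLM prompting or retrieval used to refine or expand the labeled pool.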

Papers