Hierarchical Text Classification
Hierarchical text classification (HTC) categorizes text according to a structured label hierarchy, assigning each document multiple labels at once while respecting the parent-child relationships among those labels. Current research focuses on improving efficiency and accuracy through novel loss functions that explicitly model text-label alignment, lightweight architectures that avoid scaling issues with large hierarchies, and pre-trained language models adapted through techniques such as prompt tuning and in-context learning. These advances matter for applications such as information retrieval, knowledge organization, and automated content tagging.
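To make "respecting the label hierarchy" concrete, here is a minimal Python sketch of one common reading of that constraint: a predicted label set is consistent if every label's parent is also in the set. The toy taxonomy and the helper names `with_ancestors` and `is_consistent` are illustrative assumptions, not taken from any of the papers listed here.

```python
# Hypothetical toy taxonomy: child label -> parent label (None marks a top-level label).
PARENT = {
    "science": None,
    "sports": None,
    "physics": "science",
    "biology": "science",
    "quantum mechanics": "physics",
    "football": "sports",
}


def with_ancestors(labels):
    """Return the given labels plus all of their ancestors in the taxonomy."""
    closed = set()
    for label in labels:
        node = label
        # Walk up toward the root; stop early if this path was already added.
        while node is not None and node not in closed:
            closed.add(node)
            node = PARENT[node]
    return closed


def is_consistent(labels):
    """Check that every non-root label has its parent in the same label set."""
    labels = set(labels)
    return all(PARENT[l] is None or PARENT[l] in labels for l in labels)


if __name__ == "__main__":
    raw = {"quantum mechanics", "football"}
    print(is_consistent(raw))                   # leaf labels alone lack their parents
    print(sorted(with_ancestors(raw)))          # labels completed with ancestors
    print(is_consistent(with_ancestors(raw)))
```

In practice a model may enforce this constraint during training or decoding rather than as a post-processing step. The sketch only shows what a hierarchy-consistent output looks like.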
Papers
TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision
Yunyi Zhang, Ruozhen Yang, Xueqiang Xu, Rui Li, Jinfeng Xiao, Jiaming Shen, Jiawei Han
Utilizing Local Hierarchy with Adversarial Training for Hierarchical Text Classification
Zihan Wang, Peiyi Wang, Houfeng Wang