Hierarchical Text Classification

Hierarchical text classification (HTC) assigns documents to labels drawn from a structured hierarchy, typically several labels at once, while respecting the parent-child relationships among them. Current research aims to improve efficiency and accuracy through loss functions that explicitly model text-label alignment, lightweight architectures that avoid scaling problems on large hierarchies, and pre-trained language models adapted via techniques such as prompt tuning and in-context learning. Advances in HTC benefit applications such as information retrieval, knowledge organization, and automated content tagging.
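
To make "respecting label relationships" concrete, here is a minimal Python sketch of one common consistency rule: a predicted label should appear together with all of its ancestors. The toy hierarchy, the label names, and the helper functions `with_ancestors` and `is_consistent` are hypothetical and chosen only for illustration. They are not taken from any particular HTC method or library.

```python
# Hypothetical toy hierarchy: child label -> parent label (None marks a root).
PARENT = {
    "science": None,
    "physics": "science",
    "quantum": "physics",
    "sports": None,
    "tennis": "sports",
}


def with_ancestors(labels):
    """Return the given labels plus every ancestor of each label."""
    closed = set()
    for label in labels:
        node = label
        # Walk up toward the root; stop early if this path was already added.
        while node is not None and node not in closed:
            closed.add(node)
            node = PARENT[node]
    return closed


def is_consistent(labels):
    """True if every non-root label is accompanied by its parent."""
    return all(PARENT[l] is None or PARENT[l] in labels for l in labels)


# Assumed raw output of a flat multi-label classifier (illustrative only).
predicted = {"quantum", "tennis"}

print(is_consistent(predicted))                    # False
print(sorted(with_ancestors(predicted)))
# ['physics', 'quantum', 'science', 'sports', 'tennis']
print(is_consistent(with_ancestors(predicted)))    # True
```

Closing predictions under ancestors as a post-processing step is only one simple option. The approaches summarized above instead try to build the hierarchy into the model or its training objective.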

Papers