Extreme Multilabel Classification

Extreme multilabel classification (XML) is the task of assigning each data point a subset of relevant labels drawn from an extremely large label set. Current research focuses on improving scalability and accuracy, particularly on imbalanced datasets where most labels are rare ("tail") labels with few training examples. Techniques for managing the vast label space include label embeddings (often within deep learning frameworks), tree-based methods, and in-context learning. Applications include recommendation systems (e.g., medical doctor referrals), product categorization, and automated medical coding, all of which depend on efficient and accurate multilabel prediction.
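To make the label-embedding idea concrete, here is a minimal sketch, assuming data points and labels have already been mapped into a shared vector space. The function name `top_k_labels`, the toy label names, and the two-dimensional vectors are all hypothetical and chosen only for illustration. Labels are scored by dot product and the k highest-scoring ones are returned. This brute-force scan over every label is for exposition only; it is the step that becomes expensive as the label set grows, which is the scalability problem described above.

```python
import heapq


def top_k_labels(point_vec, label_vecs, k=3):
    """Return the k label ids whose embeddings score highest against point_vec.

    point_vec  : list of floats, the embedded data point (assumed given)
    label_vecs : dict mapping label id -> list of floats (assumed given)
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Score every label; this scan is linear in the number of labels.
    scores = {label: dot(point_vec, vec) for label, vec in label_vecs.items()}

    # Keep only the k best-scoring labels, highest score first.
    return heapq.nlargest(k, scores, key=scores.get)


# Toy, hand-made embeddings (hypothetical values, not learned).
label_vecs = {
    "cardiology": [1.0, 0.0],
    "dermatology": [0.0, 1.0],
    "oncology": [0.7, 0.7],
    "radiology": [-1.0, 0.0],
}

predicted = top_k_labels([0.9, 0.2], label_vecs, k=2)
print(predicted)
```

With these toy vectors the two highest dot products belong to "cardiology" and "oncology", so those are the labels returned for the example point.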

Papers