Submodular Mutual Information

Submodular mutual information (SMI) measures how much a candidate subset of data shares with a query set, using a submodular function in place of entropy. Maximizing it selects informative subsets and can steer selection toward rare or underrepresented classes and data slices. Current research applies SMI to few-shot object detection, active learning (especially in cold-start settings and on imbalanced datasets), and the handling of out-of-distribution data. By choosing which examples to label or train on, these methods aim to reach a given accuracy with less labeling and training effort, with applications in fields such as medical imaging, natural language processing, and autonomous driving.
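
As a rough illustration of the selection step described above, one common formulation defines the SMI between a candidate set A and a query set Q as I_f(A; Q) = f(A) + f(Q) - f(A ∪ Q) for a submodular function f. The sketch below assumes a facility-location f over an RBF similarity matrix and a plain greedy loop. The function names, the toy data, and the kernel choice are illustrative assumptions, not taken from any specific paper or library.

```python
import numpy as np


def facility_location(sim, subset):
    """f(S): sum over every ground-set item of its best similarity to S."""
    if len(subset) == 0:
        return 0.0
    return float(sim[:, list(subset)].max(axis=1).sum())


def smi(sim, A, Q):
    """I_f(A; Q) = f(A) + f(Q) - f(A ∪ Q), with f = facility location."""
    union = sorted(set(A) | set(Q))
    return (facility_location(sim, A)
            + facility_location(sim, Q)
            - facility_location(sim, union))


def greedy_smi_select(sim, candidates, Q, budget):
    """Greedily add the candidate that most increases I_f(A; Q)."""
    selected, remaining = [], list(candidates)
    for _ in range(min(budget, len(remaining))):
        best = max(remaining, key=lambda c: smi(sim, selected + [c], Q))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (assumed): candidates 0-3 form a common cluster, candidates 4-5
# sit near a rare cluster, and items 6-7 are query examples of that rare slice.
points = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],  # common candidates
    [5.0, 5.0], [5.1, 5.0],                          # rare candidates
    [5.0, 5.1], [5.1, 5.1],                          # query set
])
diff = points[:, None, :] - points[None, :, :]
sim = np.exp(-(diff ** 2).sum(axis=-1))  # non-negative RBF similarities

candidates, query = [0, 1, 2, 3, 4, 5], [6, 7]
selected = greedy_smi_select(sim, candidates, query, budget=2)
print(selected)  # the two candidates close to the query cluster
```

With a budget of two, the greedy loop picks the candidates nearest the query examples, because the common-cluster points add almost nothing to I_f(A; Q). Recomputing the objective from scratch at every step keeps the sketch short; a practical implementation would cache the per-item maxima.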

Papers