Bounding Box Supervision

Bounding box supervision is an approach in computer vision for training object detection and segmentation models from bounding box annotations alone, rather than from pixel-level annotations, which are more expensive and time-consuming to produce. Current research focuses on maintaining accuracy when boxes are inaccurate or loosely drawn, using techniques such as Gaussian processes to generate pseudo-labels, self-distillation to refine predictions, and multiple instance learning strategies that incorporate spatial information. Because boxes are much cheaper to annotate than masks, the approach lowers annotation costs and makes advanced computer vision techniques more accessible in fields such as medical image analysis and video understanding.
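
To make the multiple instance learning idea concrete, here is a minimal sketch of one common formulation, not the method of any particular paper. It assumes the box is tight, so every row and column segment crossing the box is a positive bag that should contain at least one foreground pixel, while every pixel outside the box is treated as background. The function name `box_mil_loss`, the half-open box convention, and the use of max pooling over each bag are illustrative assumptions.

```python
import math


def box_mil_loss(prob, box, eps=1e-7):
    """Illustrative MIL-style loss for a predicted mask given one box.

    prob: 2D list of foreground probabilities in [0, 1], indexed prob[y][x].
    box:  (x0, y0, x1, y1), covering pixels x0 <= x < x1 and y0 <= y < y1
          (a hypothetical convention chosen for this sketch).
    """
    x0, y0, x1, y1 = box
    height, width = len(prob), len(prob[0])

    # Positive bags: each row segment and each column segment inside the box.
    # Under the tight-box assumption, each bag holds at least one object pixel.
    pos_bags = [[prob[y][x] for x in range(x0, x1)] for y in range(y0, y1)]
    pos_bags += [[prob[y][x] for y in range(y0, y1)] for x in range(x0, x1)]
    # Max pooling: a bag is scored by its most confident pixel.
    pos_loss = sum(-math.log(max(max(bag), eps)) for bag in pos_bags)

    # Negative pixels: everything outside the box should be background.
    neg = [
        prob[y][x]
        for y in range(height)
        for x in range(width)
        if not (x0 <= x < x1 and y0 <= y < y1)
    ]
    neg_loss = sum(-math.log(max(1.0 - p, eps)) for p in neg)

    # Average over all bags and negative pixels.
    return (pos_loss + neg_loss) / (len(pos_bags) + len(neg))
```

A prediction that fills the box and is empty elsewhere gives a loss near zero, while an all-background prediction is penalized heavily by the positive bags. Approaches aimed at loosely drawn boxes would need to relax the tight-box assumption, for example by ignoring bags near the box border.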

Papers