Supervised Detector

Supervised object detection trains computer vision models on labeled datasets to identify and localize objects in images or videos. Current research emphasizes improving generalization, particularly through pseudo-labeling for open-vocabulary detection and through adaptation to unseen domains (e.g., sim2real transfer). This work spans a range of architectures, including transformers and two-stage detectors, and often incorporates multi-modal information (e.g., text and images) to improve performance. Advances in this field matter for many applications, from autonomous driving and robotics to medical image analysis and assistive technologies.
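
As a rough illustration of the pseudo-labeling idea mentioned above, the sketch below keeps only high-confidence detections on unlabeled images so they can be reused as extra training labels. It is a minimal example under stated assumptions, not the method of any particular paper: the `Detection` type, the `select_pseudo_labels` function, and the 0.8 threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Detection:
    """One predicted object (hypothetical container for this sketch)."""
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
    label: str
    score: float  # detector confidence in [0, 1]


def select_pseudo_labels(
    detections: List[Detection], threshold: float = 0.8
) -> List[Detection]:
    """Keep detections whose confidence is at or above the threshold.

    The threshold value is an assumption; in practice it is tuned per dataset.
    """
    return [d for d in detections if d.score >= threshold]


# Example: predictions from a detector on one unlabeled image (made-up values).
predictions = [
    Detection((10.0, 20.0, 110.0, 220.0), "person", 0.93),
    Detection((50.0, 60.0, 90.0, 100.0), "dog", 0.41),
    Detection((200.0, 30.0, 320.0, 180.0), "bicycle", 0.85),
]
pseudo_labels = select_pseudo_labels(predictions)
print([d.label for d in pseudo_labels])
```

The retained detections would then be treated as if they were ground-truth annotations in a later training round. A stricter threshold yields fewer but cleaner pseudo-labels.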

Papers