Supervised Counting
Supervised counting, which covers tasks such as crowd counting and counting individuals in videos, aims to estimate the number of objects in an image or video sequence, often from limited or weak annotations. Current research focuses on robust models, including transformers and convolutional neural networks, that can learn from different levels of supervision: fully labeled data, global counts only, or binary rankings of image pairs. These advances matter because they reduce reliance on expensive, time-consuming manual annotation, making accurate object counting more practical for applications such as traffic monitoring and public safety.
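To make the three levels of supervision concrete, here is a minimal sketch of one possible training loss for each. It assumes the model outputs a density map whose values sum to the predicted count, and that full labels are themselves density maps; these are common conventions but are assumptions here, not something the overview specifies. The function names (`density_loss`, `count_loss`, `ranking_loss`) and the `margin` parameter are hypothetical, chosen only for illustration.

```python
from typing import Sequence

# A density map as a 2D grid of floats (assumed representation).
Grid = Sequence[Sequence[float]]


def predicted_count(density: Grid) -> float:
    """Sum a predicted density map to get a count estimate."""
    return sum(sum(row) for row in density)


def density_loss(pred: Grid, target: Grid) -> float:
    """Full supervision: mean squared error between density maps."""
    n = sum(len(row) for row in pred)
    sq = sum((p - t) ** 2 for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    return sq / n


def count_loss(pred: Grid, true_count: float) -> float:
    """Global-count supervision: absolute error on the total count only."""
    return abs(predicted_count(pred) - true_count)


def ranking_loss(pred_more: Grid, pred_fewer: Grid, margin: float = 0.0) -> float:
    """Pairwise ranking supervision: hinge penalty when the image labeled as
    containing more objects is not predicted to have the higher count."""
    gap = predicted_count(pred_more) - predicted_count(pred_fewer)
    return max(0.0, margin - gap)


# Toy predictions for two images (illustrative values, not real data).
pred_a = [[0.5, 0.5], [1.0, 0.0]]    # sums to 2.0
pred_b = [[0.25, 0.25], [0.0, 0.5]]  # sums to 1.0
target_a = [[1.0, 0.0], [1.0, 0.0]]  # hypothetical full label for image A
```

Each loss needs progressively less annotation: `density_loss` requires a full target map, `count_loss` only a single number per image, and `ranking_loss` only the knowledge of which image in a pair contains more objects.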