Visual Representation Learning
Visual representation learning aims to encode images as numerical features that let computers process and interpret visual information. Current research focuses heavily on self-supervised methods built on Vision Transformers (ViTs) and convolutional neural networks (CNNs), often combining contrastive learning, masked image modeling, and techniques such as prompt tuning to improve representation quality. By providing more robust and generalizable visual features, these advances drive progress in applications including image classification, object detection, medical image analysis, and robotic manipulation.
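As a rough illustration of the contrastive learning idea mentioned above, the sketch below computes an InfoNCE-style loss over two sets of embeddings, where row i of each set is assumed to come from two augmented views of the same image. The function names (`l2_normalize`, `info_nce_loss`), the temperature value, and the toy embeddings are assumptions made for this example and are not taken from any particular paper; real systems compute the embeddings with a ViT or CNN encoder and use a tensor library rather than plain lists.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (assumes it is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def info_nce_loss(view_a, view_b, temperature=0.1):
    """InfoNCE-style contrastive loss, averaged over the batch.

    view_a[i] and view_b[i] are embeddings of two views of image i
    (the positive pair); every other row of view_b acts as a negative.
    """
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    total = 0.0
    for i, anchor in enumerate(a):
        # Cosine similarity of the anchor to every candidate, scaled by temperature.
        logits = [
            sum(x * y for x, y in zip(anchor, candidate)) / temperature
            for candidate in b
        ]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching row as the correct class.
        total += log_denominator - logits[i]
    return total / len(a)


# Toy embeddings (hypothetical): matching rows point in similar directions.
aligned = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]])
# Same embeddings with the second view's rows swapped, so positives disagree.
shuffled = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[0.1, 0.9], [0.9, 0.1]])
# When every embedding is identical, the loss equals log(batch size).
uniform = info_nce_loss([[1.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]])
```

The loss is small when each embedding is closest to its own positive and large when it is closer to another image's view, which is the signal that pushes an encoder toward features that are stable under augmentation.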
Papers
April 26, 2022
April 10, 2022
April 1, 2022
January 23, 2022
January 22, 2022
January 4, 2022
December 23, 2021
December 14, 2021