Visual Representation Learning
Visual representation learning aims to create effective numerical representations of images, enabling computers to "understand" and process visual information. Current research focuses heavily on self-supervised methods built on Vision Transformers (ViTs) and convolutional neural networks (CNNs), often combining contrastive learning, masked image modeling, and techniques such as prompt tuning to improve representation quality. These advances drive progress in diverse applications, including image classification, object detection, medical image analysis, and robotic manipulation, by providing more robust and generalizable visual features. A sketch of one such objective follows below.
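To make the contrastive-learning idea concrete, here is a minimal sketch (not taken from any specific paper listed on this page) of an NT-Xent / InfoNCE-style loss as used in SimCLR-like self-supervised training. It assumes PyTorch, and the `encoder` and `augment` names in the usage comment are hypothetical placeholders for any ViT or CNN backbone and augmentation pipeline.

```python
# Minimal NT-Xent (InfoNCE) contrastive loss sketch, assuming PyTorch.
import torch
import torch.nn.functional as F

def nt_xent_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    """Contrastive loss over two augmented views of the same image batch.

    z1, z2: (N, D) embeddings of two augmentations of the same N images.
    """
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, D), unit-norm embeddings
    sim = z @ z.t() / temperature                         # pairwise cosine similarities
    sim.fill_diagonal_(float("-inf"))                     # exclude trivial self-similarity
    # The positive for view i of image k is the other view of the same image k.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# Hypothetical usage with an arbitrary backbone and augmentation pipeline:
# z1 = encoder(augment(images)); z2 = encoder(augment(images))
# loss = nt_xent_loss(z1, z2)
```

The loss pulls the two augmented views of each image together while pushing them away from all other images in the batch, which is the core mechanism by which contrastive methods learn transferable visual features without labels.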