Visual Representation Learning

Visual representation learning aims to turn images into numerical features that computers can use to interpret and process visual information. Current research focuses heavily on self-supervised methods built on Vision Transformers (ViTs) and convolutional neural networks (CNNs), often combining contrastive learning, masked image modeling, and techniques such as prompt tuning to improve representation quality. By providing more robust and generalizable visual features, these advances are driving progress in applications including image classification, object detection, medical image analysis, and robotic manipulation.
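
As a rough illustration of the contrastive learning objective mentioned above, the sketch below computes an InfoNCE-style loss over a small batch of embedding pairs. It is a minimal, standard-library-only example and is not taken from any of the papers listed here. The function names (`l2_normalize`, `info_nce_loss`), the toy embeddings, and the temperature value of 0.1 are all assumptions chosen for illustration. Real systems would compute the embeddings with a ViT or CNN encoder applied to augmented views of the same image.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce_loss(anchors, positives, temperature=0.1):
    """Mean InfoNCE-style loss over a batch of embedding pairs.

    anchors[i] and positives[i] are treated as a positive pair; every other
    entry of `positives` serves as a negative for anchors[i]. The temperature
    default is an illustrative assumption, not a recommended setting.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, a_i in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(a_i, p_j)) / temperature for p_j in p]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching positive as the target class.
        total += log_denom - logits[i]
    return total / len(a)


# Toy embeddings standing in for encoder outputs of two views per image.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]      # each positive is close to its anchor
misaligned = [[0.1, 0.9], [0.9, 0.1]]   # positives swapped between anchors

loss_aligned = info_nce_loss(anchors, aligned)
loss_misaligned = info_nce_loss(anchors, misaligned)
print(loss_aligned < loss_misaligned)
```

The loss is small when each anchor is most similar to its own positive and grows when positives are mismatched, which is the signal a contrastive method uses to pull views of the same image together and push different images apart.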

Papers

September 18, 2023

Contrastive Learning for Enhancing Robust Scene Transfer in Vision-based Agile Flight
Jiaxu Xing, Leonard Bauersfeld, Yunlong Song, Chunwei Xing, Davide Scaramuzza
Contrastive Learning Zero Shot Visual Representation Learning Agile Flight Quadrotor Navigation

September 9, 2023

Visual Material Characteristics Learning for Circular Healthcare
Federico Zocco, Shahin Rahimifard
Vision Based Visual Representation Learning Circular Economy Material Map

August 8, 2023

Class-level Structural Relation Modelling and Smoothing for Visual Representation Learning
Zitan Chen, Zhuang Qi, Xiao Cao, Xiangxian Li, Xiangxu Meng, Lei Meng
Representation Learning Visual Representation Learning Smoothing Factor Relation Modeling

July 20, 2023

Learning Discriminative Visual-Text Representation for Polyp Re-Identification
Suncheng Xiang, Cang Liu, Sijia Du, Dahong Qian
Contrastive Learning Polyp Segmentation Visual Representation Learning Visual Language Video Polyp Segmentation

June 28, 2023

Semantic Positive Pairs for Enhancing Visual Representation Learning of Instance Discrimination methods
Mohammad Alkhalefi, Georgios Leontidis, Mingjun Zhong
Contrastive Learning Self Supervised Representation Learning Visual Representation Learning Instance Discrimination Self Supervised Instance

June 8, 2023

Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models
Tianzhe Chu, Shengbang Tong, Tianjiao Ding, Xili Dai, Benjamin David Haeffele, René Vidal, Yi Ma
Large Pre Trained Model General Principle Speech Based Age Visual Representation Learning ImageNet 1k Image Clustering

June 1, 2023

Affinity-based Attention in Self-supervised Transformers Predicts Dynamics of Object Grouping in Humans
Hossein Adeli, Seoyoung Ahn, Nikolaus Kriegeskorte, Gregory Zelinsky
Human Attention Arbitrary Object Real Human Visual Representation Learning Self Supervised Vision Transformer Self Supervised Transformer Reaction Time Feature Affinity

May 23, 2023

TVTSv2: Learning Out-of-the-box Spatiotemporal Visual Representations at Scale
Ziyun Zeng, Yixiao Ge, Zhan Tong, Xihui Liu, Shu-Tao Xia, Ying Shan
Visual Analogue Scale Video Representation Visual Representation Learning Spatiotemporal Representation Robust Pre Tunable Deep

May 18, 2023

HMSN: Hyperbolic Self-Supervised Learning by Clustering with Ideal Prototypes
Aiden Durrant, Georgios Leontidis
Contrastive Learning Self Supervised Representation Learning Visual Representation Learning Hyperbolic Representation Hyperbolic Learning Simulated OSN Prototype Generation

April 13, 2023

Multi-Mode Online Knowledge Distillation for Self-Supervised Visual Representation Learning
Kaiyou Song, Jin Xie, Shan Zhang, Zimeng Luo
Knowledge Distillation Self Supervised Visual Representation Learning Self Supervised Visual Representation Online Knowledge Distillation

April 6, 2023

Synthetic Hard Negative Samples for Contrastive Learning
Hengkui Dong, Xianzhong Long, Yun Li, Lei Chen
Contrastive Learning Self Supervised Learning Visual Representation Learning Negative Sample

March 15, 2023

From Local Binary Patterns to Pixel Difference Networks for Efficient Visual Representation Learning
Zhuo Su, Matti Pietikäinen, Li Liu
Convolutional Neural Network Representation Learning Deep Model Visual Representation Learning Deep Vision Model Local Binary Pattern Pixel Difference

March 14, 2023

Manipulate by Seeing: Creating Manipulation Controllers from Pre-Trained Representations
Jianren Wang, Sudeep Dasari, Mohan Kumar Srirama, Shubham Tulsiani, Abhinav Gupta
Visual Representation Visual Representation Learning Pre Trained Representation Vision Encoders Robot Demonstration Generic Representation

February 24, 2023

Language-Driven Representation Learning for Robotics
Siddharth Karamcheti, Suraj Nair, Annie S. Chen, Thomas Kollar, Chelsea Finn, Dorsa Sadigh, Percy Liang
Robotics Domain Visual Representation Robot Learning Visual Representation Learning

February 23, 2023

Learning Visual Representations via Language-Guided Sampling
Mohamed El Banani, Karan Desai, Justin Johnson
Contrastive Learning Cross Modal Visual Representation Learning Text Contrastive Learning Language Sampling

January 29, 2023

The Influences of Color and Shape Features in Visual Contrastive Learning
Xiaoqi Zhuang
Contrastive Learning External Influence Visual Representation Learning Color Object Contrastive Representation Shape Feature Supervised Representation

January 28, 2023

A Closer Look at Few-shot Classification Again
Xu Luo, Hao Wu, Ji Zhang, Lianli Gao, Jing Xu, Jingkuan Song
Transfer Learning Pre Trained Model Shot Classification Glance Annotation Visual Representation Learning Training Algorithm

January 26, 2023

Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring
Ruyang Liu, Jingjia Huang, Ge Li, Jiashi Feng, Xinglong Wu, Thomas H. Li
Visual Representation Learning Video Recognition Video Text Retrieval Temporal Modeling CLIP Level Image to Video Transfer Learning

December 30, 2022

Improving Visual Representation Learning through Perceptual Understanding
Samyakh Tukra, Frederick Hoffman, Ken Chatfield
Supervised Autoencoder Masked Autoencoders Visual Representation Learning Perceptual Understanding Scene Level Pixel Reconstruction Multi Scale Training

December 20, 2022

Towards Unsupervised Visual Reasoning: Do Off-The-Shelf Features Know How to Reason?
Monika Wysoczańska, Tom Monnier, Tomasz Trzciński, David Picard
Visual Question Answering Visual Representation Unsupervised Setting Feature Wise Visual Reasoning Visual Representation Learning Reason Giving
