Self-Supervised Representation Learning
Self-supervised representation learning aims to extract meaningful features from unlabeled data, sidestepping supervised learning's dependence on large annotated datasets. Current research centers on novel pretext tasks and architectures, such as contrastive objectives, masked autoencoders, and transformer-based models, that learn robust, generalizable representations across diverse data modalities (images, audio, video, EEG, and more). These methods enable cost-effective model training in fields ranging from agricultural vision and medical image analysis to speech recognition and activity recognition, and the resulting representations often prove more robust to noise and distribution shifts than their supervised counterparts.
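To make the contrastive pretext-task idea concrete, below is a minimal sketch of an InfoNCE/NT-Xent-style loss in PyTorch. It is illustrative only, not the method of any paper listed here; the `encoder`, `augment`, and the temperature value are assumed placeholders.

```python
# Minimal contrastive-learning sketch (InfoNCE/NT-Xent-style), assuming PyTorch.
# `encoder` and `augment` below are hypothetical placeholders, not from the
# papers listed in this section.
import torch
import torch.nn.functional as F

def info_nce_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """z1, z2: (N, D) embeddings of two augmented views of the same N inputs."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    # (N, N) matrix of scaled cosine similarities between the two views.
    logits = z1 @ z2.t() / temperature
    # Each sample's positive pair sits on the diagonal; every other pair in
    # the batch serves as a negative. Symmetrize over both view orderings.
    targets = torch.arange(z1.size(0), device=z1.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

# Usage (assumed training step): embed two random augmentations of the same
# batch and minimize the loss so matching views attract and others repel.
#   z1 = encoder(augment(x))
#   z2 = encoder(augment(x))
#   loss = info_nce_loss(z1, z2)
```

The temperature controls how sharply the loss concentrates on hard negatives; values around 0.05 to 0.5 are common in the contrastive-learning literature.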
Papers
High Fidelity Visualization of What Your Self-Supervised Representation Knows About
Florian Bordes, Randall Balestriero, Pascal Vincent
Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation
Yujia Zhang, Lai-Man Po, Xuyuan Xu, Mengyang Liu, Yexin Wang, Weifeng Ou, Yuzhi Zhao, Wing-Yin Yu