Self-Supervised Learning
Self-supervised learning (SSL) trains machine learning models on unlabeled data by formulating pretext tasks that encourage the model to learn useful representations. Current research focuses on improving SSL's performance and generalization across diverse data types (images, audio, graphs, point clouds) and downstream tasks, employing techniques such as contrastive learning, masked autoencoders, and generative models within architectures such as transformers and convolutional neural networks. These advances matter because they reduce reliance on expensive, time-consuming data labeling, enabling robust models for applications ranging from medical image analysis and speech recognition to geospatial AI and protein function prediction. Efficiency is also a key focus, with research exploring the optimal model and data sizes for a given computational budget.
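To make the contrastive-learning family of pretext tasks mentioned above concrete, here is a minimal sketch of a SimCLR-style NT-Xent loss in PyTorch. The function name, batch size, embedding dimension, and temperature are illustrative assumptions, not details taken from the papers listed below.

```python
# Minimal sketch of a SimCLR-style contrastive (NT-Xent) loss, assuming PyTorch.
# z1 and z2 hold projected embeddings of two augmented views of the same batch;
# the helper name, batch size, and temperature are illustrative assumptions.
import torch
import torch.nn.functional as F

def nt_xent_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    """Each embedding's positive is its counterpart view; all others act as negatives."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2n, d) unit-norm embeddings
    sim = z @ z.t() / temperature                        # pairwise cosine similarities
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))           # exclude self-similarity
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)                 # positive pairs are i <-> i + n

# Usage with random stand-in embeddings (in practice: encoder + projection head outputs)
z1, z2 = torch.randn(8, 128), torch.randn(8, 128)
loss = nt_xent_loss(z1, z2)
```

Minimizing this loss pulls the embeddings of two augmentations of the same input together while pushing apart embeddings of other inputs, which is the basic mechanism by which contrastive SSL learns representations without labels.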
Papers
Max Pooling with Vision Transformers reconciles class and shape in weakly supervised semantic segmentation
Simone Rossetti, Damiano Zappia, Marta Sanzari, Marco Schaerf, Fiora Pirri
DUEL: Adaptive Duplicate Elimination on Working Memory for Self-Supervised Learning
Won-Seok Choi, Dong-Sig Han, Hyundo Lee, Junseok Park, Byoung-Tak Zhang
Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition
Yujin Wang, Changli Tang, Ziyang Ma, Zhisheng Zheng, Xie Chen, Wei-Qiang Zhang
Facial Video-based Remote Physiological Measurement via Self-supervised Learning
Zijie Yue, Miaojing Shi, Shuai Ding
Self-Supervised Training of Speaker Encoder with Multi-Modal Diverse Positive Pairs
Ruijie Tao, Kong Aik Lee, Rohan Kumar Das, Ville Hautamäki, Haizhou Li
Open-vocabulary Semantic Segmentation with Frozen Vision-Language Models
Chaofan Ma, Yuhuan Yang, Yanfeng Wang, Ya Zhang, Weidi Xie
Training Autoregressive Speech Recognition Models with Limited in-domain Supervision
Chak-Fai Li, Francis Keith, William Hartmann, Matthew Snover