Masked Autoencoders
Masked autoencoders (MAEs) are a self-supervised learning technique that learns robust representations by hiding portions of an input, typically patches of an image, and training a model to reconstruct them. Current research focuses on adapting MAEs to other data modalities (point clouds and other 3D data, audio) and to downstream tasks (classification, segmentation, object detection), often building on Vision Transformer backbones and exploring masking strategies beyond random masking to improve efficiency and performance. The resulting pre-trained models are most useful when labeled data is limited, and they have been applied in fields such as Earth observation, medical image analysis, and robotics, with reported gains in accuracy and reductions in computational cost.
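To make the masking-and-reconstruction idea concrete, here is a minimal NumPy sketch of random patch masking and a reconstruction loss restricted to the hidden pixels. It is an illustration under stated assumptions, not the method of any paper listed here: the function names `random_patch_mask` and `masked_mse`, the patch size, and the 0.75 mask ratio are all chosen for the example, and the encoder and decoder are left out entirely.

```python
import numpy as np

def random_patch_mask(image, patch_size=4, mask_ratio=0.75, seed=0):
    """Hide a random subset of non-overlapping patches (hypothetical helper).

    Returns the masked image and a boolean pixel mask (True = hidden).
    """
    h, w = image.shape[:2]
    if h % patch_size or w % patch_size:
        raise ValueError("image sides must be multiples of patch_size")
    grid_h, grid_w = h // patch_size, w // patch_size
    num_patches = grid_h * grid_w
    num_masked = int(round(mask_ratio * num_patches))

    # Pick which patches to hide by shuffling patch indices.
    rng = np.random.default_rng(seed)
    hidden = rng.permutation(num_patches)[:num_masked]
    patch_mask = np.zeros(num_patches, dtype=bool)
    patch_mask[hidden] = True

    # Expand the per-patch mask to a per-pixel mask.
    pixel_mask = (
        patch_mask.reshape(grid_h, grid_w)
        .repeat(patch_size, axis=0)
        .repeat(patch_size, axis=1)
    )

    masked = image.copy()
    masked[pixel_mask] = 0  # assumption: hidden pixels are simply zeroed
    return masked, pixel_mask

def masked_mse(reconstruction, target, pixel_mask):
    """Mean squared error over hidden pixels only (hypothetical helper)."""
    diff = (reconstruction - target) ** 2
    return float(diff[pixel_mask].mean())

if __name__ == "__main__":
    image = np.arange(64, dtype=float).reshape(8, 8) + 1.0
    masked, pixel_mask = random_patch_mask(image)
    # Using the masked image itself as a stand-in "reconstruction".
    print("hidden pixels:", int(pixel_mask.sum()))
    print("loss of trivial reconstruction:", masked_mse(masked, image, pixel_mask))
```

In a real pipeline the zero-filled stand-in would be replaced by the output of a decoder, and the masking strategy (random here) is exactly the component that work on alternative masking schemes changes.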
Papers
Masked Autoencoders for Low dose CT denoising
Dayang Wang, Yongshun Xu, Shuo Han, Hengyong Yu
Revisiting adapters with adversarial training
Sylvestre-Alvise Rebuffi, Francesco Croce, Sven Gowal
Denoising Masked AutoEncoders Help Robust Classification
Quanlin Wu, Hang Ye, Yuntian Gu, Huishuai Zhang, Liwei Wang, Di He