Masked Autoencoder

Masked autoencoders (MAEs) are a self-supervised learning technique in which a model is trained to reconstruct masked portions of its input, learning robust and generalizable representations without labeled data. Current research extends MAEs beyond images to other modalities such as video, point clouds, and text, often combining them with contrastive learning or geometrically informed masking strategies to improve efficiency and performance. The resulting pre-trained models support downstream tasks in fields including 3D scene generation, gaze estimation, anomaly detection, and autonomous driving.
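The mask-and-reconstruct idea can be illustrated with a minimal NumPy sketch. It is not any particular paper's implementation: the helper names `random_mask` and `masked_mse`, the patch sizes, and the 0.75 mask ratio are all assumptions chosen for illustration, and an all-zero array stands in for the output of a trained decoder. Scoring the reconstruction only on the hidden patches is one common choice, assumed here.

```python
import numpy as np

def random_mask(num_patches, mask_ratio, rng):
    """Return a boolean array where True marks a masked (hidden) patch."""
    num_masked = int(round(num_patches * mask_ratio))
    mask = np.zeros(num_patches, dtype=bool)
    masked_idx = rng.choice(num_patches, size=num_masked, replace=False)
    mask[masked_idx] = True
    return mask

def masked_mse(target, prediction, mask):
    """Mean squared error averaged over the masked patches only."""
    per_patch = ((prediction - target) ** 2).mean(axis=-1)
    return per_patch[mask].mean()

rng = np.random.default_rng(0)

# Hypothetical input: 16 patches with 48 values each (illustrative sizes).
patches = rng.normal(size=(16, 48))

# Hide 75% of the patches (an assumed ratio); only the rest reach the encoder.
mask = random_mask(len(patches), mask_ratio=0.75, rng=rng)
visible = patches[~mask]

# Placeholder for a decoder's reconstruction; no model is trained here.
prediction = np.zeros_like(patches)

# The training signal: reconstruction error on the patches the model never saw.
loss = masked_mse(patches, prediction, mask)
print(f"masked patches: {mask.sum()}, visible patches: {len(visible)}")
```

In a real setup, `prediction` would come from an encoder that sees only `visible` and a decoder that fills in the masked positions; minimizing `loss` is what pushes the encoder toward useful representations without labels.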

Papers