Monocular Depth
Monocular depth estimation aims to recover per-pixel scene depth from a single two-dimensional image. The problem is ill-posed: many different 3D scenes project to the same image, so absolute scale in particular cannot be determined from image geometry alone. Current research focuses on improving accuracy and robustness, particularly in dynamic scenes and under challenging conditions. It draws on deep learning architectures including convolutional neural networks, transformers, and diffusion models, often trained with self-supervision and multi-view consistency constraints. These advances matter for robotics, autonomous driving, augmented reality, and 3D scene reconstruction, where depth perception needs to be both efficient and reliable.
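One of the ambiguities that makes single-image depth hard is scale. The sketch below illustrates it with a pinhole camera model: a scene and a copy of it that is three times larger and three times farther away project to the same pixels. It is not taken from any of the papers listed here; the function name `project`, the intrinsics (`focal`, `cx`, `cy`), and the sample points are assumptions chosen for illustration.

```python
# Minimal sketch of scale ambiguity under an assumed pinhole camera model.
# All names and numeric values here are illustrative assumptions.

def project(point, focal=500.0, cx=320.0, cy=240.0):
    """Project a camera-frame 3D point (x, y, z) to pixel coordinates (u, v)."""
    x, y, z = point
    return (focal * x / z + cx, focal * y / z + cy)

# A small scene of 3D points in front of the camera (z > 0).
scene = [(0.5, -0.2, 2.0), (-1.0, 0.3, 4.0), (0.1, 0.8, 3.0)]

# The same scene scaled by 3: every depth value changes.
scaled_scene = [(3.0 * x, 3.0 * y, 3.0 * z) for x, y, z in scene]

pixels = [project(p) for p in scene]
scaled_pixels = [project(p) for p in scaled_scene]

# Both scenes land on the same pixels (up to floating-point rounding),
# so the image alone cannot tell the two depth maps apart.
same_image = all(
    abs(u1 - u2) < 1e-9 and abs(v1 - v2) < 1e-9
    for (u1, v1), (u2, v2) in zip(pixels, scaled_pixels)
)
print(same_image)
```

Because the projection divides by depth, any common scale factor cancels. This is why methods in this area typically rely on extra cues such as learned priors, multiple views, or side information to pin down metric scale.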
Papers
Self-supervised Monocular Depth and Pose Estimation for Endoscopy with Generative Latent Priors
Ziang Xu, Bin Li, Yang Hu, Chenyu Zhang, James East, Sharib Ali, Jens Rittscher
DepthCues: Evaluating Monocular Depth Perception in Large Vision Models
Duolikun Danier, Mehmet Aygün, Changjian Li, Hakan Bilen, Oisin Mac Aodha
DecTrain: Deciding When to Train a DNN Online
Zih-Sing Fu, Soumya Sudhakar, Sertac Karaman, Vivienne Sze
RSA: Resolving Scale Ambiguities in Monocular Depth Estimators through Language Descriptions
Ziyao Zeng, Yangchao Wu, Hyoungseob Park, Daniel Wang, Fengyu Yang, Stefano Soatto, Dong Lao, Byung-Woo Hong, Alex Wong