Monocular Vision
Monocular vision research focuses on recovering three-dimensional structure and scene understanding from single-camera images, where depth is inherently ambiguous. Current work concentrates on robust, efficient algorithms for depth estimation, pose estimation, and scene reconstruction, often built on convolutional neural networks, transformers, photometric SLAM, and 3D Gaussian splatting. These advances matter for autonomous driving, robotics, augmented reality, and medical applications, where they enable more capable and cost-effective visual perception systems.
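The depth ambiguity mentioned above can be illustrated with a minimal sketch of an idealized pinhole camera. The function name `project` and the intrinsics (`focal`, `cx`, `cy`) are hypothetical values chosen for illustration, not taken from any of the listed papers. Two points on the same viewing ray, at different depths, land on the same pixel, so a single image cannot by itself tell them apart.

```python
import math

def project(point, focal=500.0, cx=320.0, cy=240.0):
    """Project a 3D point (camera frame, z > 0) to pixel coordinates.

    Idealized pinhole model; the intrinsics are assumed values.
    """
    x, y, z = point
    return (focal * x / z + cx, focal * y / z + cy)

# A point one unit in front of the camera...
near = (0.2, -0.1, 1.0)
# ...and a point on the same ray, five times farther away.
far = tuple(5.0 * c for c in near)

pixel_near = project(near)
pixel_far = project(far)

# Both points map to the same pixel: depth is lost in the projection.
same_pixel = all(math.isclose(a, b) for a, b in zip(pixel_near, pixel_far))
print(pixel_near, pixel_far, same_pixel)
```

Monocular methods have to resolve this ambiguity with extra cues, such as learned priors, camera motion, or known object sizes.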
Papers
Skip Mamba Diffusion for Monocular 3D Semantic Scene Completion
Li Liang, Naveed Akhtar, Jordan Vice, Xiangrui Kong, Ajmal Saeed Mian
RMAvatar: Photorealistic Human Avatar Reconstruction from Monocular Video Based on Rectified Mesh-embedded Gaussians
Sen Peng, Weixing Xie, Zilong Wang, Xiaohu Guo, Zhonggui Chen, Baorong Yang, Xiao Dong
Monocular Event-Based Vision for Obstacle Avoidance with a Quadrotor
Anish Bhattacharya, Marco Cannici, Nishanth Rao, Yuezhan Tao, Vijay Kumar, Nikolai Matni, Davide Scaramuzza
Efficient Feature Aggregation and Scale-Aware Regression for Monocular 3D Object Detection
Yifan Wang, Xiaochen Yang, Fanqi Pu, Qingmin Liao, Wenming Yang