Monocular Input
Monocular input refers to inferring 3D information from a single camera. It is a core challenge in computer vision because a single view provides no direct depth measurement, so complex scenes and objects must be reconstructed from limited visual data. Current research focuses on developing robust deep learning models, including transformers and convolutional neural networks, and often incorporates additional sensor data, such as inertial measurement units (IMUs), to improve accuracy and real-time performance. These advances enable applications ranging from human avatar animation and hand tracking to 3D scene reconstruction for robotics and medical imaging (e.g., myopia screening), which illustrates how widely monocular vision is used across fields.
Papers
May 30, 2024
April 1, 2024
March 18, 2024
August 10, 2023
February 28, 2023
August 18, 2022