Monocular Capture

Monocular capture aims to reconstruct 3D scenes or human motion from the input of a single camera, offering a cost-effective alternative to multi-camera systems. Current research focuses on improving the accuracy and realism of these reconstructions, particularly in complex scenarios such as human-object interaction and dynamic, non-rigid motion. Methods often employ generative neural networks such as StyleGAN2 and diffusion models, and some supplement the camera with inertial measurement units (IMUs) to overcome the limitations of monocular vision. These advances matter for applications ranging from virtual avatar creation and animation to human-computer interaction and motion capture in fields such as sports and healthcare.

Papers