Monocular Video
Monocular video analysis reconstructs 3D scenes and objects, including humans, from single-camera footage, where the central challenge is resolving the depth ambiguity inherent in a single viewpoint. Current research relies heavily on neural radiance fields (NeRFs) and Gaussian splatting, often combined with kinematic models and physics-based constraints to improve accuracy and realism, particularly for dynamic scenes and human motion capture. These advances have significant implications for virtual reality, animation, and robotics, enabling more efficient and realistic 3D content creation and scene understanding.
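As a rough illustration of the volumetric rendering step shared by NeRF-style methods, the sketch below composites density and color samples along a single camera ray into a pixel color. This is a minimal, generic textbook formulation, not code from any of the papers listed here, and the function and variable names are illustrative.

    import numpy as np

    def render_ray(densities, colors, deltas):
        """Composite per-sample (density, color) pairs into one pixel color.

        densities: (N,) non-negative volume densities sigma_i along the ray
        colors:    (N, 3) RGB values c_i at each sample
        deltas:    (N,) distances between adjacent samples
        """
        # Per-sample opacity: alpha_i = 1 - exp(-sigma_i * delta_i)
        alphas = 1.0 - np.exp(-densities * deltas)
        # Transmittance T_i: probability the ray reaches sample i unoccluded,
        # i.e. the running product of (1 - alpha_j) over all earlier samples.
        trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
        # Each sample contributes its color weighted by T_i * alpha_i.
        weights = trans * alphas
        return (weights[:, None] * colors).sum(axis=0)

    # Toy usage: 64 samples on a ray passing through a soft red blob at depth 2.0.
    t = np.linspace(0.5, 4.0, 64)
    densities = 5.0 * np.exp(-((t - 2.0) ** 2) / 0.1)
    colors = np.tile([0.8, 0.3, 0.2], (64, 1))
    deltas = np.full(64, t[1] - t[0])
    print(render_ray(densities, colors, deltas))

Dynamic-scene methods such as those below extend this basic rule, for example by warping samples through a deformation field or time-conditioned network before querying density and color.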
Papers
Diffusion Priors for Dynamic View Synthesis from Monocular Videos
Chaoyang Wang, Peiye Zhuang, Aliaksandr Siarohin, Junli Cao, Guocheng Qian, Hsin-Ying Lee, Sergey Tulyakov
CTNeRF: Cross-Time Transformer for Dynamic Neural Radiance Field from Monocular Video
Xingyu Miao, Yang Bai, Haoran Duan, Yawen Huang, Fan Wan, Yang Long, Yefeng Zheng
IntrinsicAvatar: Physically Based Inverse Rendering of Dynamic Humans from Monocular Videos via Explicit Ray Tracing
Shaofei Wang, Božidar Antić, Andreas Geiger, Siyu Tang
Reality's Canvas, Language's Brush: Crafting 3D Avatars from Monocular Video
Yuchen Rao, Eduardo Perez Pellitero, Benjamin Busam, Yiren Zhou, Jifei Song