Monocular Video
Monocular video analysis reconstructs 3D scenes and objects, including humans, from footage captured by a single camera, where the central challenge is the depth ambiguity inherent in a single viewpoint. Current research relies heavily on neural radiance fields (NeRFs) and Gaussian splatting, often incorporating kinematic models and physics-based constraints to improve accuracy and realism, particularly for dynamic scenes and human motion capture. These advances have significant implications for fields like virtual reality, animation, and robotics, enabling more efficient and realistic 3D content creation and scene understanding.
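To make the NeRF side of this concrete, the core operation shared by these radiance-field methods is volume rendering: compositing predicted densities and colors along a camera ray into a pixel color. The sketch below is a minimal, illustrative NumPy version of that compositing step (the function name and inputs are assumptions for illustration, not taken from any of the listed papers):

```python
import numpy as np

def composite_ray(densities, colors, deltas):
    """Illustrative NeRF-style volume rendering along a single ray.

    densities: (N,) non-negative volume densities sigma_i at N ray samples
    colors:    (N, 3) RGB color predicted at each sample
    deltas:    (N,) spacing between consecutive samples along the ray
    Returns the composited (3,) RGB color for the ray.
    """
    # Per-sample opacity: alpha_i = 1 - exp(-sigma_i * delta_i)
    alphas = 1.0 - np.exp(-densities * deltas)
    # Transmittance T_i: fraction of light reaching sample i unoccluded
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    # Compositing weights, then weighted sum of sample colors
    weights = trans * alphas
    return (weights[:, None] * colors).sum(axis=0)

# Sanity check: a single, effectively opaque red sample yields a red pixel
color = composite_ray(np.array([1e3]),
                      np.array([[1.0, 0.0, 0.0]]),
                      np.array([1.0]))
```

In a full pipeline, an MLP or a set of 3D Gaussians supplies `densities` and `colors` per sample, and the rendered pixels are compared against the input video frames to optimize the scene representation.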
Papers
Robust Frame-to-Frame Camera Rotation Estimation in Crowded Scenes
Fabien Delattre, David Dirnfeld, Phat Nguyen, Stephen Scarano, Michael J. Jones, Pedro Miraldo, Erik Learned-Miller
Towards Robust and Smooth 3D Multi-Person Pose Estimation from Monocular Videos in the Wild
Sungchan Park, Eunyi You, Inhoe Lee, Joonseok Lee
Root Pose Decomposition Towards Generic Non-rigid 3D Reconstruction with Monocular Videos
Yikai Wang, Yinpeng Dong, Fuchun Sun, Xiao Yang
AltNeRF: Learning Robust Neural Radiance Field via Alternating Depth-Pose Optimization
Kun Wang, Zhiqiang Yan, Huang Tian, Zhenyu Zhang, Xiang Li, Jun Li, Jian Yang
Semantic-Human: Neural Rendering of Humans from Monocular Video with Human Parsing
Jie Zhang, Pengcheng Shi, Zaiwang Gu, Yiyang Zhou, Zhi Wang