Monocular Video
Monocular video analysis focuses on reconstructing 3D scenes and objects, including humans, from single-camera video footage, aiming to overcome the inherent ambiguities of depth perception. Current research heavily utilizes neural radiance fields (NeRFs) and Gaussian splatting, often incorporating kinematic models and physics-based constraints to improve accuracy and realism, particularly for dynamic scenes and human motion capture. These advancements have significant implications for fields like virtual reality, animation, and robotics, enabling more efficient and realistic 3D content creation and scene understanding.
Papers
VideoLifter: Lifting Videos to 3D with Fast Hierarchical Stereo Alignment
Wenyan Cong, Kevin Wang, Jiahui Lei, Colton Stearns, Yuanhao Cai, Dilin Wang, Rakesh Ranjan, Matt Feiszli, Leonidas Guibas, Zhangyang Wang, Weiyao Wang, Zhiwen Fan
AR4D: Autoregressive 4D Generation from Monocular Videos
Hanxin Zhu, Tianyu He, Xiqian Yu, Junliang Guo, Zhibo Chen, Jiang Bian
D$^3$-Human: Dynamic Disentangled Digital Human from Monocular Video
Honghu Chen, Bo Peng, Yunfan Tao, Juyong Zhang
GAF: Gaussian Avatar Reconstruction from Monocular Videos via Multi-view Diffusion
Jiapeng Tang, Davide Davoli, Tobias Kirschstein, Liam Schoneveld, Matthias Niessner
SplineGS: Robust Motion-Adaptive Spline for Real-Time Dynamic 3D Gaussians from Monocular Video
Jongmin Park, Minh-Quan Viet Bui, Juan Luis Gonzalez Bello, Jaeho Moon, Jihyong Oh, Munchurl Kim
MegaSaM: Accurate, Fast, and Robust Structure and Motion from Casual Dynamic Videos
Zhengqi Li, Richard Tucker, Forrester Cole, Qianqian Wang, Linyi Jin, Vickie Ye, Angjoo Kanazawa, Aleksander Holynski, Noah Snavely
MVUDA: Unsupervised Domain Adaptation for Multi-view Pedestrian Detection
Erik Brorsson, Lennart Svensson, Kristofer Bengtsson, Knut Åkesson
CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models
Rundi Wu, Ruiqi Gao, Ben Poole, Alex Trevithick, Changxi Zheng, Jonathan T. Barron, Aleksander Holynski
HI-SLAM2: Geometry-Aware Gaussian SLAM for Fast Monocular Scene Reconstruction
Wei Zhang, Qing Cheng, David Skuddis, Niclas Zeller, Daniel Cremers, Norbert Haala