Paper ID: 2412.03526 • Published Dec 4, 2024
Feed-Forward Bullet-Time Reconstruction of Dynamic Scenes from Monocular Videos
Hanxue Liang, Jiawei Ren, Ashkan Mirzaei, Antonio Torralba, Ziwei Liu, Igor Gilitschenski, Sanja Fidler, Cengiz Oztireli, Huan...
Recent static feed-forward scene reconstruction models have made
significant progress in high-quality novel view synthesis.
However, these models often struggle with generalizability across diverse
environments and fail to effectively handle dynamic content. We present BTimer
(short for BulletTimer), the first motion-aware feed-forward model for
real-time reconstruction and novel view synthesis of dynamic scenes. Our
approach reconstructs the full scene in a 3D Gaussian Splatting representation
at a given target ('bullet') timestamp by aggregating information from all the
context frames. This formulation lets BTimer scale and generalize by
leveraging both static and dynamic scene datasets. Given a
casual monocular dynamic video, BTimer reconstructs a bullet-time scene within
150 ms while reaching state-of-the-art performance on both static and dynamic
scene datasets, even compared with optimization-based approaches.
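
The bullet-timestamp formulation can be illustrated with a small sketch of
how inputs might be assembled. This is not the authors' code: the function
names, the dictionary fields, and in particular the idea of encoding a static
capture as a video whose timestamps all coincide are assumptions made here for
illustration. The abstract only states that all context frames are aggregated
for a given target timestamp and that static and dynamic datasets are both
used.

```python
from typing import Dict, List, Sequence


def make_bullet_inputs(frame_times: Sequence[float],
                       bullet_time: float) -> List[Dict[str, float]]:
    """Pair every context frame with its own time and the shared bullet time.

    Hypothetical helper: a feed-forward model would consume these records
    (together with the images) and predict the scene at `bullet_time`.
    """
    if len(frame_times) == 0:
        raise ValueError("at least one context frame is required")
    return [
        {"frame": i, "context_time": float(t), "bullet_time": float(bullet_time)}
        for i, t in enumerate(frame_times)
    ]


def static_scene_inputs(num_frames: int) -> List[Dict[str, float]]:
    """Assumption: treat a static capture as a video where time never advances,
    so every context time equals the bullet time."""
    return make_bullet_inputs([0.0] * num_frames, bullet_time=0.0)


# Dynamic video: three frames, scene requested at the middle timestamp.
dynamic = make_bullet_inputs([0.0, 0.5, 1.0], bullet_time=0.5)

# Static capture: four views, all sharing one timestamp.
static = static_scene_inputs(4)
```

Under this assumption, both kinds of data share one input format, and
sweeping `bullet_time` over a video would yield one reconstruction request
per timestamp.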