Multi-View Diffusion
Multi-view diffusion models advance 3D content generation by synthesizing multiple geometrically consistent views of an object from limited input (e.g., a single image or a text prompt), enabling subsequent high-fidelity 3D reconstruction. Current research emphasizes improving the quality, cross-view consistency, and efficiency of these models, often employing transformer-based architectures and sometimes coupling them with Gaussian splatting for efficient 3D representation. This work has significant implications for computer graphics, virtual and augmented reality, and digital asset creation, offering faster and more realistic 3D content generation from diverse input sources.
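To make the core mechanism concrete, below is a minimal, hypothetical PyTorch sketch of joint multi-view denoising. All class names, shapes, and hyperparameters are illustrative assumptions, not the method of any paper listed here: the key idea shown is that several views are denoised together, with a cross-view attention layer letting every view attend to every other view, which is the common device for enforcing cross-view consistency.

```python
# Hypothetical sketch: joint denoising of V views with cross-view attention.
# Names, shapes, and hyperparameters are illustrative, not any paper's API.
import torch
import torch.nn as nn

class CrossViewAttention(nn.Module):
    """Self-attention over the concatenated tokens of all views."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, views, tokens, dim) -> merge views into one sequence
        b, v, t, d = x.shape
        seq = x.reshape(b, v * t, d)
        out, _ = self.attn(seq, seq, seq)  # every view attends to every view
        return out.reshape(b, v, t, d)

class ToyMultiViewDenoiser(nn.Module):
    """Predicts per-view noise; cross-view attention couples the views."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.proj_in = nn.Linear(dim, dim)
        self.cross_view = CrossViewAttention(dim)
        self.proj_out = nn.Linear(dim, dim)

    def forward(self, x_t: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # x_t: noisy latents (batch, views, tokens, dim); t: (batch,) timesteps
        h = self.proj_in(x_t) + t.view(-1, 1, 1, 1).float() * 1e-3
        h = h + self.cross_view(h)  # residual cross-view mixing
        return self.proj_out(h)     # predicted noise, same shape as x_t

@torch.no_grad()
def sample(model, views=4, tokens=16, dim=64, steps=50):
    """Plain DDPM-style ancestral sampling over all views at once."""
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)
    x = torch.randn(1, views, tokens, dim)  # start all views from pure noise
    for i in reversed(range(steps)):
        t = torch.full((1,), i)
        eps = model(x, t)
        # posterior mean of x_{t-1} given the predicted noise eps
        coef = betas[i] / torch.sqrt(1.0 - alpha_bar[i])
        mean = (x - coef * eps) / torch.sqrt(alphas[i])
        noise = torch.randn_like(x) if i > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[i]) * noise
    return x  # denoised multi-view latents, ready for 3D reconstruction

if __name__ == "__main__":
    latents = sample(ToyMultiViewDenoiser())
    print(latents.shape)  # torch.Size([1, 4, 16, 64])
```

Because all views share one denoising trajectory and attend to each other at every step, inconsistencies between views are suppressed during sampling rather than patched afterward; a downstream reconstructor (e.g., one based on Gaussian splatting) then lifts the consistent views into 3D.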
Papers
Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning
Desai Xie, Jiahao Li, Hao Tan, Xin Sun, Zhixin Shu, Yi Zhou, Sai Bi, Sören Pirk, Arie E. Kaufman
Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models
Huan Ling, Seung Wook Kim, Antonio Torralba, Sanja Fidler, Karsten Kreis