3D Awareness
3D awareness in computer vision aims to give computational models the ability to understand and represent the three-dimensional structure of scenes and objects from 2D inputs such as images and videos. Current research focuses on integrating 3D information into existing 2D models, using depth estimation, multi-view geometry, and 3D-aware generative models and neural scene representations (e.g., GANs, diffusion models, NeRFs) to improve performance on tasks such as object recognition, scene understanding, and human motion synthesis. This 3D understanding matters for applications including robotics, augmented reality, medical image analysis, and drug discovery, where it enables more robust and accurate interpretation of, and interaction with, 3D structure.
Papers
ImageNet3D: Towards General-Purpose Object-Level 3D Understanding
Wufei Ma, Guanning Zeng, Guofeng Zhang, Qihao Liu, Letian Zhang, Adam Kortylewski, Yaoyao Liu, Alan Yuille
ConsistDreamer: 3D-Consistent 2D Diffusion for High-Fidelity Scene Editing
Jun-Kun Chen, Samuel Rota Bulò, Norman Müller, Lorenzo Porzi, Peter Kontschieder, Yu-Xiong Wang
Instruct 4D-to-4D: Editing 4D Scenes as Pseudo-3D Scenes Using 2D Diffusion
Linzhan Mou, Jun-Kun Chen, Yu-Xiong Wang
Structure-based Drug Design Benchmark: Do 3D Methods Really Dominate?
Kangyu Zheng, Yingzhou Lu, Zaixi Zhang, Zhongwei Wan, Yao Ma, Marinka Zitnik, Tianfan Fu
OpenGaussian: Towards Point-Level 3D Gaussian-based Open Vocabulary Understanding
Yanmin Wu, Jiarui Meng, Haijie Li, Chenming Wu, Yahao Shi, Xinhua Cheng, Chen Zhao, Haocheng Feng, Errui Ding, Jingdong Wang, Jian Zhang