3D Awareness

3D awareness in computer vision aims to give computational models the ability to understand and represent the three-dimensional structure of scenes and objects from 2D inputs such as images and videos. Current research focuses on integrating 3D information into existing 2D models, using depth estimation, multi-view geometry, and 3D-aware generative models and neural scene representations (e.g., GANs, diffusion models, NeRFs) to improve performance on tasks such as object recognition, scene understanding, and human motion synthesis. More robust and accurate scene interpretation and interaction, in turn, benefits applications including robotics, augmented reality, medical image analysis, and drug discovery.
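
As a rough illustration of how depth estimation connects 2D inputs to 3D structure, the sketch below back-projects a depth map into a point cloud under a standard pinhole camera model. It is not taken from any of the papers listed here; the function name `backproject_depth`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the constant toy depth map are all assumptions chosen for illustration.

```python
import numpy as np


def backproject_depth(depth, fx, fy, cx, cy):
    """Lift an (H, W) depth map to an (H*W, 3) point cloud in camera coordinates.

    Assumes an ideal pinhole camera with no lens distortion (hypothetical setup).
    """
    h, w = depth.shape
    # u holds the column index and v the row index of every pixel.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Invert the pinhole projection u = fx * X / Z + cx, v = fy * Y / Z + cy.
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)


# Toy input: a flat surface 2 units in front of the camera (made-up values).
depth = np.full((4, 6), 2.0)
points = backproject_depth(depth, fx=100.0, fy=100.0, cx=3.0, cy=2.0)
```

With real data, `depth` would typically come from a depth sensor or a monocular depth estimator, and the intrinsics from camera calibration.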

Papers

May 16, 2024

When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models
Xianzheng Ma, Yash Bhalgat, Brandon Smart, Shuai Chen, Xinghui Li, Jian Ding, Jindong Gu, Dave Zhenyu Chen, Songyou Peng, Jia-Wang Bian, Philip H Torr, Marc Pollefeys, Matthias Nießner, Ian D Reid, Angel X. Chang, Iro Laina, Victor Adrian Prisacariu
Large Language Model Multi Modal Large Language Model 3D Scene Understanding Spatial Reasoning Meta Analysis 3D Data Spatial Understanding 3D Space 3D Awareness

April 12, 2024

Probing the 3D Awareness of Visual Foundation Models
Mohamed El Banani, Amit Raj, Kevis-Kokitsi Maninis, Abhishek Kar, Yuanzhen Li, Michael Rubinstein, Deqing Sun, Leonidas Guibas, Justin Johnson, Varun Jampani
Intermediate Representation Visual Task Visual Foundation Model 3D Structure 3D Awareness

March 17, 2024

Omni-Recon: Harnessing Image-based Rendering for General-Purpose Neural Radiance Fields
Yonggan Fu, Huaizhi Qu, Zhifan Ye, Chaojian Li, Kevin Zhao, Yingyan Celine Lin
Neural Radiance Field Surface Reconstruction Object Centric 3D Awareness Omni Recon Generalizable 3D

January 13, 2024

Triamese-ViT: A 3D-Aware Method for Robust Brain Age Estimation from MRIs
Zhaonian Zhang, Richard Jiang
Vision Transformer Brain Age ViT Lens 3D Awareness MRI Scan Brain Aging

October 17, 2023

LiDAR-based 4D Occupancy Completion and Forecasting
Xinhao Liu, Moonjun Gong, Qi Fang, Haoyu Xie, Yiming Li, Hang Zhao, Chen Feng
Autonomous Driving Autonomous Vehicle State of the Art Forecasting 3D Awareness LiDAR Perception Task

October 12, 2023

NSM4D: Neural Scene Model Based Online 4D Point Cloud Sequence Understanding
Yuhao Dong, Zhuoyang Zhang, Yunze Liu, Li Yi
Point Cloud Sequence 3D Awareness 4 Dimensional Point Cloud 3D Backbone

October 5, 2023

3D-Aware Hypothesis & Verification for Generalizable Relative Object Pose Estimation
Chen Zhao, Tong Zhang, Mathieu Salzmann
Verification Task Relative Pose Robust Generalization 3D Shape Representation 3D Awareness Pose Hypothesis

October 2, 2023

3DHR-Co: A Collaborative Test-time Refinement Framework for In-the-Wild 3D Human-Body Reconstruction Task
Jonathan Samuel Lumentut, Kyoung Mu Lee
Test Time 3D Human Reconstruction Full Body 3D Awareness 3D Backbone

September 27, 2023

OrthoPlanes: A Novel Representation for Better 3D-Awareness of GANs
Honglin He, Zhuoqian Yang, Shikai Li, Bo Dai, Wayne Wu
GAN Model StyleGAN Latent Implicit Representation 3D Awareness Novel Representation Fine Grained 3D

March 27, 2023

3D-Aware Multi-Class Image-to-Image Translation with NeRFs
Senmao Li, Joost van de Weijer, Yaxing Wang, Fahad Shahbaz Khan, Meiqin Liu, Jian Yang
Generative Model GAN Model Image to Image 3D Awareness 3D Aware GAN

March 14, 2023

Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation
Junyoung Seo, Wooseok Jang, Min-Seop Kwak, Hyeonsu Kim, Jaehoon Ko, Junho Kim, Jin-Hwa Kim, Jiyoung Lee, Seungryong Kim
Neural Radiance Field Text to 3D Generation Score Distillation Text to Image Diffusion 3D Awareness

March 2, 2023

Grid-Centric Traffic Scenario Perception for Autonomous Driving: A Comprehensive Review
Yining Shi, Kun Jiang, Jiusi Li, Zelin Qian, Junze Wen, Mengmeng Yang, Ke Wang, Diange Yang
Autonomous Driving Autonomous Vehicle Comprehensive Review Robot Perception Object Centric Traffic Scene 3D Awareness

November 29, 2022

DATID-3D: Diversity-Preserved Domain Adaptation Using Text-to-Image Diffusion for 3D Generative Model
Gwanghyun Kim, Se Young Chun
Generative Model 3D Generative Text to Image Diffusion 3D Awareness Text Only Domain Adaptation