3D Scene Understanding
3D scene understanding aims to enable computers to perceive and interpret three-dimensional environments, with applications in robotics, autonomous driving, and virtual reality. Current research focuses on developing robust and efficient models, often built on neural radiance fields, large language models (LLMs), and transformer architectures, for tasks such as semantic segmentation, instance segmentation, and object pose estimation. This work is driven by the need for more accurate and reliable scene representations, and it addresses challenges such as data scarcity, class imbalance, and generalization across diverse scenes. Improvements in 3D scene understanding support more sophisticated interactions between humans and machines in complex environments.
Papers
Holistic Understanding of 3D Scenes as Universal Scene Description
Anna-Maria Halacheva, Yang Miao, Jan-Nico Zaech, Xi Wang, Luc Van Gool, Danda Pani Paudel
LSceneLLM: Enhancing Large 3D Scene Understanding Using Adaptive Visual Preferences
Hongyan Zhi, Peihao Chen, Junyan Li, Shuailei Ma, Xinyu Sun, Tianhang Xiang, Yinjie Lei, Mingkui Tan, Chuang Gan