3D Scene Understanding
3D scene understanding aims to enable computers to perceive and interpret three-dimensional environments, supporting applications in robotics, autonomous driving, and virtual reality. Current research focuses on developing robust and efficient models, often built on neural radiance fields, large language models (LLMs), and transformer architectures, for tasks such as semantic segmentation, instance segmentation, and object pose estimation. These advances are driven by the need for more accurate and reliable scene representations, and they address challenges such as data scarcity, class imbalance, and generalization across diverse scenes. The resulting improvements in 3D scene understanding enable more sophisticated interaction between humans and machines in complex environments.
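To make the semantic segmentation task mentioned above concrete: the goal is to assign a class label to every point in a 3D scene. The sketch below is a deliberately minimal, hypothetical stand-in (nearest-neighbor label transfer from a few annotated seed points, on synthetic data) rather than any of the learned methods from the papers listed here; all names and data in it are illustrative assumptions.

```python
import numpy as np

def segment_by_nearest_seed(points, seed_points, seed_labels):
    """Assign each scene point the label of its nearest annotated seed point.

    A toy baseline for per-point semantic segmentation; real systems replace
    this with learned per-point classifiers over point clouds or radiance fields.
    """
    # Pairwise squared distances between scene points (N, 3) and seeds (M, 3).
    d2 = ((points[:, None, :] - seed_points[None, :, :]) ** 2).sum(axis=-1)
    return seed_labels[d2.argmin(axis=1)]

# Synthetic scene: two well-separated clusters standing in for
# "floor" (label 0) and "chair" (label 1).
rng = np.random.default_rng(0)
floor = rng.normal(loc=[0.0, 0.0, 0.0], scale=0.1, size=(50, 3))
chair = rng.normal(loc=[2.0, 0.0, 0.5], scale=0.1, size=(50, 3))
scene = np.vstack([floor, chair])

seeds = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.5]])
labels = segment_by_nearest_seed(scene, seeds, np.array([0, 1]))
print(labels[:50].sum(), labels[50:].sum())  # floor points -> 0, chair points -> 1
```

The learned methods surveyed above differ mainly in how they produce per-point features before this labeling step; the input/output shape of the task (points in, one label per point out) is the same.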
Papers
NIS-SLAM: Neural Implicit Semantic RGB-D SLAM for 3D Consistent Scene Understanding
Hongjia Zhai, Gan Huang, Qirui Hu, Guanglin Li, Hujun Bao, Guofeng Zhang
3D-GRES: Generalized 3D Referring Expression Segmentation
Changli Wu, Yihang Liu, Jiayi Ji, Yiwei Ma, Haowei Wang, Gen Luo, Henghui Ding, Xiaoshuai Sun, Rongrong Ji