3D Understanding
3D understanding aims to enable computers to perceive and interpret three-dimensional scenes and objects in a way that mirrors human spatial reasoning. Current research focuses on robust models that integrate multiple data modalities (point clouds, images, text, and even audio), using techniques such as multi-modal mixing, contrastive learning, and large language models (LLMs) to improve accuracy and efficiency. The field underpins advances in robotics, autonomous driving, augmented reality, and other applications that require sophisticated scene understanding, and recent work highlights the importance of data efficiency and explainability in model development.
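To make the contrastive-learning idea above concrete, here is a minimal sketch of a symmetric InfoNCE-style loss that pulls paired 3D-shape and text embeddings together and pushes mismatched pairs apart. It is an illustration under stated assumptions, not the method of any particular paper: the function name `contrastive_loss`, the temperature of 0.07, and the toy 2-D embeddings are all hypothetical, and a real system would produce the embeddings with learned point-cloud and text encoders.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def _cross_entropy_diagonal(logits):
    """Mean of -log softmax(row i)[i]; the true match for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum = row_max + math.log(sum(math.exp(z - row_max) for z in row))
        total += log_sum - row[i]
    return total / len(logits)


def contrastive_loss(shape_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    shape_emb[i] and text_emb[i] are assumed to describe the same object.
    The temperature value is an illustrative default, not a tuned setting.
    """
    shapes = [_normalize(v) for v in shape_emb]
    texts = [_normalize(v) for v in text_emb]
    # Pairwise cosine similarities, scaled by the temperature.
    logits = [
        [sum(a * b for a, b in zip(s, t)) / temperature for t in texts]
        for s in shapes
    ]
    transposed = [list(col) for col in zip(*logits)]
    # Average the shape-to-text and text-to-shape directions.
    return 0.5 * (_cross_entropy_diagonal(logits) + _cross_entropy_diagonal(transposed))


# Toy batch: each shape embedding lies close to its own caption embedding.
shapes = [[1.0, 0.0], [0.0, 1.0]]
texts = [[0.9, 0.1], [0.1, 0.9]]
aligned = contrastive_loss(shapes, texts)
mismatched = contrastive_loss(shapes, texts[::-1])  # captions swapped
```

With this toy batch, the correctly paired embeddings yield a lower loss than the swapped captions, which is the signal that trains the encoders to align modalities.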