Spatial Intelligence
Spatial intelligence, as a research area within artificial intelligence, aims to enable machines to understand and reason about spatial relationships in their environment, much as humans do. Current research emphasizes multimodal models that integrate visual, linguistic, and geometric data. Techniques include deep learning frameworks optimized for large-scale 3D data processing and geospatial location embeddings within large language models. This work matters for autonomous systems such as self-driving cars, for augmented reality, and for applications in fields such as remote sensing and urban planning.
Papers
Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces
Jihan Yang, Shusheng Yang, Anjali W. Gupta, Rilyn Han, Li Fei-Fei, Saining Xie
Look Inside for More: Internal Spatial Modality Perception for 3D Anomaly Detection
Hanzhe Liang, Guoyang Xie, Chengbin Hou, Bingshu Wang, Can Gao, Jinbao Wang