3D Referring Expression

3D Referring Expression (3D-RE) research focuses on enabling computers to understand and locate objects in 3D scenes based on natural language descriptions, bridging the gap between language and 3D perception. Current efforts concentrate on developing robust models that handle multiple objects simultaneously and integrate information from different modalities, such as vision and radar, often employing architectures like transformer networks and incorporating superpoint representations for efficient processing. This field is crucial for advancing robotics, autonomous driving, and augmented reality applications by enabling more natural and intuitive human-computer interaction within complex 3D environments.

Papers