Paper ID: 2410.14462

LUDVIG: Learning-free Uplifting of 2D Visual features to Gaussian Splatting scenes

Juliette Marrie, Romain Ménégaux, Michael Arbel, Diane Larlus, Julien Mairal

We address the task of uplifting visual features or semantic masks from 2D vision models to 3D scenes represented by Gaussian Splatting. Whereas common approaches rely on iterative optimization-based procedures, we show that a simple yet effective aggregation technique yields excellent results. Applied to semantic masks from Segment Anything (SAM), our uplifting approach leads to segmentation quality comparable to the state of the art. We then extend this method to generic DINOv2 features, integrating 3D scene geometry through graph diffusion, and achieve competitive segmentation results despite DINOv2 not being trained on millions of annotated masks like SAM.
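The abstract does not spell out the aggregation rule or the graph diffusion step, so the following is only a minimal sketch of what such a pipeline might look like, not the authors' implementation. It assumes that each (pixel, Gaussian) pair comes with a non-negative rendering weight, that a Gaussian's 3D feature is the weight-normalized average of the 2D features it contributes to, and that diffusion is a simple neighbor-averaging step over a hypothetical adjacency list. The names `uplift_features`, `diffuse_features`, `contributions`, `neighbors`, and `alpha` are all illustrative.

```python
def uplift_features(contributions, num_gaussians, feat_dim):
    """Aggregate 2D features into per-Gaussian 3D features.

    contributions: iterable of (gaussian_id, weight, feature) triples, one per
    (pixel, Gaussian) pair. The weight is ASSUMED to be the Gaussian's
    rendering contribution to that pixel; the source does not define it.
    """
    sums = [[0.0] * feat_dim for _ in range(num_gaussians)]
    totals = [0.0] * num_gaussians
    for gid, weight, feature in contributions:
        totals[gid] += weight
        for d in range(feat_dim):
            sums[gid][d] += weight * feature[d]
    # Weight-normalized average; Gaussians never observed keep a zero feature.
    return [
        [s / totals[g] for s in sums[g]] if totals[g] > 0 else [0.0] * feat_dim
        for g in range(num_gaussians)
    ]


def diffuse_features(features, neighbors, steps=1, alpha=0.5):
    """Smooth features over a neighbor graph (an ASSUMED, simplified diffusion).

    neighbors[i] lists the indices of Gaussians adjacent to Gaussian i.
    Each step blends a feature with the mean of its neighbors' features.
    """
    for _ in range(steps):
        updated = []
        for i, feat in enumerate(features):
            nbrs = neighbors[i]
            if not nbrs:
                updated.append(list(feat))  # isolated node: unchanged
                continue
            mean = [
                sum(features[j][d] for j in nbrs) / len(nbrs)
                for d in range(len(feat))
            ]
            updated.append(
                [(1 - alpha) * feat[d] + alpha * mean[d] for d in range(len(feat))]
            )
        features = updated
    return features
```

Because the aggregation is a single pass over the contributions, it needs no gradient-based optimization, which is the contrast the abstract draws with iterative procedures.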

Submitted: Oct 18, 2024

Topics

Vision Model
RSD Difference of Gaussian
Segmentation Result
Semantic Mask
Segmentation Quality
Mask Annotation
Theatre Scene Description
