Paper ID: 2410.13465

Object Pose Estimation Using Implicit Representation For Transparent Objects

Varun Burde, Artem Moroz, Vit Zeman, Pavel Burget

Object pose estimation is a prominent task in computer vision. The object pose gives the orientation and translation of the object in real-world space, which allows various applications such as manipulation, augmented reality, etc. Various objects exhibit different properties with light, such as reflections, absorption, etc. This makes it challenging to understand the object's structure in RGB and depth channels. Recent research has been moving toward learning-based methods, which provide a more flexible and generalizable approach to object pose estimation utilizing deep learning. One such approach is the render-and-compare method, which renders the object from multiple views and compares it against the given 2D image, which often requires an object representation in the form of a CAD model. We reason that the synthetic texture of the CAD model may not be ideal for rendering and comparing operations. We showed that if the object is represented as an implicit (neural) representation in the form of Neural Radiance Field (NeRF), it exhibits a more realistic rendering of the actual scene and retains the crucial spatial features, which makes the comparison more versatile. We evaluated our NeRF implementation of the render-and-compare method on transparent datasets and found that it surpassed the current state-of-the-art results.

Submitted: Oct 17, 2024