Category Level
Category-level object pose estimation determines the 3D position and orientation of objects from a known category, even when the specific object instance was not seen during training. Current research focuses on robust methods that operate on RGB or RGB-D images, often using deep learning architectures such as transformers and diffusion models alongside techniques such as iterative closest point (ICP) and multi-view alignment. Accurate pose estimation is crucial for robotics (e.g., grasping and manipulation), augmented reality, and 3D scene understanding, and this demand drives the development of new datasets and evaluation metrics for benchmarking progress.
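To make "3D position and orientation" concrete, the sketch below shows one common way an estimated pose can be represented and applied: a rotation, a translation, and a scale that map points from a canonical object frame into the camera frame. It is illustrative only and is not taken from any of the listed papers. The function names, the sample values, and the use of a single uniform scale are assumptions; some methods estimate a separate size along each axis instead.

```python
import math

def rotation_z(angle_rad):
    """Return a 3x3 rotation matrix for a rotation about the z-axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def apply_pose(points, rotation, translation, scale=1.0):
    """Map canonical object-frame points into the camera frame.

    Each point p becomes scale * (R @ p) + t. The uniform scale is an
    assumption made for this sketch.
    """
    transformed = []
    for p in points:
        rotated = [sum(rotation[i][j] * p[j] for j in range(3)) for i in range(3)]
        transformed.append([scale * rotated[i] + translation[i] for i in range(3)])
    return transformed

# Hypothetical values: two points on a canonical (normalized) object model.
canonical_points = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
R = rotation_z(math.pi / 2)   # object turned 90 degrees about the z-axis
t = [0.0, 0.0, 2.0]           # object placed 2 units in front of the camera
camera_points = apply_pose(canonical_points, R, t, scale=0.5)
```

With these sample values, the first canonical point lands at approximately (0, 0.5, 2) in the camera frame. A pose estimator solves the inverse problem: given image or depth observations, it recovers the rotation, translation, and size that best explain them.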
Papers
Diffusion-Driven Self-Supervised Learning for Shape Reconstruction and Pose Estimation
Jingtao Sun, Yaonan Wang, Mingtao Feng, Chao Ding, Mike Zheng Shou, Ajmal Saeed Mian
OV9D: Open-Vocabulary Category-Level 9D Object Pose and Size Estimation
Junhao Cai, Yisheng He, Weihao Yuan, Siyu Zhu, Zilong Dong, Liefeng Bo, Qifeng Chen