Arbitrary Object
Arbitrary object processing in computer vision aims to develop algorithms that can understand, manipulate, and reason about objects of any type, whether or not those objects are covered by prior knowledge or training data. Current research focuses on robust models, often built on transformer architectures and diffusion models, for accurate object detection, segmentation, pose estimation, and manipulation in diverse, complex scenes, including scenes with occlusions and interactions between multiple objects. These advances matter for robotics, autonomous systems, and augmented/virtual reality, where they enable more flexible and adaptable interaction with the physical world. The need for efficient, generalizable methods is also driving innovation in self-supervised learning and knowledge distillation.
Papers
Encoding Surgical Videos as Latent Spatiotemporal Graphs for Object and Anatomy-Driven Reasoning
Aditya Murali, Deepak Alapatt, Pietro Mascagni, Armine Vardazaryan, Alain Garcia, Nariaki Okamoto, Didier Mutter, Nicolas Padoy
Learning Polynomial Representations of Physical Objects with Application to Certifying Correct Packing Configurations
Morgan Jones
Fine-grained Controllable Video Generation via Object Appearance and Context
Hsin-Ping Huang, Yu-Chuan Su, Deqing Sun, Lu Jiang, Xuhui Jia, Yukun Zhu, Ming-Hsuan Yang
SAM-Assisted Remote Sensing Imagery Semantic Segmentation with Object and Boundary Constraints
Xianping Ma, Qianqian Wu, Xingyu Zhao, Xiaokang Zhang, Man-On Pun, Bo Huang