Object Level

Object-level understanding in computer vision aims to represent and reason about the individual objects in a scene, going beyond object detection to capture their properties, relationships, and interactions. Current research relies heavily on transformer-based architectures, often combined with multi-modal learning over visual and textual data and with techniques such as knowledge distillation and contrastive learning to improve performance and generalization. Object-centric representations of this kind matter for applications such as autonomous driving, robotics, and image understanding, where they enable more robust and context-aware systems.

Papers

December 1, 2023

Open-vocabulary object 6D pose estimation
Jaime Corsetti, Davide Boscaini, Changjae Oh, Andrea Cavallaro, Fabio Poiesi
Vision Language Model Estimation Task Object Level 6D Object Pose

November 21, 2023

Learning Part Motion of Articulated Objects Using Spatially Continuous Neural Implicit Representations
Yushi Du, Ruihai Wu, Yan Shen, Hao Dong
Motion Information Articulated Object Object Level Neural Network Continuous Image Representation

November 18, 2023

SecondPose: SE(3)-Consistent Dual-Stream Feature Fusion for Category-Level Pose Estimation
Yamei Chen, Yan Di, Guangyao Zhai, Fabian Manhardt, Chenyangguang Zhang, Ruida Zhang, Federico Tombari, Nassir Navab, Benjamin Busam
Category Level Object Level One 2 Consistent Representation Learning Shape Variation

October 30, 2023

Knolling bot 2.0: Enhancing Object Organization with Self-supervised Graspability Estimation
Yuhang Hu, Zhizhuo Zhang, Hod Lipson
Grasp Detection Object Level Home Robot Grasp Prediction

October 18, 2023

An Image is Worth Multiple Words: Discovering Object Level Concepts using Multi-Concept Prompt Learning
Chen Jin, Ryutaro Tanno, Amrutha Saseendran, Tom Diethe, Philip Teare
Contrastive Loss Object Level Prompt Learning Method Worth Multiple Word Sentence Image Pair Prompt Learning

July 9, 2023

Reasoning over the Behaviour of Objects in Video-Clips for Adverb-Type Recognition
Amrit Diggavi Seshadri, Alessandra Russo
Arbitrary Object Behavior Explanation Object Level Theatre Scene Description Software Behavior Video Clip Adverb Recognition

May 24, 2023

Contrastive Training of Complex-Valued Autoencoders for Object Discovery
Aleksandar Stanić, Anand Gopalakrishnan, Kazuki Irie, Jürgen Schmidhuber
Object Centric Object Discovery Multi Object Contrastive Training Object Level Complex Valued

May 3, 2023

Learning-based Relational Object Matching Across Views
Cathrin Elich, Iro Armeni, Martin R. Oswald, Marc Pollefeys, Joerg Stueckler
Synthesized View Scene Understanding Object Level Object Matching

April 9, 2023

Curricular Object Manipulation in LiDAR-based Object Detection
Ziyue Zhu, Qiang Meng, Xiao Wang, Ke Wang, Liujiang Yan, Jian Yang
Augmentation Method Object Level Lidar Based 3D Object Detection LiDAR Object Detection Network

April 7, 2023

DATE: Domain Adaptive Product Seeker for E-commerce
Haoyuan Li, Hao Jiang, Tao Jin, Mengyan Li, Yan Chen, Zhijie Lin, Yang Zhao, Zhou Zhao
E Commerce Domain Adaptive Object Level Product Retrieval Design Automation

March 27, 2023

Object Discovery from Motion-Guided Tokens
Zhipeng Bao, Pavel Tokmakov, Yu-Xiong Wang, Adrien Gaidon, Martial Hebert
Vector Quantization Object Discovery Object Level Auto Encoders

March 10, 2023

Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection
Luting Wang, Yi Liu, Penghui Du, Zihan Ding, Yue Liao, Qiaosong Qi, Biaolong Chen, Si Liu
Open Vocabulary Object Detection Object Level 3D Object Detection Distillation

February 3, 2023

DEVICE: DEpth and VIsual ConcEpts Aware Transformer for TextCaps
Dongsheng Xu, Qingbao Huang, Feng Shuang, Yi Cai
Image Captioning Large Depth Scene Text Object Level Depth Feature Depth Aware Transformer Text Block

January 31, 2023

Priors are Powerful: Improving a Transformer for Multi-camera 3D Detection with 2D Priors
Di Feng, Francesco Ferroni
Transformer Based Object Level SAM Prior Multi Camera 3D Object Detection Image Feature Map

December 26, 2022

Prototype-guided Cross-task Knowledge Distillation for Large-scale Models
Deng Li, Aming Wu, Yahong Han, Qi Tian
Knowledge Distillation Object Level Large Scale Model Cross Task Knowledge Distillation

December 6, 2022

Beyond Object Recognition: A New Benchmark towards Object Concept Learning
Yong-Lu Li, Yue Xu, Xinyu Xu, Xiaohan Mao, Yuan Yao, Siqi Liu, Cewu Lu
New Benchmark Affordance Learning Object Recognition Object Representation Object Level

December 1, 2022

GRiT: A Generative Region-to-text Transformer for Object Understanding
Jialian Wu, Jianfeng Wang, Zhengyuan Yang, Zhe Gan, Zicheng Liu, Junsong Yuan, Lijuan Wang
Arbitrary Object Visual Encoder Object Level Dense Caption Dense Captioning

November 21, 2022

Neural Meta-Symbolic Reasoning and Learning
Zihan Ye, Hikaru Shindo, Devendra Singh Dhami, Kristian Kersting
Learning Abstract Neuro Symbolic Object Level Differentiable Programming Meta Reasoning Neural Symbolic Differentiable Reasoning

November 9, 2022

Understanding Cross-modal Interactions in V&L Models that Generate Scene Descriptions
Michele Cafagna, Kees van Deemter, Albert Gatt
Object Centric Representation Scene Text Cross Modal Interaction Object Level

July 12, 2022

Long-Horizon Planning and Execution with Functional Object-Oriented Networks
David Paulius, Alejandro Agostini, Dongheui Lee
Object Level Long Horizon Planning Functional Object Oriented Network