Perception-Aware
Perception-aware research aims to build artificial intelligence that understands and interacts with the world in ways similar to humans, with a focus on robust, accurate perception across modalities such as vision and language. Current work emphasizes improving the accuracy and efficiency of perception models, particularly through advances in vision-language models (VLMs) and novel algorithms for dynamic resolution processing, multimodal data fusion, and uncertainty quantification. The field is crucial to robotics, autonomous driving, and human-computer interaction, with applications ranging from improved object recognition and scene understanding to more natural, intuitive interfaces.
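To make two of the recurring ideas above concrete, here is a minimal, self-contained sketch of late multimodal fusion combined with ensemble-based uncertainty quantification. It is not taken from any of the listed papers: all names, feature dimensions, and the toy data are illustrative assumptions, and the randomly initialized "ensemble" stands in for heads that would in practice be trained on different seeds or data splits.

```python
# Minimal sketch (not from any listed paper): fuse vision and language
# feature vectors by concatenation, then use disagreement across a small
# ensemble of linear heads as a simple uncertainty proxy.
import numpy as np

rng = np.random.default_rng(0)

def fuse(vision_feat: np.ndarray, text_feat: np.ndarray) -> np.ndarray:
    """Late fusion: L2-normalize each modality, then concatenate."""
    v = vision_feat / (np.linalg.norm(vision_feat) + 1e-8)
    t = text_feat / (np.linalg.norm(text_feat) + 1e-8)
    return np.concatenate([v, t])

def softmax(z: np.ndarray) -> np.ndarray:
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Toy ensemble of linear classification heads over the fused feature.
dim_v, dim_t, n_classes, n_heads = 512, 256, 10, 5
heads = [rng.normal(scale=0.02, size=(n_classes, dim_v + dim_t))
         for _ in range(n_heads)]

x = fuse(rng.normal(size=dim_v), rng.normal(size=dim_t))
probs = np.stack([softmax(W @ x) for W in heads])  # shape: (n_heads, n_classes)

mean_probs = probs.mean(axis=0)         # ensemble prediction
uncertainty = probs.std(axis=0).mean()  # average head disagreement

print(f"predicted class: {mean_probs.argmax()}, uncertainty: {uncertainty:.4f}")
```

Concatenation after per-modality normalization is the simplest fusion choice; the papers below explore considerably richer schemes, but the same pattern (combine modality features, then quantify how confident the combined model is) underlies much of the work in this area.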
Papers
PP-SSL: Priority-Perception Self-Supervised Learning for Fine-Grained Recognition
ShuaiHeng Li, Qing Cai, Fan Zhang, Menghuan Zhang, Yangyang Shu, Zhi Liu, Huafeng Li, Lingqiao Liu
Perception of Visual Content: Differences Between Humans and Foundation Models
Nardiena A. Pratama, Shaoyang Fan, Gianluca Demartini
One to rule them all: natural language to bind communication, perception and action
Simone Colombani, Dimitri Ognibene, Giuseppe Boccignone
Time is on my sight: scene graph filtering for dynamic environment perception in an LLM-driven robot
Simone Colombani, Luca Brini, Dimitri Ognibene, Giuseppe Boccignone
Evaluating and Advancing Multimodal Large Language Models in Ability Lens
Feng Chen, Chenhui Gou, Jing Liu, Yang Yang, Zhaoyang Li, Jiyuan Zhang, Zhenbang Sun, Bohan Zhuang, Qi Wu