Visual Perception
Visual perception research studies how humans and artificial systems interpret visual information, aiming to bridge the gap between raw sensory input and high-level cognitive understanding. Current work emphasizes evaluating large vision-language models (LVLMs) across multiple levels of perception, from low-level feature extraction to complex semantic reasoning, using benchmarks that measure both accuracy and the prevalence of hallucinations and biases. These efforts are crucial for improving the reliability and robustness of AI systems in applications ranging from autonomous driving to assistive technologies for visually impaired people, and for advancing our understanding of human visual cognition.
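
To make the benchmark-style evaluation described above concrete, the sketch below scores a stand-in LVLM on a tiny perception benchmark, reporting accuracy alongside a crude hallucination rate (the response names an object that is absent from the scene). All identifiers here (BenchmarkItem, run_model, evaluate, OBJECT_VOCAB) are illustrative assumptions and do not come from any of the papers listed below.

# Minimal sketch (not from any listed paper) of a benchmark-style LVLM
# evaluation reporting accuracy and a simple hallucination rate.
from dataclasses import dataclass, field

# Hypothetical closed vocabulary of objects the benchmark cares about.
OBJECT_VOCAB = {"car", "bicycle", "pedestrian", "dog", "traffic light"}

@dataclass
class BenchmarkItem:
    image_path: str                 # path to the test image
    question: str                   # perception question posed to the model
    answer: str                     # ground-truth short answer
    scene_objects: set = field(default_factory=set)  # objects actually present
    level: str = "low"              # "low" (features) vs. "high" (semantic reasoning)

def run_model(image_path: str, question: str) -> str:
    """Stub standing in for a real LVLM call; replace with actual inference."""
    return "a car parked near a traffic light"

def evaluate(items):
    correct = hallucinated = 0
    for item in items:
        prediction = run_model(item.image_path, item.question).lower()
        # Accuracy: the ground-truth answer appears in the model's response.
        if item.answer.lower() in prediction:
            correct += 1
        # Hallucination: the response names a vocabulary object absent from the scene.
        mentioned = {obj for obj in OBJECT_VOCAB if obj in prediction}
        if mentioned - item.scene_objects:
            hallucinated += 1
    n = len(items)
    return {"accuracy": correct / n, "hallucination_rate": hallucinated / n}

# Example: one low-level item; the stub model answers correctly and mentions
# only objects that are really in the scene.
items = [BenchmarkItem("street.jpg", "What vehicle is shown?", "car",
                       scene_objects={"car", "traffic light"})]
print(evaluate(items))

In practice the stub run_model would be replaced by inference with an actual LVLM, and the hallucination check would typically use a richer object-grounding test than substring matching; the structure of the harness stays the same.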
Papers
ControlLoc: Physical-World Hijacking Attack on Visual Perception in Autonomous Driving
Chen Ma, Ningfei Wang, Zhengyu Zhao, Qian Wang, Qi Alfred Chen, Chao Shen
SlowPerception: Physical-World Latency Attack against Visual Perception in Autonomous Driving
Chen Ma, Ningfei Wang, Zhengyu Zhao, Qi Alfred Chen, Chao Shen
Visual Description Grounding Reduces Hallucinations and Boosts Reasoning in LVLMs
Sreyan Ghosh, Chandra Kiran Reddy Evuru, Sonal Kumar, Utkarsh Tyagi, Oriol Nieto, Zeyu Jin, Dinesh Manocha
Brain3D: Generating 3D Objects from fMRI
Yuankun Yang, Li Zhang, Ziyang Xie, Zhiyuan Yuan, Jianfeng Feng, Xiatian Zhu, Yu-Gang Jiang
DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception
Run Luo, Yunshui Li, Longze Chen, Wanwei He, Ting-En Lin, Ziqiang Liu, Lei Zhang, Zikai Song, Xiaobo Xia, Tongliang Liu, Min Yang, Binyuan Hui