Human Saliency

Human saliency research studies where people look in images and videos and builds models that predict those attention patterns, with the aim of narrowing the gap between human perception and machine vision. Current work centers on deep learning models, particularly transformer and convolutional neural network architectures, that predict saliency maps and saliency rankings. Many of these models take multimodal input (RGB, depth, thermal) and rely on techniques such as contrastive learning and data augmentation. Applications include image quality assessment, object detection, and user interface design, and the results also offer insight into human visual attention mechanisms and cognitive processes.

Papers