Visual Attention
Visual attention research investigates how humans and animals selectively process visual information, with the aim of understanding the mechanisms behind this cognitive function and replicating them computationally. Current research focuses on models that integrate multiple sensory modalities (for example, audio-visual input), attend at the object level rather than the pixel level, and incorporate human gaze data to improve accuracy and interpretability. These models often employ transformer networks, spiking neural networks, and other deep learning architectures. The advances have implications for computer vision, human-computer interaction, and medical image analysis, where they enable more efficient and robust systems for tasks such as object tracking, speech recognition, and medical diagnosis.
Papers
Multi-task UNet: Jointly Boosting Saliency Prediction and Disease Classification on Chest X-ray Images
Hongzhi Zhu, Robert Rohling, Septimiu Salcudean
Gaze-Guided Class Activation Mapping: Leveraging Human Attention for Network Attention in Chest X-rays Classification
Hongzhi Zhu, Septimiu Salcudean, Robert Rohling