Scanpath Prediction

Scanpath prediction computationally models human eye movements by predicting the ordered sequence of fixations (gaze points) a viewer makes during visual exploration. Current research focuses on improving prediction accuracy across diverse visual stimuli (images, videos, 360° environments, medical scans, and GUIs) using deep learning architectures such as transformers and diffusion models, often incorporating multimodal inputs (e.g., text descriptions) and modeling individual differences in attention patterns. The field advances our understanding of visual attention and cognitive processes, and has practical applications in human-computer interaction, virtual and augmented reality, and personalized user experience design.

Papers