Saliency Map
Saliency maps are visual representations that highlight the regions of an input (e.g., an image, video, or audio clip) that most influence a model's prediction. They aim to improve the interpretability of "black box" models such as deep neural networks. Current research focuses on more accurate and robust methods for generating saliency maps, often using gradient-based techniques, transformer architectures, and diffusion models, and on applying them across data modalities including images, video, audio, and time series. By showing which parts of the input drive a model's decision, these advances support trust in and understanding of AI systems, particularly in high-stakes applications such as medical diagnosis and autonomous driving.
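As a rough illustration of the gradient-based family mentioned above, the sketch below computes a plain gradient saliency map: the absolute value of the derivative of the model output with respect to each input element. It is a minimal example under stated assumptions, not the method of any paper listed here. The model is a hypothetical logistic regression whose gradient can be written by hand, and the names `gradient_saliency`, `x`, `w`, and `b` as well as all numeric values are invented for illustration. A deep network would normally obtain the same gradient through automatic differentiation.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping a real score to a value in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def gradient_saliency(x, w, b):
    """Absolute gradient of a logistic model's output w.r.t. each input.

    The model is p = sigmoid(w . x + b), so dp/dx = p * (1 - p) * w.
    Larger values mean a small change in that input moves the output more.
    """
    p = sigmoid(np.dot(w, x) + b)
    grad = p * (1.0 - p) * w
    return np.abs(grad)


# Hypothetical 2x2 "image", flattened to a vector of 4 pixels (assumed values).
x = np.array([0.2, 0.9, 0.4, 0.1])
# Hypothetical model weights and bias (assumed values).
w = np.array([0.5, -2.0, 0.1, 1.0])
b = 0.0

# Reshape the per-pixel saliency back to the image layout.
saliency = gradient_saliency(x, w, b).reshape(2, 2)

# Row/column of the pixel with the largest influence on the prediction.
most_influential = np.unravel_index(np.argmax(saliency), saliency.shape)
print(saliency)
print(most_influential)
```

For this linear model the map is simply proportional to the absolute weights, which is why the example is easy to check by hand. For nonlinear networks the gradient depends on the input itself, which is one reason raw gradient maps can be noisy and why more robust variants are studied.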
Papers
DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction
Junwen Xiong, Peng Zhang, Tao You, Chuanyue Li, Wei Huang, Yufei Zha
Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised Semantic Segmentation
Lian Xu, Mohammed Bennamoun, Farid Boussaid, Wanli Ouyang, Ferdous Sohel, Dan Xu