Perceptual Similarity Metric

Perceptual similarity metrics aim to quantify how similar two images appear to a human observer, going beyond pixel-by-pixel comparison. Current research focuses on improving the accuracy and robustness of these metrics, exploring deep learning architectures such as Transformers, including Vision Transformers (ViTs), as well as linear methods, to align them more closely with human perception. A key challenge is developing metrics that resist adversarial attacks and tolerate misalignment between images while remaining computationally efficient enough for large-scale applications. These advances matter for image quality assessment, content retrieval, and other computer vision tasks that depend on accurate estimates of visual similarity.
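
As a rough illustration of why pixel-by-pixel comparison falls short under misalignment, the sketch below uses a toy example that is not drawn from any particular paper. It computes the mean squared error (MSE) between a striped image and two candidates: the same stripes shifted by one pixel, and a flat gray image. The helper names (mse, shift_right) and the 8x8 synthetic images are assumptions made purely for illustration.

```python
def mse(a, b):
    """Mean squared error between two equally sized grayscale images (lists of rows)."""
    total = sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def shift_right(img, k=1):
    """Circularly shift every row k pixels to the right."""
    return [row[-k:] + row[:-k] for row in img]


# Hypothetical 8x8 test image: vertical stripes one pixel wide (0, 1, 0, 1, ...).
size = 8
stripes = [[float(x % 2) for x in range(size)] for _ in range(size)]

shifted = shift_right(stripes, 1)                # same pattern, one pixel off
flat_gray = [[0.5] * size for _ in range(size)]  # no pattern at all

mse_shifted = mse(stripes, shifted)
mse_gray = mse(stripes, flat_gray)

print("MSE vs. shifted stripes:", mse_shifted)
print("MSE vs. flat gray:      ", mse_gray)
```

Here the pixel-wise error ranks the flat gray image as closer to the original than the shifted stripes, even though most viewers would likely judge the shifted stripes to be the nearer match. Perceptual metrics are meant to close this kind of gap.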

Papers