Visual Residual
Visual residuals, the differences between predicted and observed visual data, are central to improving the accuracy and robustness of a wide range of computer vision and machine learning tasks. Current research focuses on mitigating the problems these residuals expose, such as hallucinations in vision-language models (addressed through techniques like residual visual decoding) and gradient vanishing in image-text matching (mitigated by selective hard negative mining). These advances improve applications ranging from visual odometry and 3D reconstruction to image enhancement and motion retargeting, ultimately yielding more reliable and accurate systems.
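To make the core idea concrete, the sketch below (not drawn from any of the listed papers) computes a visual residual as the per-pixel difference between a predicted view and the observed image, as used in direct visual odometry and photometric reconstruction. The function name, the Huber-style robust weighting, and the `huber_delta` value are illustrative assumptions, not a specific method from the literature.

```python
# Minimal sketch of a photometric visual residual with robust weighting.
# Assumptions: images are float arrays in [0, 1] of identical shape; the
# Huber weighting and its delta are chosen purely for illustration.
import numpy as np


def visual_residual(predicted: np.ndarray, observed: np.ndarray,
                    huber_delta: float = 0.1) -> tuple[np.ndarray, np.ndarray]:
    """Return the per-pixel residual and robust (Huber) weights."""
    residual = predicted.astype(np.float64) - observed.astype(np.float64)
    abs_r = np.abs(residual)
    # Huber weights: 1 inside the quadratic region, delta / |r| outside it,
    # so outlier pixels (occlusions, specularities) contribute less.
    weights = np.where(abs_r <= huber_delta,
                       1.0,
                       huber_delta / np.maximum(abs_r, 1e-12))
    return residual, weights


if __name__ == "__main__":
    # Toy usage: the weighted sum of squared residuals is the photometric
    # cost a pose or depth estimator would minimize.
    rng = np.random.default_rng(0)
    observed = rng.random((64, 64))
    predicted = np.clip(observed + 0.05 * rng.standard_normal((64, 64)), 0.0, 1.0)
    r, w = visual_residual(predicted, observed)
    print(f"weighted photometric cost: {float(np.sum(w * r**2)):.4f}")
```

In an actual pipeline the predicted image would come from warping a reference frame with the current pose and depth estimates, and the weighted cost would be minimized over those parameters.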