Scene Parsing
Scene parsing, the task of assigning semantic labels to each pixel in an image or video, aims to create a comprehensive understanding of visual scenes. Current research focuses on improving accuracy and robustness across diverse conditions, including nighttime scenes and videos, often employing techniques like unsupervised domain adaptation, transformer-based architectures (e.g., Vision Transformers), and multi-task learning to leverage contextual information from multiple sources (e.g., maps, audio). These advancements are crucial for applications such as autonomous driving, video understanding, and 3D scene reconstruction, enabling more sophisticated and reliable computer vision systems.
Papers
June 15, 2024
June 14, 2024
June 2, 2024
May 11, 2024
September 4, 2023
June 6, 2023
March 6, 2023
August 30, 2022
July 21, 2022
June 13, 2022
December 2, 2021