Video Perception
Video perception research aims to enable computers to understand and interpret video content as effectively as humans do, focusing on tasks such as object segmentation, action recognition, and question answering. Current efforts concentrate on integrating large language models (LLMs) with visual processing to improve contextual understanding and reasoning, often using transformer-based architectures and quantization techniques for efficiency. These advances matter for applications ranging from automated sports analysis and autonomous driving to video quality assessment and hardening computer vision systems against adversarial attacks.