Video Object Segmentation
Video object segmentation (VOS) aims to track and segment objects throughout a video sequence, given an initial annotation. Current research focuses on improving accuracy and efficiency, particularly for long videos and complex scenes, using transformer-based architectures, memory-augmented models, visual prompting, and multi-modal fusion. These advances matter for applications ranging from video editing and autonomous driving to more specialized areas such as animal behavior analysis and medical image processing.
Papers
Global Motion Understanding in Large-Scale Video Object Segmentation
Volodymyr Fedynyak, Yaroslav Romanus, Oles Dobosevych, Igor Babin, Roman Riazantsev
DeVOS: Flow-Guided Deformable Transformer for Video Object Segmentation
Volodymyr Fedynyak, Yaroslav Romanus, Bohdan Hlovatskyi, Bohdan Sydor, Oles Dobosevych, Igor Babin, Roman Riazantsev