Video Panoptic Segmentation

Video panoptic segmentation (VPS) assigns every pixel in a video a semantic category (e.g., "road," "car") and, for countable objects, an instance identity ("car 1," "car 2") that stays consistent across frames, so the task combines panoptic segmentation with object tracking. Recent research relies heavily on transformer-based architectures, often incorporating decoupled instance segmentation frameworks and query-based approaches to improve both segmentation accuracy and temporal consistency, as measured by metrics such as Video Panoptic Quality (VPQ) and Segmentation and Tracking Quality (STQ). Applications include autonomous driving, video editing, and robotics, which call for robust and efficient models that handle diverse real-world scenarios. The field is also exploring unified approaches that handle both online and near-online segmentation, as well as methods that leverage depth information to improve accuracy.
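To make the labeling and the tube-based scoring idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the official VPQ implementation: the names `tubes` and `tube_pq` and the toy clip are hypothetical, the score pools all classes instead of averaging per class, and it evaluates a single clip rather than several temporal window sizes as the published metric does.

```python
from collections import defaultdict


def tubes(video):
    """Group pixels into tubes keyed by (class_name, instance_id).

    `video` is a list of frames; each frame is a 2D list of
    (class_name, instance_id) labels. A tube is the set of (t, y, x)
    positions that share one label across all frames of the clip.
    """
    out = defaultdict(set)
    for t, frame in enumerate(video):
        for y, row in enumerate(frame):
            for x, label in enumerate(row):
                out[label].add((t, y, x))
    return dict(out)


def tube_pq(pred_video, gt_video):
    """Simplified tube-level panoptic quality for one clip (assumption:
    classes are pooled, unlike the per-class averaging of real VPQ)."""
    pred, gt = tubes(pred_video), tubes(gt_video)
    iou_sum, tp = 0.0, 0
    matched_pred = set()
    for gt_label, gt_px in gt.items():
        for pred_label, pred_px in pred.items():
            # Only tubes of the same semantic class can match.
            if pred_label[0] != gt_label[0] or pred_label in matched_pred:
                continue
            iou = len(gt_px & pred_px) / len(gt_px | pred_px)
            if iou > 0.5:  # IoU above 0.5 guarantees a unique match
                iou_sum += iou
                tp += 1
                matched_pred.add(pred_label)
                break
    fp = len(pred) - tp  # predicted tubes with no ground-truth match
    fn = len(gt) - tp    # ground-truth tubes with no prediction match
    denom = tp + 0.5 * fp + 0.5 * fn
    return iou_sum / denom if denom else 0.0


# Toy clip: two frames, each one row of four pixels.
ROAD, CAR1, CAR2 = ("road", 0), ("car", 1), ("car", 2)
gt_clip = [
    [[ROAD, ROAD, CAR1, CAR2]],
    [[ROAD, CAR1, CAR1, CAR2]],
]
# Same masks, but the two car identities are swapped in the second frame.
id_switch_clip = [
    [[ROAD, ROAD, CAR1, CAR2]],
    [[ROAD, CAR2, CAR2, CAR1]],
]

perfect_score = tube_pq(gt_clip, gt_clip)        # 1.0
switch_score = tube_pq(id_switch_clip, gt_clip)  # 1/3
```

In the second clip every per-frame mask is still correct, yet the score falls to one third because neither car tube overlaps its ground-truth tube by more than half. This is why tube-level metrics penalize identity switches that a frame-by-frame panoptic score would miss.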

Papers

December 30, 2024

LiDAR-Camera Fusion for Video Panoptic Segmentation without Video Training
Fardin Ayar, Ehsan Javanmardi, Manabu Tsukada, Mahdi Javanmardi, Mohammad Rahmati
Semantic Segmentation Training Data Autonomous Vehicle Source Video Feature Fusion Panoptic Segmentation LiDAR Camera Fusion Video Panoptic Segmentation

December 10, 2024

Balancing Shared and Task-Specific Representations: A Hybrid Approach to Depth-Aware Video Panoptic Segmentation
Kurt H.W. Stolle (Eindhoven University of Technology)
Depth Estimation Panoptic Segmentation Balancing Strategy Hybrid Approach Mask Transformer Video Panoptic Segmentation Task Induced Representation

June 8, 2024

1st Place Winner of the 2024 Pixel-level Video Understanding in the Wild (CVPR'24 PVUW) Challenge in Video Panoptic Segmentation and Best Long Video Consistency of Video Semantic Segmentation
Qingfeng Liu, Mostafa El-Khamy, Kee-Bong Song
Video Instance Segmentation Video Semantic Segmentation Consistent Video Video Panoptic Segmentation

June 6, 2024

3rd Place Solution for PVUW Challenge 2024: Video Panoptic Segmentation
Ruipu Wu, Jifei Che, Han Li, Chengjing Wu, Ting Liu, Luoqi Liu
Panoptic Segmentation Place Solution Video Sequence Test Set Video Panoptic Segmentation

June 1, 2024

2nd Place Solution for PVUW Challenge 2024: Video Panoptic Segmentation
Biao Wu, Diankai Zhang, Si Gao, Chengjian Zheng, Shaoli Liu, Ning Wang
Video Understanding Panoptic Segmentation Semantic Segmentation Model Place Solution Video Panoptic Segmentation

June 11, 2023

3rd Place Solution for PVUW Challenge 2023: Video Panoptic Segmentation
Jinming Su, Wangwang Yang, Junfeng Luo, Xiaolin Wei
Panoptic Segmentation Place Solution Video Instance Segmentation Video Panoptic Segmentation

June 7, 2023

1st Place Solution for PVUW Challenge 2023: Video Panoptic Segmentation
Tao Zhang, Xingye Tian, Haoran Wei, Yu Wu, Shunping Ji, Xuebo Wang, Xin Tao, Yuan Zhang, Pengfei Wan
Autonomous Driving Source Video Panoptic Segmentation Video Editing Place Solution Video Panoptic Segmentation

May 23, 2023

WinDB: HMD-free and Distortion-free Panoptic Video Fixation Learning
Guotao Wang, Chenglizhao Chen, Aimin Hao, Hong Qin, Deng-Ping Fan
Panoramic Video Fixation Prediction Head Mounted Display Video Panoptic Segmentation Panoptic Scene

April 10, 2023

Video-kMaX: A Simple Unified Approach for Online and Near-Online Video Panoptic Segmentation
Inkyu Shin, Dahun Kim, Qihang Yu, Jun Xie, Hong-Seok Kim, Bradley Green, In So Kweon, Kuk-Jin Yoon, Liang-Chieh Chen
Unified Framework Panoptic Segmentation Online Service Mask Transformer CLIP Level Video Panoptic Segmentation Video Clip

March 3, 2023

Unified Perception: Efficient Depth-Aware Video Panoptic Segmentation with Minimal Annotation Costs
Kurt Stolle, Gijs Dubbelman
Panoptic Segmentation Scene Understanding Annotation Budget Object Embeddings Video Panoptic Segmentation

January 6, 2023

TarViS: A Unified Approach for Target-based Video Segmentation
Ali Athar, Alexander Hermans, Jonathon Luiten, Deva Ramanan, Bastian Leibe
Unified Framework Video Object Segmentation Video Instance Segmentation Video Segmentation Video Panoptic Segmentation Exemplar Guided

October 14, 2022

MonoDVPS: A Self-Supervised Monocular Depth Estimation Approach to Depth-aware Video Panoptic Segmentation
Andra Petrovai, Sergiu Nedevschi
Monocular Depth Estimation Panoptic Segmentation Self Supervised Monocular Depth Estimation Video Panoptic Segmentation

October 7, 2022

Time-Space Transformers for Video Panoptic Segmentation
Andra Petrovai, Sergiu Nedevschi
Instance Segmentation Panoptic Segmentation Temporal Transformer Video Panoptic Segmentation

June 15, 2022

Waymo Open Dataset: Panoramic Video Panoptic Segmentation
Jieru Mei, Alex Zihao Zhu, Xinchen Yan, Hang Yan, Siyuan Qiao, Yukun Zhu, Liang-Chieh Chen, Henrik Kretzschmar, Dragomir Anguelov
Panoptic Segmentation Open Dataset Video Panoptic Segmentation

March 2, 2022

Hybrid Tracker with Pixel and Instance for Video Panoptic Segmentation
Weicai Ye, Xinyue Lan, Ge Su, Hujun Bao, Zhaopeng Cui, Guofeng Zhang
Optical Flow Panoptic Segmentation Tetromino Pixel Inter Frame Human Instance Video Panoptic Segmentation C BIoU Tracker

December 16, 2021

Slot-VPS: Object-centric Representation Learning for Video Panoptic Segmentation
Yi Zhou, Hui Zhang, Hana Lee, Shuyang Sun, Pingjun Li, Yangguang Zhu, ByungIn Yoo, Xiaojuan Qi, Jae-Joon Han
Object Centric Representation Object Centric Learning Video Panoptic Segmentation Panoptic Symbol Spotting Panoptic Perception

December 5, 2021

PolyphonicFormer: Unified Query Learning for Depth-aware Video Panoptic Segmentation
Haobo Yuan, Xiangtai Li, Yibo Yang, Guangliang Cheng, Jing Zhang, Yunhai Tong, Lefei Zhang, Dacheng Tao
Panoptic Segmentation Query Learning Video Panoptic Segmentation