Video Scene Graph

Video scene graphs (VSGs) represent a video as a structured graph in which objects appear as nodes and their actions and relationships form temporally grounded edges. Current research focuses on improving VSG generation across video types, including egocentric and instructional videos, often employing self-supervised learning and multi-modal approaches that leverage audio narration or other auxiliary signals. These advances aim to overcome limitations of existing methods, such as reliance on noisy proposals or incomplete annotations, yielding more accurate and interpretable video understanding. Better VSG representations, in turn, benefit applications such as video synthesis, activity summarization, and action anticipation.
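
As a rough illustration only (not tied to any particular paper above), a VSG can be thought of as a set of object nodes plus subject-predicate-object relations grounded in frame intervals. The minimal Python sketch below uses hypothetical `ObjectNode`, `RelationEdge`, and `VideoSceneGraph` classes to show this structure and a simple temporal query.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ObjectNode:
    """An object instance tracked across frames (e.g. "person", "cup")."""
    node_id: int
    category: str


@dataclass(frozen=True)
class RelationEdge:
    """A subject-predicate-object triplet grounded in a frame interval."""
    subject_id: int
    predicate: str          # e.g. "picks_up", "drinks_from"
    object_id: int
    start_frame: int
    end_frame: int


@dataclass
class VideoSceneGraph:
    """Illustrative container: objects as nodes, temporal relations as edges."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_object(self, node_id: int, category: str) -> None:
        self.nodes[node_id] = ObjectNode(node_id, category)

    def add_relation(self, subj: int, pred: str, obj: int,
                     start: int, end: int) -> None:
        self.edges.append(RelationEdge(subj, pred, obj, start, end))

    def relations_at(self, frame: int):
        """Return (subject, predicate, object) triplets active at a frame."""
        return [
            (self.nodes[e.subject_id].category, e.predicate,
             self.nodes[e.object_id].category)
            for e in self.edges
            if e.start_frame <= frame <= e.end_frame
        ]


# Usage: a person picks up a cup, then drinks from it.
vsg = VideoSceneGraph()
vsg.add_object(0, "person")
vsg.add_object(1, "cup")
vsg.add_relation(0, "picks_up", 1, start=10, end=25)
vsg.add_relation(0, "drinks_from", 1, start=26, end=80)

print(vsg.relations_at(15))   # [('person', 'picks_up', 'cup')]
print(vsg.relations_at(40))   # [('person', 'drinks_from', 'cup')]
```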

Papers