Movie Understanding

Movie understanding research aims to enable computers to comprehend the narratives and visual information in films, going beyond simple scene recognition. Current work centers on large vision-language models (LVLMs) and multimodal architectures, often trained with techniques such as contrastive learning and hierarchical representation learning, to improve tasks such as identifying characters across scenes, narrative reasoning, and emotion recognition. These advances matter for applications such as generating accurate movie summaries for visually impaired audiences, analyzing film adaptations, and building AI systems that can follow complex visual narratives. New benchmarks and datasets, such as Movie101, are also driving progress in the field.

Papers

July 10, 2024

IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model
Yatai Ji, Shilong Zhang, Jie Wu, Peize Sun, Weifeng Chen, Xuefeng Xiao, Sidi Yang, Yujiu Yang, Ping Luo
Large Vision Language Model, Visual Complexity, Movie Understanding

June 16, 2024

Investigating Video Reasoning Capability of Large Language Models with Tropes in Movies
Hung-Ting Su, Chun-Tong Chao, Ya-Ching Hsu, Xudong Lin, Yulei Niu, Hung-Yi Lee, Winston H. Hsu
Movie Review, Compositional Reasoning, Video Reasoning, Movie Understanding

November 7, 2023

Analyzing Film Adaptation through Narrative Alignment
Tanzir Pial, Shahreen Salim, Charuta Pethe, Allen Kim, Steven Skiena
Adaptation Concern, Text Similarity, Narrative Alignment, Movie Understanding

August 18, 2023

Long-range Multimodal Pretraining for Movie Understanding
Dawit Mureja Argaw, Joon-Young Lee, Markus Woodson, In So Kweon, Fabian Caba Heilbron
Cross Modal, Video Understanding, Multimodal Model, Multimodal Pre, Movie Understanding

June 4, 2023

MoviePuzzle: Visual Narrative Reasoning through Multimodal Order Learning
Jianghui Wang, Yuxuan Wang, Dongyan Zhao, Zilong Zheng
Video Understanding, Multimodal Machine Learning, Movie Understanding

May 20, 2023

Movie101: A New Movie Understanding Benchmark
Zihao Yue, Qi Zhang, Anwen Hu, Liang Zhang, Ziheng Wang, Qin Jin
Temporal Grounding, Video Benchmark, Movie Dataset, Video Understanding Benchmark, Movie Understanding

April 12, 2023

How you feelin'? Learning Emotions and Mental States in Movie Scenes
Dhruv Srivastava, Aditya Kumar Singh, Makarand Tapaswi
Emotion Recognition, Experienced Emotion, Multimodal Transformer, Mental State, Emotion Understanding, Movie Understanding

April 6, 2022

Hierarchical Self-supervised Representation Learning for Movie Understanding
Fanyi Xiao, Kaustav Kundu, Joseph Tighe, Davide Modolo
Video Understanding, Self Supervised Pre Training, Self Supervised Video Representation, Self Supervised Video, Movie Understanding