Paper ID: 2407.06115

Infer Induced Sentiment of Comment Response to Video: A New Task, Dataset and Baseline

Qi Jia, Baoyu Fan, Cong Xu, Lu Liu, Liang Jin, Guoguang Du, Zhenhua Guo, Yaqian Zhao, Xuanjing Huang, Rengang Li

Existing video multi-modal sentiment analysis mainly focuses on the sentiment expression of people within the video, yet often neglects the induced sentiment of viewers while watching the videos. Induced sentiment of viewers is essential for inferring the public response to videos, has broad application in analyzing public societal sentiment, effectiveness of advertising and other areas. The micro videos and the related comments provide a rich application scenario for viewers induced sentiment analysis. In light of this, we introduces a novel research task, Multi-modal Sentiment Analysis for Comment Response of Video Induced(MSA-CRVI), aims to inferring opinions and emotions according to the comments response to micro video. Meanwhile, we manually annotate a dataset named Comment Sentiment toward to Micro Video (CSMV) to support this research. It is the largest video multi-modal sentiment dataset in terms of scale and video duration to our knowledge, containing 107,267 comments and 8,210 micro videos with a video duration of 68.83 hours. To infer the induced sentiment of comment should leverage the video content, so we propose the Video Content-aware Comment Sentiment Analysis (VC-CSA) method as baseline to address the challenges inherent in this new task. Extensive experiments demonstrate that our method is showing significant improvements over other established baselines.

Submitted: May 15, 2024