Paper ID: 2112.07969

Predicting Media Memorability: Comparing Visual, Textual and Auditory Features

Lorin Sweeney, Graham Healy, Alan F. Smeaton

This paper describes our approach to the Predicting Media Memorability task in MediaEval 2021, which addresses the question of media memorability by setting the task of automatically predicting video memorability. This year we tackle the task from a comparative standpoint, looking to gain deeper insights into each of the three explored modalities, and using our results from last year's submission (2020) as a point of reference. Our best-performing short-term memorability model (0.132) tested on the TRECVid2019 dataset was, just like last year, a frame-based CNN that was not trained on any TRECVid data, and our best short-term memorability model (0.524) tested on the Memento10k dataset was a Bayesian Ridge Regressor fit with DenseNet121 visual features.
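As a rough illustration of what fitting a Bayesian ridge regressor on per-video visual features can look like, the sketch below uses scikit-learn's BayesianRidge. It is not the authors' pipeline: the feature matrix is random placeholder data standing in for DenseNet121 features, and the function name, dimensions, and train/test split are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge


def fit_memorability_regressor(features, scores):
    """Fit a Bayesian ridge regressor mapping per-video features to scores.

    Hypothetical helper for illustration; not taken from the paper.
    """
    model = BayesianRidge()
    model.fit(features, scores)
    return model


# Placeholder data: random vectors stand in for real DenseNet121 features,
# and the targets are synthetic memorability-like scores (assumption).
rng = np.random.default_rng(0)
n_videos, n_dims = 200, 32
features = rng.normal(size=(n_videos, n_dims))
true_weights = rng.normal(size=n_dims)
scores = 0.8 + 0.05 * (features @ true_weights) + rng.normal(scale=0.01, size=n_videos)

# Train on the first 150 videos, predict on the remaining 50.
model = fit_memorability_regressor(features[:150], scores[:150])
pred_mean, pred_std = model.predict(features[150:], return_std=True)
```

Besides a point prediction, a Bayesian ridge model also returns a per-video predictive standard deviation (pred_std above), which gives a rough sense of how confident each prediction is.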

Submitted: Dec 15, 2021

Topics

Text Modality
Acoustic Feature
Memorability Prediction
Video Memorability
Medium Memorability
