Paper ID: 2212.07900

EVAL: Explainable Video Anomaly Localization

Ashish Singh, Michael J. Jones, Erik Learned-Miller

We develop a novel framework for single-scene video anomaly localization that allows for human-understandable reasons for the decisions the system makes. We first learn general representations of objects and their motions (using deep networks) and then use these representations to build a high-level, location-dependent model of any particular scene. This model can be used to detect anomalies in new videos of the same scene. Importantly, our approach is explainable: our high-level appearance and motion features can provide human-understandable reasons for why any part of a video is classified as normal or anomalous. We conduct experiments on standard video anomaly detection datasets (Street Scene, CUHK Avenue, ShanghaiTech, and UCSD Ped1 and Ped2) and show significant improvements over the previous state of the art.
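The idea of a location-dependent scene model with human-readable explanations can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the abstract: the feature names in `FEATURE_NAMES`, the grid-cell locations, the nearest-neighbor scoring rule, and the helper names `build_scene_model` and `score`. In the described framework the features would come from deep networks; here they are hand-written numbers.

```python
import math
from collections import defaultdict

# Hypothetical high-level features; the abstract does not list the real ones.
FEATURE_NAMES = ("person_likeness", "vehicle_likeness", "speed", "direction")


def build_scene_model(normal_features):
    """Group feature vectors from anomaly-free video by grid location."""
    model = defaultdict(list)
    for location, vector in normal_features:
        model[location].append(tuple(vector))
    return dict(model)


def score(model, location, vector):
    """Return (anomaly_score, reason) for one feature vector at one location.

    Assumed scoring rule: Euclidean distance to the closest normal
    example stored for the same location.
    """
    exemplars = model.get(location)
    if not exemplars:
        return math.inf, "nothing was observed at this location in normal video"
    nearest = min(exemplars, key=lambda e: math.dist(e, vector))
    distance = math.dist(nearest, vector)
    # The feature with the largest gap serves as the stated reason.
    gaps = [abs(a - b) for a, b in zip(vector, nearest)]
    worst = max(range(len(gaps)), key=gaps.__getitem__)
    reason = (f"{FEATURE_NAMES[worst]} differs most from the closest "
              f"normal example (by {gaps[worst]:.2f})")
    return distance, reason


# Toy data: cell (0, 0) only ever contained slow pedestrians,
# cell (0, 1) only ever contained a faster vehicle.
normal = [
    ((0, 0), (1.0, 0.0, 0.2, 0.5)),
    ((0, 0), (1.0, 0.0, 0.3, 0.5)),
    ((0, 1), (0.0, 1.0, 0.8, 0.0)),
]
model = build_scene_model(normal)

vehicle = (0.0, 1.0, 0.8, 0.0)
road_score, road_reason = score(model, (0, 1), vehicle)
walk_score, walk_reason = score(model, (0, 0), vehicle)
print(road_score, road_reason)
print(walk_score, walk_reason)
```

The same vehicle-like feature vector receives a low score in the cell where vehicles were seen during normal video and a high score in the pedestrian-only cell, which is the sense in which such a model is location-dependent. Because each feature dimension has a readable name, the largest per-feature gap can be reported as the reason for the decision.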

Submitted: Dec 15, 2022

Topics

Deep Network
Video Anomaly Detection
Video Anomaly
Motion Feature
General Purpose Representation
Error Vector Assisted Learning
