Paper ID: 2306.07559

Marking anything: application of point cloud in extracting video target features

Xiangchun Xu

Extracting retrievable features from video is important for constructing structured video databases, protecting video copyright, and refuting rumors based on fake videos. Inspired by point cloud data processing, this paper proposes a method for marking anything (MA) in a video: it extracts the contour features of any target in the video and converts them into a retrievable feature vector of length 256. The method uses YOLO-v8, a multi-object tracking algorithm, and PointNet++. The contours of detected video targets are extracted to form spatial point cloud data, and the feature vector extracted from that point cloud serves as the retrievable feature of the target. To verify the effectiveness and robustness of the contour features, experimental data were collected from Dou Yin and the Kinetics-700 dataset. For Dou Yin's homogenized videos, the proposed contour features achieve a retrieval accuracy above 97% in Top-1 return mode. For videos from Kinetics-700, the contour features also show good robustness when tracing videos from partial clips.
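
The data flow described in the abstract (per-frame contours, then a spatial point cloud, then a 256-dimensional vector used for Top-1 retrieval) can be illustrated with a minimal Python sketch. It is not the paper's implementation. The abstract does not say how the point cloud is laid out or how vectors are compared, so using the frame index as the third coordinate and cosine similarity as the ranking score are both assumptions, and the function names are hypothetical. The YOLO-v8 detection, tracking, and PointNet++ feature extraction steps are omitted entirely.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def contours_to_point_cloud(
    contours: Sequence[Sequence[Tuple[float, float]]],
) -> List[Point3D]:
    """Stack the per-frame 2D contours of one tracked target into 3D points.

    Assumption: the frame index is used as the third coordinate (x, y, t).
    """
    cloud: List[Point3D] = []
    for t, contour in enumerate(contours):
        for x, y in contour:
            cloud.append((float(x), float(y), float(t)))
    return cloud


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top1(query: Sequence[float], database: Dict[str, Sequence[float]]) -> str:
    """Return the key of the stored vector most similar to the query.

    Assumption: similarity is measured with cosine similarity.
    """
    if not database:
        raise ValueError("database is empty")
    return max(database, key=lambda k: cosine_similarity(query, database[k]))
```

In this sketch, a query vector extracted from a suspect clip would be passed to `top1` together with a dictionary mapping video identifiers to their stored 256-dimensional vectors; the returned identifier is the Top-1 candidate source.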

Submitted: Jun 13, 2023