Paper ID: 2312.03804 • Published Dec 6, 2023
How Low Can You Go? Surfacing Prototypical In-Distribution Samples for Unsupervised Anomaly Detection
Felix Meissen, Johannes Getzner, Alexander Ziller, Özgün Turgut, Georgios Kaissis, Martin J. Menten, Daniel Rueckert
Unsupervised anomaly detection (UAD) alleviates large labeling efforts by
training exclusively on unlabeled in-distribution data and detecting outliers
as anomalies. Generally, the assumption prevails that large training datasets
allow the training of higher-performing UAD models. However, in this work, we
show that UAD with extremely few training samples can already match, and in
some cases even surpass, the performance of training with the whole training
dataset. Building upon this finding, we propose an unsupervised method to
reliably identify prototypical samples to further boost UAD performance. We
demonstrate the utility of our method on seven different established UAD
benchmarks from computer vision, industrial defect detection, and medicine.
With just 25 selected samples, we even exceed the performance of full training
in 25/67 categories in these benchmarks. Additionally, we show that the
prototypical in-distribution samples identified by our proposed method
generalize well across models and datasets and that observing their sample
selection criteria allows for a successful manual selection of small subsets of
high-performing samples. Our code is available at
this https URL