Paper ID: 2209.14743
Dataset Complexity Assessment Based on Cumulative Maximum Scaled Area Under Laplacian Spectrum
Guang Li, Ren Togo, Takahiro Ogawa, Miki Haseyama
Dataset complexity assessment aims to predict classification performance on a dataset by computing its complexity before a classifier is trained; it can also be used for classifier selection and dataset reduction. Training deep convolutional neural networks (DCNNs) is iterative and time-consuming because of hyperparameter uncertainty and the domain shift introduced by different datasets. It is therefore useful to predict classification performance by assessing dataset complexity before training DCNN models. This paper proposes a novel method called cumulative maximum scaled Area Under Laplacian Spectrum (cmsAULS), which achieves state-of-the-art complexity assessment performance on six datasets.
Submitted: Sep 29, 2022