Paper ID: 2210.10824

Supervised Contrastive Learning with Tree-Structured Parzen Estimator Bayesian Optimization for Imbalanced Tabular Data

Shuting Tao, Peng Peng, Qi Li, Hongwei Wang

Class imbalance degrades the predictive performance of most supervised learning algorithms, because an imbalanced distribution biases the model toward the majority class. To address this problem, we propose a Supervised Contrastive Learning (SCL) method combined with the Tree-structured Parzen Estimator (TPE) technique for imbalanced tabular datasets. Contrastive learning (CL) can extract information hidden in data even without labels and has shown some potential for imbalanced learning tasks. SCL extends CL by incorporating label information, which also compensates for the lack of data augmentation techniques for tabular data. Therefore, in this work, we use SCL to learn a discriminative representation of imbalanced tabular data. Additionally, the temperature hyper-parameter of SCL has a decisive influence on performance and is difficult to tune. We introduce TPE, a well-known Bayesian optimization technique, to automatically select the best temperature. Experiments are conducted on both binary and multi-class imbalanced tabular datasets. The results show that TPE outperforms three other hyper-parameter optimization (HPO) methods: grid search, random search, and genetic algorithm. More importantly, the proposed SCL-TPE method achieves much-improved performance compared with state-of-the-art methods.
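
To illustrate why the temperature matters, the following is a minimal sketch of a supervised contrastive loss in plain Python. It assumes the commonly used formulation with L2-normalised embeddings, where samples sharing the anchor's label act as positives; the abstract does not spell out the exact loss, so this is not the authors' implementation. The function name `supcon_loss` and the toy inputs are hypothetical.

```python
import math


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss, averaged over anchors that have a positive.

    Assumption: every embedding is a non-zero vector (it is L2-normalised here).
    """
    # L2-normalise each embedding so dot products are cosine similarities.
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec))
        normed.append([x / norm for x in vec])

    n = len(normed)
    total, counted = 0.0, 0
    for i in range(n):
        # Positives: other samples that share the anchor's label.
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing

        # Temperature-scaled similarity of the anchor to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
            for j in range(n)
            if j != i
        }

        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(sims.values())
        log_denom = peak + math.log(sum(math.exp(s - peak) for s in sims.values()))

        # Mean negative log-probability of the positives.
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        counted += 1

    return total / counted if counted else 0.0


# Toy usage: two tight clusters in 2-D, one per class.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_labels = [0, 0, 1, 1]
for tau in (0.05, 0.1, 0.5, 1.0):
    print(tau, supcon_loss(toy_embeddings, toy_labels, temperature=tau))
```

The same embeddings yield different loss values as the temperature changes, which is the sensitivity that an HPO method such as TPE is meant to handle. In a full pipeline, the quantity to optimise would be a validation metric of the downstream classifier rather than the loss itself.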

Submitted: Oct 19, 2022