Paper ID: 2401.06546

Optimizing Feature Selection for Binary Classification with Noisy Labels: A Genetic Algorithm Approach

Vandad Imani, Elaheh Moradi, Carlos Sevilla-Salcedo, Vittorio Fortino, Jussi Tohka

Feature selection in noisy label scenarios remains an understudied topic. We propose a novel genetic algorithm-based approach, the Noise-Aware Multi-Objective Feature Selection Genetic Algorithm (NMFS-GA), for selecting optimal feature subsets in binary classification with noisy labels. NMFS-GA offers a unified framework for selecting feature subsets that are both accurate and interpretable. We evaluate NMFS-GA on synthetic datasets with label noise, a Breast Cancer dataset enriched with noisy features, and a real-world ADNI dataset for dementia conversion prediction. Our results indicate that NMFS-GA can effectively select feature subsets that improve the accuracy and interpretability of binary classifiers in scenarios with noisy labels.

Submitted: Jan 12, 2024