Paper ID: 2209.07575

A Nested Genetic Algorithm for Explaining Classification Data Sets with Decision Rules

Paul-Amaury Matt, Rosina Ziegler, Danilo Brajovic, Marco Roth, Marco F. Huber

Our goal in this paper is to automatically extract a set of decision rules (rule set) that best explains a classification data set. First, a large set of decision rules is extracted from a set of decision trees trained on the data set. The rule set should be concise, accurate, have a maximum coverage and minimum number of inconsistencies. This problem can be formalized as a modified version of the weighted budgeted maximum coverage problem, known to be NP-hard. To solve the combinatorial optimization problem efficiently, we introduce a nested genetic algorithm which we then use to derive explanations for ten public data sets.

Submitted: Aug 23, 2022