Paper ID: 2205.04032
Interpretable Machine Learning for Self-Service High-Risk Decision-Making
Charles Recaido, Boris Kovalerchuk
This paper contributes to interpretable machine learning via visual knowledge discovery in general line coordinates (GLC). Hyperblocks, used as interpretable dataset units, are combined with general line coordinates to create a visual self-service machine learning model. Two lossless multidimensional coordinate systems, DSC1 and DSC2, are proposed. Both can map multiple dataset attributes to a single two-dimensional (X, Y) Cartesian plane using a graph construction algorithm. Hyperblock analysis was used to determine visually appealing attribute orders and to reduce line occlusion. It is shown that hyperblocks can generalize decision tree rules and that a series of DSC1 or DSC2 plots can visualize a decision tree. The DSC1 and DSC2 plots were tested on benchmark datasets from the UCI ML repository, where they allowed for visual classification of the data. Additionally, areas of hyperblock impurity were discovered and used to establish dataset splits that highlight the upper estimate of worst-case model accuracy to guide model selection for high-risk decision-making. A major benefit of DSC1 and DSC2 is their high interpretability: they allow domain experts to control existing machine learning models, or establish new ones, through visual pattern discovery.
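The abstract does not define hyperblocks formally, so the following is only a minimal sketch under a stated assumption: a hyperblock is treated as an axis-aligned box given by one closed interval per attribute. Under that assumption, a conjunctive decision tree rule maps to a hyperblock with unbounded sides, and impurity can be read as the share of enclosed cases that do not belong to the dominant class. The names `Hyperblock`, `rule_to_hyperblock`, and `impurity` are hypothetical and are not taken from the paper.

```python
from collections import Counter
from dataclasses import dataclass
from math import inf


@dataclass
class Hyperblock:
    """Assumed form: one closed interval [low, high] per attribute."""
    lows: list
    highs: list

    def contains(self, point):
        # A point is inside when every attribute lies within its interval.
        return all(lo <= v <= hi
                   for lo, v, hi in zip(self.lows, point, self.highs))

    def impurity(self, points, labels):
        # Share of enclosed cases that are not of the dominant class.
        inside = [lab for p, lab in zip(points, labels) if self.contains(p)]
        if not inside:
            return 0.0
        dominant_count = Counter(inside).most_common(1)[0][1]
        return 1.0 - dominant_count / len(inside)


def rule_to_hyperblock(n_attrs, conditions):
    """Convert a conjunctive rule into a hyperblock.

    conditions: list of (attribute_index, op, threshold), op in {"<=", ">"}.
    Simplification: intervals are closed, so a point lying exactly on a
    ">" threshold is treated as inside the block.
    """
    lows = [-inf] * n_attrs
    highs = [inf] * n_attrs
    for index, op, threshold in conditions:
        if op == "<=":
            highs[index] = min(highs[index], threshold)
        elif op == ">":
            lows[index] = max(lows[index], threshold)
        else:
            raise ValueError(f"unsupported operator: {op}")
    return Hyperblock(lows, highs)


# Illustrative data only (not from the paper or the UCI repository).
points = [(1.0, 2.0), (2.0, 3.0), (2.4, 1.5), (3.0, 2.0)]
labels = ["a", "a", "b", "a"]

# Rule: attribute 0 <= 2.5 AND attribute 1 > 1.0
block = rule_to_hyperblock(2, [(0, "<=", 2.5), (1, ">", 1.0)])
block_impurity = block.impurity(points, labels)
```

Here the rule's unconstrained sides stay at infinity, which is one way to see a tree rule as a special case of a hyperblock; a bounded hyperblock would simply tighten those sides.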
Submitted: May 9, 2022