Paper ID: 2111.04033

Positivity Validation Detection and Explainability via Zero Fraction Multi-Hypothesis Testing and Asymmetrically Pruned Decision Trees

Guy Wolf, Gil Shabat, Hanan Shteingart

Positivity is one of the three conditions for causal inference from observational data. The standard way to validate positivity is to analyze the distribution of propensity. However, to democratize the ability to do causal inference by non-experts, it is required to design an algorithm to (i) test positivity and (ii) explain where in the covariate space positivity is lacking. The latter could be used to either suggest the limitation of further causal analysis and/or encourage experimentation where positivity is violated. The contribution of this paper is first present the problem of automatic positivity analysis and secondly to propose an algorithm based on a two steps process. The first step, models the propensity condition on the covariates and then analyze the latter distribution using multiple hypothesis testing to create positivity violation labels. The second step uses asymmetrically pruned decision trees for explainability. The latter is further converted into readable text a non-expert can understand. We demonstrate our method on a proprietary data-set of a large software enterprise.

Submitted: Nov 7, 2021