Paper ID: 2112.01880

Bayes in Wonderland! Predictive supervised classification inference hits unpredictability

Ali Amiryousefi, Ville Kinnula, Jing Tang

The marginal Bayesian predictive classifiers (mBpc) as opposed to the simultaneous Bayesian predictive classifiers (sBpc), handle each data separately and hence tacitly assumes the independence of the observations. However, due to saturation in learning of generative model parameters, the adverse effect of this false assumption on the accuracy of mBpc tends to wear out in face of increasing amount of training data; guaranteeing the convergence of these two classifiers under de Finetti type of exchangeability. This result however, is far from trivial for the sequences generated under Partition exchangeability (PE), where even umpteen amount of training data is not ruling out the possibility of an unobserved outcome (Wonderland!). We provide a computational scheme that allows the generation of the sequences under PE. Based on that, with controlled increase of the training data, we show the convergence of the sBpc and mBpc. This underlies the use of simpler yet computationally more efficient marginal classifiers instead of simultaneous. We also provide a parameter estimation of the generative model giving rise to the partition exchangeable sequence as well as a testing paradigm for the equality of this parameter across different samples. The package for Bayesian predictive supervised classifications, parameter estimation and hypothesis testing of the Ewens Sampling formula generative model is deposited on CRAN as PEkit package and free available from https://github.com/AmiryousefiLab/PEkit.

Submitted: Dec 3, 2021