Data Discretization
Data discretization, the process of transforming continuous data into discrete categories, is used across many applications to improve model interpretability, computational efficiency, and robustness to noise. Current research focuses on discretization methods tailored to specific tasks, such as methods that use neural networks to solve partial differential equations and self-interpretable models for treatment effect estimation. These advances affect fields including fluid dynamics simulation, speech synthesis, and machine learning model explainability, where they improve accuracy, efficiency, and the handling of complex data structures.
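As a concrete illustration of turning continuous values into discrete categories, the sketch below shows two textbook binning schemes, equal-width and equal-frequency. It is not taken from any of the papers listed here; the function names `equal_width_bins` and `equal_frequency_bins` are hypothetical, and the use of NumPy is an assumption made for the example.

```python
import numpy as np


def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins intervals of equal width.

    Hypothetical helper for illustration; returns (labels, edges).
    """
    values = np.asarray(values, dtype=float)
    edges = np.linspace(values.min(), values.max(), n_bins + 1)
    # Only the interior edges are passed to digitize, so the minimum
    # falls in bin 0 and the maximum falls in bin n_bins - 1.
    labels = np.digitize(values, edges[1:-1])
    return labels, edges


def equal_frequency_bins(values, n_bins):
    """Assign each value to one of n_bins intervals holding roughly
    the same number of samples (quantile binning).

    Hypothetical helper for illustration; returns (labels, edges).
    """
    values = np.asarray(values, dtype=float)
    edges = np.quantile(values, np.linspace(0.0, 1.0, n_bins + 1))
    labels = np.digitize(values, edges[1:-1])
    return labels, edges


# Skewed toy data: four small values and four large ones.
data = [1, 2, 3, 4, 100, 200, 300, 400]

width_labels, width_edges = equal_width_bins(data, n_bins=2)
freq_labels, freq_edges = equal_frequency_bins(data, n_bins=2)
```

On skewed data like this, equal-width binning places most samples in the first bin, while equal-frequency binning splits them evenly. Which behavior is preferable depends on the task, which is one reason task-specific discretization methods are studied.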
Papers
Interpretable classifiers for tabular data via discretization and feature selection
Reijo Jaakkola, Tomi Janhunen, Antti Kuusisto, Masood Feyzbakhsh Rankooh, Miikka Vilander
Tighter Generalization Bounds on Digital Computers via Discrete Optimal Transport
Anastasis Kratsios, A. Martina Neuman, Gudmund Pammer