Paper ID: 2412.02126
Benchmarking symbolic regression constant optimization schemes
L.G.A dos Reis, V.L.P.S. Caminha, T.J.P.Penna
Symbolic regression is a machine learning technique, and it has seen many advancements in recent years, especially in genetic programming approaches (GPSR). Furthermore, it has been known for many years that constant optimization of parameters, during the evolutionary search, greatly increases GPSR performance However, different authors approach such tasks differently and no consensus exists regarding which methods perform best. In this work, we evaluate eight different parameter optimization methods, applied during evolutionary search, over ten known benchmark problems, in two different scenarios. We also propose using an under-explored metric called Tree Edit Distance (TED), aiming to identify symbolic accuracy. In conjunction with classical error measures, we develop a combined analysis of model performance in symbolic regression. We then show that different constant optimization methods perform better in certain scenarios and that there is no overall best choice for every problem. Finally, we discuss how common metric decisions may be biased and appear to generate better models in comparison.
Submitted: Dec 3, 2024