Paper ID: 2212.00751

P(Expression|Grammar): Probability of deriving an algebraic expression with a probabilistic context-free grammar

Urh Primožič, Ljupčo Todorovski, Matej Petković

Probabilistic context-free grammars have a long record of use as generative models in machine learning and symbolic regression. When used for symbolic regression, they generate algebraic expressions. We define algebraic expressions as equivalence classes of strings derived by a grammar and address the problem of calculating the probability of deriving a given expression with a given grammar. We show that the problem is undecidable in general. We then present specific grammars for generating linear, polynomial, and rational expressions, for which algorithms for calculating the probability of a given expression exist. For those grammars, we design algorithms that calculate the exact probability and efficiently approximate it to arbitrary precision.
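To illustrate what "probability of an expression" means when an expression is an equivalence class of strings, here is a minimal sketch on a toy grammar. Everything in it is an assumption made for illustration and is not taken from the paper: the grammar E -> E + V [P_PLUS] | V [1 - P_PLUS], V -> x [P_X] | y [1 - P_X], the chosen rule probabilities, the function names, and the choice to treat two strings as equivalent only when they differ by the order of their terms (commutativity of +).

```python
from itertools import product
from math import isclose

# Hypothetical rule probabilities for the toy grammar (illustrative values).
P_PLUS = 0.4  # E -> E + V   (E -> V has probability 1 - P_PLUS)
P_X = 0.7     # V -> x       (V -> y has probability 1 - P_X)


def string_probability(terms):
    """Probability of deriving the string t1 + t2 + ... + tn.

    The toy grammar is unambiguous, so each string has exactly one
    derivation: n - 1 uses of E -> E + V, one use of E -> V, and one
    V rule per term.
    """
    n = len(terms)
    prob = P_PLUS ** (n - 1) * (1 - P_PLUS)
    for t in terms:
        prob *= P_X if t == "x" else 1 - P_X
    return prob


def canonical(terms):
    """Canonical representative of an equivalence class under commutativity."""
    return tuple(sorted(terms))


def expression_probability(target):
    """Sum the probabilities of all strings equivalent to `target`.

    Under commutativity alone, equivalent strings have the same number
    of terms, so the class is finite and can be enumerated directly.
    """
    key = canonical(target)
    return sum(
        string_probability(s)
        for s in product("xy", repeat=len(target))
        if canonical(s) == key
    )


# The expression x + y collects the strings "x + y" and "y + x".
p_xy = expression_probability(("x", "y"))
assert isclose(p_xy, 2 * P_PLUS * (1 - P_PLUS) * P_X * (1 - P_X))
```

In this toy setting every equivalence class is finite, so the sum is exact. With richer notions of equivalence a class may contain infinitely many strings, in which case a finite enumeration like this one can only give a lower bound on the probability.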

Submitted: Dec 1, 2022