Paper ID: 2403.19883
Policy-Space Search: Equivalences, Improvements, and Compression
Frederico Messa, André Grahl Pereira
Fully-observable non-deterministic (FOND) planning is at the core of artificial intelligence planning with uncertainty. It models uncertainty through actions with non-deterministic effects. A* with Non-Determinism (AND*) (Messa and Pereira, 2023) is a FOND planner that generalizes A* (Hart et al., 1968) for FOND planning. It searches for a solution policy by performing an explicit heuristic search on the policy space of the FOND task. In this paper, we study and improve the performance of the policy-space search performed by AND*. We present a polynomial-time procedure that constructs a solution policy given just the set of states that should be mapped. This procedure, together with a better understanding of the structure of FOND policies, allows us to present three concepts of equivalences between policies. We use policy equivalences to prune part of the policy search space, making AND* substantially more effective in solving FOND tasks. We also study the impact of taking into account structural state-space symmetries to strengthen the detection of equivalence policies and the impact of performing the search with satisficing techniques. We apply a recent technique from the group theory literature to better compute structural state-space symmetries. Finally, we present a solution compressor that, given a policy defined over complete states, finds a policy that unambiguously represents it using the minimum number of partial states. AND* with the introduced techniques generates, on average, two orders of magnitude fewer policies to solve FOND tasks. These techniques allow explicit policy-space search to be competitive in terms of both coverage and solution compactness with other state-of-the-art FOND planners.
Submitted: Mar 28, 2024