Reinforcement Learning for Causal Discovery without Acyclicity Constraints [2408.13448]