StructTest: Benchmarking LLMs' Reasoning through Compositional Structured Outputs [2412.18011]