CRASS: A Novel Data Set and Benchmark to Test Counterfactual Reasoning of Large Language Models [2112.11941]