Paper ID: 2410.12509

Benchmarking Defeasible Reasoning with Large Language Models -- Initial Experiments and Future Directions

Ilias Tachmazidis, Sotiris Batsakis, Grigoris Antoniou

Large Language Models (LLMs) have gained prominence in the AI landscape due to their exceptional performance. It is therefore essential to develop a better understanding of their capabilities and limitations, including with respect to nonmonotonic reasoning. This paper proposes a benchmark covering various defeasible rule-based reasoning patterns. We adapted an existing benchmark for defeasible logic reasoners by translating defeasible rules into text suitable for LLMs. We conducted preliminary experiments on nonmonotonic rule-based reasoning using ChatGPT and compared its responses with the reasoning patterns defined by defeasible logic.
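The abstract does not spell out the rule-to-text translation scheme, but the general idea can be illustrated with a minimal sketch. Below is a hypothetical Python rendering (the `Rule`, `rule_to_text`, and `superiority_to_text` names are assumptions, not the paper's code) that verbalizes strict rules, defeasible rules, and a superiority relation for the classic "Tweety" theory of defeasible logic.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    label: str               # rule identifier, e.g. "r1"
    antecedents: list[str]   # body atoms as noun/adjective phrases
    consequent: str          # head atom; a leading "~" marks negation
    strict: bool = False     # strict (->) vs. defeasible (=>) rule

def rule_to_text(rule: Rule) -> str:
    """Render one rule as an English sentence an LLM can consume."""
    body = " and ".join(f"X is {a}" for a in rule.antecedents)
    negated = rule.consequent.startswith("~")
    head = rule.consequent.lstrip("~")
    # Strict rules hold without exception; defeasible rules hold typically.
    qualifier = "definitely" if rule.strict else "typically"
    polarity = " not" if negated else ""
    return f"If {body}, then X is {qualifier}{polarity} {head}."

def superiority_to_text(stronger: Rule, weaker: Rule) -> str:
    """Render a superiority relation (stronger > weaker)."""
    return f"When both apply, rule {stronger.label} overrides rule {weaker.label}."

# Classic theory: birds typically fly, penguins are birds,
# penguins typically do not fly, and the penguin rule wins.
r1 = Rule("r1", ["a bird"], "able to fly")
r2 = Rule("r2", ["a penguin"], "a bird", strict=True)
r3 = Rule("r3", ["a penguin"], "~able to fly")

for sentence in (rule_to_text(r1), rule_to_text(r2),
                 rule_to_text(r3), superiority_to_text(r3, r1)):
    print(sentence)
```

Given these sentences plus the fact that Tweety is a penguin, the model would be asked whether Tweety is able to fly; defeasible logic concludes it is not, since r3 overrides r1, and the benchmark checks whether the LLM's answer matches that conclusion.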

Submitted: Oct 16, 2024