Paper ID: 2410.04485

Exploring the Potential of Conversational Test Suite Based Program Repair on SWE-bench

Anton Cheshkov, Pavel Zadorozhny, Rodion Levichev, Evgeny Maslov, Ronaldo Franco Jaldin

Automatic program repair at project level may open yet to be seen opportunities in various fields of human activity. Since the SWE-Bench challenge was presented, we have seen numerous of solutions. Patch generation is a part of program repair, and test suite-based conversational patch generation has proven its effectiveness. However, the potential of conversational patch generation has not yet specifically estimated on SWE-Bench. This study reports experimental results aimed at evaluating the individual effectiveness of conversational patch generation on problems from SWE-Bench. The experiments show that a simple conversational pipeline based on LLaMA 3.1 70B can generate valid patches in 47\% of cases, which is comparable to the state-of-the-art in program repair on SWE-Bench.

Submitted: Oct 6, 2024