Program Synthesis Benchmark
Program synthesis benchmarks evaluate the ability of artificial intelligence models to generate code from natural language descriptions or other inputs. Current research focuses on building more robust and comprehensive benchmarks that assess multiple aspects of code generation, including generalization, support for different programming paradigms, and synthesis of code for complex tasks. Such benchmarks are essential for measuring the progress of program synthesis approaches such as large language models, genetic programming, and multi-agent systems, and for guiding the development of more effective and efficient code generation techniques, with implications for software engineering and automation.
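To make the evaluation loop concrete, the sketch below shows how a minimal benchmark harness of this kind might work: each task pairs a natural-language prompt with hidden unit tests, candidate programs are executed against those tests, and correctness is summarized with the widely used pass@k estimator. The `Task`, `run_candidate`, and `check_add` names are illustrative placeholders, not the API of any specific benchmark, and real harnesses sandbox candidate execution rather than calling `exec` directly.

```python
"""Minimal, illustrative program-synthesis benchmark harness (assumptions noted in comments)."""
from dataclasses import dataclass
from math import comb
from typing import Callable, Dict


@dataclass
class Task:
    prompt: str                   # natural-language description shown to the model
    test: Callable[[Dict], None]  # hidden tests; raise AssertionError if the candidate is wrong


def run_candidate(code: str, task: Task) -> bool:
    """Execute candidate code in a fresh namespace and run the hidden tests.
    Real benchmarks sandbox this step; it is kept minimal here for illustration."""
    namespace: Dict = {}
    try:
        exec(code, namespace)
        task.test(namespace)
        return True
    except Exception:
        return False


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k samples,
    drawn from n generated candidates of which c are correct, passes."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Toy task with hand-written "model outputs" standing in for generated code.
    def check_add(ns: Dict) -> None:
        assert ns["add"](1, 2) == 3
        assert ns["add"](-1, 1) == 0

    task = Task(prompt="Write a function add(a, b) returning the sum of a and b.",
                test=check_add)

    candidates = [
        "def add(a, b):\n    return a + b",  # correct
        "def add(a, b):\n    return a - b",  # wrong result
        "def add(a, b) return a + b",        # syntax error
    ]
    results = [run_candidate(code, task) for code in candidates]
    c = sum(results)
    print(f"{c}/{len(results)} candidates passed, pass@1 = {pass_at_k(len(results), c, 1):.2f}")
```

In practice a benchmark aggregates this per-task result over hundreds of tasks and over multiple samples per prompt, which is why the unbiased pass@k estimator is preferred over simply counting the first sample.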