Functional Correctness

Functional correctness, the ability of code to perform its intended function, is a central concern in evaluating code generation models, particularly large language models (LLMs). Current research looks beyond functional correctness alone to code efficiency, diversity, and adherence to coding style guidelines, and often uses graph-matching networks or embedding-based methods for evaluation. These efforts aim to make code generation systems more robust and reliable, which could automate parts of the coding process and improve code quality beyond simple functionality.
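
To make the definition concrete, here is a minimal sketch of one way to check functional correctness: run a generated function against input/output cases and see whether every case passes. Execution-based checking is an assumption chosen for illustration, since the summary above does not name a specific procedure, and the names `is_functionally_correct`, `GOOD`, `BAD`, and `CASES` are hypothetical.

```python
from typing import Any, Dict, List, Tuple

# Each case pairs positional arguments with the expected return value.
Case = Tuple[tuple, Any]


def is_functionally_correct(source: str, entry_point: str, cases: List[Case]) -> bool:
    """Return True if the function named `entry_point` in `source` passes every case.

    Illustrative only: exec() runs arbitrary code, so a real harness would
    isolate untrusted generated code in a sandbox and enforce a timeout.
    """
    namespace: Dict[str, Any] = {}
    try:
        exec(source, namespace)          # define the candidate function
        func = namespace[entry_point]    # KeyError if the function is missing
        return all(func(*args) == expected for args, expected in cases)
    except Exception:
        # Syntax errors, missing functions, and runtime errors all count as failures.
        return False


# Two hypothetical model outputs for the task "add two numbers".
GOOD = "def add(a, b):\n    return a + b\n"
BAD = "def add(a, b):\n    return a - b\n"
CASES: List[Case] = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]

if __name__ == "__main__":
    print(is_functionally_correct(GOOD, "add", CASES))
    print(is_functionally_correct(BAD, "add", CASES))
```

A check like this only says whether the code behaves as intended on the given cases. It says nothing about efficiency, diversity, or style, which is why those are evaluated separately.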

Papers