Real World Code
Real-world code research aims to close the gap between large language models (LLMs) and practical software development by improving the quality, security, and efficiency of automatically generated code. Current work emphasizes generating equivalent code representations, ensuring code correctness through techniques such as hierarchical debugging and polyhedral modeling, and mitigating security vulnerabilities through prompt optimization and generative adversarial networks. The field matters because it directly affects software engineering practice, with the potential to raise developer productivity and improve software reliability and security.
Papers
CORECODE: A Common Sense Annotated Dialogue Dataset with Benchmark Tasks for Chinese Large Language Models
Dan Shi, Chaobin You, Jiantao Huang, Taihao Li, Deyi Xiong
CodeLL: A Lifelong Learning Dataset to Support the Co-Evolution of Data and Language Models of Code
Martin Weyssow, Claudio Di Sipio, Davide Di Ruscio, Houari Sahraoui