Code LLM
Code LLMs are large language models trained on massive code corpora to support software engineering tasks such as code generation, completion, and bug detection. Current research focuses on enhancing these models by incorporating code structure (e.g., via graph neural networks), improving training data quality through careful curation and pruning, and defending against threats such as adversarial attacks and trojan insertion. These advances matter because they improve the reliability and security of automatically generated code, with the potential to increase developer productivity and overall software quality.
Papers
GCoder: Improving Large Language Model for Generalized Graph Problem Solving
Qifan Zhang, Xiaobin Hong, Jianheng Tang, Nuo Chen, Yuhan Li, Wenzhong Li, Jing Tang, Jia Li
Aligning CodeLLMs with Direct Preference Optimization
Yibo Miao, Bofei Gao, Shanghaoran Quan, Junyang Lin, Daoguang Zan, Jiaheng Liu, Jian Yang, Tianyu Liu, Zhijie Deng