Code Generation
Code generation research focuses on using large language models (LLMs) to automatically produce functional and secure code from natural language descriptions or other inputs. Current efforts concentrate on improving the accuracy and efficiency of code generation, for example by developing novel training objectives such as horizon-length prediction and by applying multi-agent frameworks, Monte Carlo Tree Search, and prompt engineering to guide LLMs toward better solutions. The field matters because it promises to substantially increase developer productivity and accelerate software development, while also raising questions about the security and reliability of generated code that require further investigation.
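To make the feedback-guided loop mentioned above concrete, here is a minimal Python sketch of one common pattern: generate a candidate, execute it against tests, and fold the failure output back into the prompt for the next attempt. This is an illustrative sketch, not the method of any paper listed below; query_llm is a hypothetical stand-in for whatever completion API you use, and run_candidate and generate_with_feedback are names invented for this example.

import subprocess
import sys
import tempfile


def query_llm(prompt: str) -> str:
    """Hypothetical LLM call; wire this to an actual code model or API."""
    raise NotImplementedError("replace with a real LLM backend")


def run_candidate(code: str, test_code: str) -> tuple[bool, str]:
    """Run candidate code plus its tests in a subprocess; return (passed, stderr)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code + "\n\n" + test_code)
        path = f.name
    proc = subprocess.run(
        [sys.executable, path], capture_output=True, text=True, timeout=30
    )
    return proc.returncode == 0, proc.stderr


def generate_with_feedback(task: str, test_code: str, max_attempts: int = 3) -> str | None:
    """Iteratively regenerate code, appending execution errors to the prompt."""
    prompt = f"Write a Python function for this task:\n{task}"
    for _ in range(max_attempts):
        candidate = query_llm(prompt)
        passed, errors = run_candidate(candidate, test_code)
        if passed:
            return candidate
        # Feed the failure trace back so the next attempt can repair it.
        prompt += f"\n\nThe previous attempt failed with:\n{errors}\nPlease fix the code."
    return None

The same loop structure generalizes: swapping the test harness for a compiler invocation yields compiler-feedback guidance of the kind the StepCoder paper below studies with reinforcement learning.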
Papers
Code Representation Learning At Scale
Dejiao Zhang, Wasi Ahmad, Ming Tan, Hantian Ding, Ramesh Nallapati, Dan Roth, Xiaofei Ma, Bing Xiang
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback
Shihan Dou, Yan Liu, Haoxiang Jia, Limao Xiong, Enyu Zhou, Wei Shen, Junjie Shan, Caishuang Huang, Xiao Wang, Xiaoran Fan, Zhiheng Xi, Yuhao Zhou, Tao Ji, Rui Zheng, Qi Zhang, Xuanjing Huang, Tao Gui
Designing Silicon Brains using LLM: Leveraging ChatGPT for Automated Description of a Spiking Neuron Array
Michael Tomlinson, Joe Li, Andreas Andreou
Improving Natural Language Capability of Code Large Language Model
Wei Li, Daoguang Zan, Bei Guan, Ailun Yu, Xiaolin Chen, Yongji Wang
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
Daya Guo, Qihao Zhu, Dejian Yang, Zhenda Xie, Kai Dong, Wentao Zhang, Guanting Chen, Xiao Bi, Y. Wu, Y. K. Li, Fuli Luo, Yingfei Xiong, Wenfeng Liang
Copilot Refinement: Addressing Code Smells in Copilot-Generated Python Code
Beiqi Zhang, Peng Liang, Qiong Feng, Yujia Fu, Zengyang Li