Data Science Code Generation
Data science code generation is the automatic creation of executable code from natural-language descriptions of data analysis tasks, with the aim of accelerating the data science workflow. Current research focuses on making code generated by large language models (LLMs) more accurate and reliable, addressing problems such as hallucinations through techniques like iterative self-correction and instruction fine-tuning guided by input-output specifications. The field matters because it could substantially raise data scientists' productivity by automating tedious coding tasks and enabling faster exploration of data.
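As a rough illustration of iterative self-correction checked against an input-output specification, the loop can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of any particular paper: `self_correct`, `fake_model`, and `io_spec` are hypothetical names, and `fake_model` is a hard-coded stand-in for an LLM call so that the example runs without any external service.

```python
from typing import Callable, Optional, Tuple


def self_correct(
    task: str,
    generate: Callable[[str, Optional[str]], str],
    check: Callable[[dict], None],
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Request code, run it, and retry with the error message on failure.

    Returns the first passing candidate and the round in which it passed.
    """
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        code = generate(task, feedback)
        namespace: dict = {}
        try:
            # Run the candidate. Real systems should sandbox untrusted code.
            exec(code, namespace)
            # Input-output specification: raises if the behavior is wrong.
            check(namespace)
            return code, round_no
        except Exception as exc:
            # Feed the failure back into the next generation request.
            feedback = f"{type(exc).__name__}: {exc}"
    raise RuntimeError(
        f"no passing candidate after {max_rounds} rounds; last error: {feedback}"
    )


def fake_model(task: str, feedback: Optional[str]) -> str:
    """Hypothetical stand-in for an LLM: buggy first draft, fixed on retry."""
    if feedback is None:
        # First draft contains a typo (`row` instead of `rows`).
        return (
            "def column_mean(rows, key):\n"
            "    return sum(r[key] for r in rows) / len(row)\n"
        )
    return (
        "def column_mean(rows, key):\n"
        "    return sum(r[key] for r in rows) / len(rows)\n"
    )


def io_spec(namespace: dict) -> None:
    """Hypothetical input-output specification for the task."""
    rows = [{"x": 1.0}, {"x": 3.0}]
    assert namespace["column_mean"](rows, "x") == 2.0


code, rounds = self_correct("compute the mean of a column", fake_model, io_spec)
print(f"accepted after {rounds} round(s)")
```

In this toy run the first draft fails the specification with a `NameError`, the error text is passed back as feedback, and the second draft is accepted. Replacing `fake_model` with a real model call and `io_spec` with task-specific checks keeps the same control flow.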