Code Authorship Attribution

Code authorship attribution aims to identify the author of a piece of code from its stylistic features, a task with applications in software security and forensics. Current research focuses on improving attribution accuracy with ensemble learning methods such as Random Forests and Gradient Boosting, often combined with techniques that make the models more interpretable. A second major line of work develops attribution methods that remain robust against adversarial attacks designed to obfuscate authorship; recent work explores both offensive and defensive strategies, using techniques such as code transformation and adversarial training. These advances matter both for software security and for the privacy concerns that code authorship raises.
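To make the idea of attribution from stylistic features concrete, the following is a minimal sketch and not any particular paper's method. It assumes a tiny, hypothetical feature set (blank-line, tab-indentation, snake_case, and camelCase ratios) and swaps the ensemble learners mentioned above for a plain nearest-neighbour comparison, so that it runs with only the Python standard library. The function names `style_features` and `attribute` are illustrative.

```python
import math
import re
from typing import Dict


def style_features(source: str) -> Dict[str, float]:
    """Compute a few illustrative (hypothetical) style ratios, each in [0, 1]."""
    lines = source.splitlines() or [""]
    non_empty = [ln for ln in lines if ln.strip()]
    identifiers = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source)
    n_ident = max(len(identifiers), 1)
    n_code = max(len(non_empty), 1)
    return {
        # Share of lines that are blank or whitespace-only.
        "blank_line_ratio": (len(lines) - len(non_empty)) / len(lines),
        # Share of non-empty lines indented with a tab.
        "tab_indent_ratio": sum(ln.startswith("\t") for ln in non_empty) / n_code,
        # Share of identifiers with an inner underscore (snake_case).
        "snake_case_ratio": sum("_" in name.strip("_") for name in identifiers) / n_ident,
        # Share of identifiers with a lower-to-upper transition (camelCase).
        "camel_case_ratio": sum(
            bool(re.search(r"[a-z][A-Z]", name)) for name in identifiers
        ) / n_ident,
    }


def attribute(sample: str, known: Dict[str, str]) -> str:
    """Return the known author whose reference code is stylistically closest.

    Assumption: one reference snippet per author and an unweighted
    Euclidean distance; a real system would train a classifier on many
    samples and far richer features.
    """
    target = style_features(sample)

    def distance(author: str) -> float:
        feats = style_features(known[author])
        return math.sqrt(sum((feats[k] - target[k]) ** 2 for k in target))

    return min(known, key=distance)
```

Calling `attribute(sample, {"alice": alice_code, "bob": bob_code})` returns the name of the closest reference author. Because the decision rests on a handful of surface ratios, a simple code transformation such as renaming identifiers or re-indenting could change the result, which is the kind of weakness the adversarial work described above targets.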

Papers