Code Authorship Attribution

Code authorship attribution aims to identify the author of a piece of code from its stylistic features, a task with implications for software security and forensics. Current research focuses on improving attribution accuracy with ensemble learning methods such as Random Forests and Gradient Boosting, often combined with techniques that make the models more interpretable. A second significant line of investigation concerns robustness: developing attribution methods that resist adversarial attacks designed to obfuscate authorship. Recent work here explores both defensive and offensive strategies, using techniques such as adversarial training and code transformation. These advances matter both for software security and for the privacy concerns that code authorship identification raises.
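To make the idea of attributing code by stylistic features concrete, the following is a minimal sketch and not the method of any paper listed here. It extracts a few hand-picked style features (indentation, naming convention, brace placement) and assigns an unknown snippet to the author whose average feature vector is closest. The nearest-centroid rule is a deliberately simple stand-in for the ensemble models mentioned above. The feature set, the function names, and the two toy authors "alice" and "bob" are all assumptions made for illustration.

```python
import math
import re
from collections import defaultdict


def style_features(source: str) -> list[float]:
    """Turn source text into a small vector of layout and naming ratios.

    The features are illustrative assumptions, not a published feature set.
    """
    lines = source.splitlines() or [""]
    identifiers = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source)
    n_lines = len(lines)
    n_ids = max(len(identifiers), 1)
    return [
        # Share of lines indented with a tab.
        sum(l.startswith("\t") for l in lines) / n_lines,
        # Share of lines indented with at least two spaces.
        sum(l.startswith("  ") for l in lines) / n_lines,
        # Share of identifiers containing an underscore (snake_case).
        sum("_" in i for i in identifiers) / n_ids,
        # Share of identifiers with a lower-to-upper transition (camelCase).
        sum(bool(re.search(r"[a-z][A-Z]", i)) for i in identifiers) / n_ids,
        # Share of lines that end a statement with an opening brace.
        sum(l.rstrip().endswith("{") and l.strip() != "{" for l in lines) / n_lines,
        # Share of lines consisting only of an opening brace.
        sum(l.strip() == "{" for l in lines) / n_lines,
    ]


def train_centroids(samples: list[tuple[str, str]]) -> dict[str, list[float]]:
    """Average the feature vectors of each author's known samples."""
    grouped: dict[str, list[list[float]]] = defaultdict(list)
    for author, source in samples:
        grouped[author].append(style_features(source))
    return {
        author: [sum(col) / len(col) for col in zip(*vectors)]
        for author, vectors in grouped.items()
    }


def attribute(source: str, centroids: dict[str, list[float]]) -> str:
    """Return the author whose centroid is nearest in Euclidean distance."""
    feats = style_features(source)
    return min(centroids, key=lambda author: math.dist(feats, centroids[author]))


# Toy training data: two hypothetical authors with distinct habits.
TRAINING = [
    ("alice", "int add_two(int first_value) {\n    int new_value = first_value + 2;\n    return new_value;\n}\n"),
    ("alice", "int half_of(int input_value) {\n    return input_value / 2;\n}\n"),
    ("bob", "int addTwo(int firstValue)\n{\n\tint newValue = firstValue + 2;\n\treturn newValue;\n}\n"),
    ("bob", "int halfOf(int inputValue)\n{\n\treturn inputValue / 2;\n}\n"),
]
CENTROIDS = train_centroids(TRAINING)

UNKNOWN_A = "int scale_up(int base_value) {\n    int scaled_value = base_value * 3;\n    return scaled_value;\n}\n"
UNKNOWN_B = "int scaleUp(int baseValue)\n{\n\tint scaledValue = baseValue * 3;\n\treturn scaledValue;\n}\n"

if __name__ == "__main__":
    print(attribute(UNKNOWN_A, CENTROIDS))
    print(attribute(UNKNOWN_B, CENTROIDS))
```

Because the sketch relies only on surface features such as naming and layout, reformatting the code or renaming identifiers would change its prediction. That fragility is the weakness the adversarial and robustness work listed below is concerned with.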

Papers

January 14, 2025

I Can Find You in Seconds! Leveraging Large Language Models for Code Authorship Attribution
Soohyeon Choi, Yong Kiam Tan, Mark Huasong Meng, Mohamed Ragab, Soumik Mondal, David Mohaisen, Khin Mi Mi Aung
Real World Code Authorship Attribution Authorship Analysis Code Authorship Attribution

December 6, 2024

OCEAN: Open-World Contrastive Authorship Identification
Felix Mächtle, Jan-Niclas Serr, Nils Loose, Jonas Sander, Thomas Eisenbarth
Source Code Authorship Attribution Upper Ocean Open World Semi Supervised Code Authorship Attribution

June 28, 2024

AuthAttLyzer-V2: Unveiling Code Authorship Attribution using Enhanced Ensemble Learning Models & Generating Benchmark Dataset
Bhaskar Joshi, Sepideh HajiHosseinKhani, Arash Habibi Lashkari
Ensemble Learning Source Code Authorship Attribution Authorship Verification Code Authorship Attribution

November 26, 2023

Untargeted Code Authorship Evasion with Seq2Seq Transformation
Soohyeon Choi, Rhongho Jang, DaeHun Nyang, David Mohaisen
Seq2seq Model Code Data Code Translation Authorship Obfuscation Code Authorship Attribution

April 26, 2023

SHIELD: Thwarting Code Authorship Attribution
Mohammed Abuhamad, Changhun Jung, David Mohaisen, DaeHun Nyang
Authorship Attribution Shield Machine Adversarial Code Code Authorship Attribution

February 12, 2022

RoPGen: Towards Robust Code Authorship Attribution via Automatic Coding Style Transformation
Zhen Li, Guenevere (Qian) Chen, Chen Chen, Yayi Zou, Shouhuai Xu
Adversarial Example Code Style Transfer Code Authorship Attribution