Code Clone Detection

Code clone detection identifies duplicated or highly similar code segments within or across software projects, with the goal of improving software maintainability and quality. Current research focuses on improving detection accuracy and efficiency, particularly for semantically similar clones (code that shares functionality but differs in syntax), using techniques such as contrastive learning, graph-based models, and large language models (LLMs). These advances matter for several software development problems, including copyright infringement detection, understanding code evolution in large frameworks, and improving the reliability of AI-assisted coding tools.
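As a rough illustration of why semantically similar clones are hard to detect, the sketch below defines two Python snippets that compute the same result with different syntax, then compares them with a naive token-overlap score. The snippets, the helper names (`load_function`, `token_set`, `jaccard`, `same_behavior`), and the choice of Jaccard similarity are all assumptions made for this example; they do not come from any particular paper or tool, and real detectors use far richer representations.

```python
import io
import tokenize

# Two hypothetical snippets with the same functionality but different syntax.
LOOP_SRC = """
def sum_of_squares(values):
    total = 0
    for v in values:
        total += v * v
    return total
"""

FUNCTIONAL_SRC = """
def sum_of_squares(values):
    return sum(x ** 2 for x in values)
"""


def load_function(source, name):
    """Execute a source string in a fresh namespace and return one function."""
    namespace = {}
    exec(source, namespace)
    return namespace[name]


def token_set(source):
    """Collect the distinct name, operator, and number tokens of a snippet."""
    keep = {tokenize.NAME, tokenize.OP, tokenize.NUMBER}
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return {tok.string for tok in tokens if tok.type in keep}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def same_behavior(f, g, inputs):
    """Check that two functions agree on every sample input."""
    return all(f(x) == g(x) for x in inputs)


loop_fn = load_function(LOOP_SRC, "sum_of_squares")
functional_fn = load_function(FUNCTIONAL_SRC, "sum_of_squares")

samples = [[], [1], [1, 2, 3], [-4, 5]]
agree = same_behavior(loop_fn, functional_fn, samples)
similarity = jaccard(token_set(LOOP_SRC), token_set(FUNCTIONAL_SRC))

print("same behavior on samples:", agree)
print("token Jaccard similarity:", round(similarity, 2))
```

The two functions agree on every sample input, yet their token sets only partly overlap, so a purely lexical comparison would understate how similar they are. Closing that gap is the motivation for the learned and graph-based approaches mentioned above.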

Papers