Code Similarity

Code similarity assessment quantifies how closely different code snippets resemble one another, which is central to tasks such as plagiarism detection, code clone identification, and software maintenance. Current research emphasizes robust metrics that capture semantic similarity beyond superficial textual matches, often using large language models (LLMs) alongside techniques such as abstract syntax tree (AST) comparison and graph neural networks to analyze code structure and functionality. These advances aim to improve software quality, aid code comprehension, and address the challenges posed by the growing volume of AI-generated code.
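
To make the contrast between textual matching and AST comparison concrete, the following minimal Python sketch scores two snippets both ways. It is an illustration only, not a method taken from any particular paper: the helper names (`text_similarity`, `ast_similarity`, `_RenameIdentifiers`) and the two sample snippets are hypothetical, and the approach (blanking out identifiers, then comparing `ast.dump` strings with `difflib`) captures only syntactic structure, not the deeper semantic similarity that LLM- or graph-based methods target.

```python
import ast
import difflib


class _RenameIdentifiers(ast.NodeTransformer):
    """Replace variable, argument, and function names with a placeholder."""

    def visit_Name(self, node):
        node.id = "_"
        return node

    def visit_arg(self, node):
        node.arg = "_"
        return node

    def visit_FunctionDef(self, node):
        node.name = "_"
        self.generic_visit(node)  # also rename names inside the function
        return node


def text_similarity(a: str, b: str) -> float:
    """Character-level similarity of the raw source text, in [0, 1]."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def ast_similarity(a: str, b: str) -> float:
    """Similarity of the identifier-normalized AST dumps, in [0, 1]."""
    def normalize(src: str) -> str:
        tree = _RenameIdentifiers().visit(ast.parse(src))
        return ast.dump(tree)  # dump omits line numbers and comments
    return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()


# Two hypothetical snippets: same logic, different names and one added comment.
snippet_a = (
    "def total(xs):\n"
    "    s = 0\n"
    "    for x in xs:\n"
    "        s += x\n"
    "    return s\n"
)
snippet_b = (
    "def accumulate(values):\n"
    "    # running sum\n"
    "    result = 0\n"
    "    for item in values:\n"
    "        result += item\n"
    "    return result\n"
)

if __name__ == "__main__":
    print("text similarity:", round(text_similarity(snippet_a, snippet_b), 3))
    print("AST similarity: ", ast_similarity(snippet_a, snippet_b))
```

Because the two snippets differ only in naming and a comment, their normalized trees are identical and the AST score is 1.0, while the raw-text score is lower. A rewrite that changes the structure (for example, replacing the loop with a call to `sum`) would lower the AST score as well, which is the gap that semantic approaches try to close.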

Papers