Enhancing the performance of code-clone detection tools using code2vec

  • Journal of Software Assessment and Valuation
  • Abbr : JSAV
  • 2021, 17(1), pp.31-40
  • DOI : 10.29056/jsav.2021.06.05
  • Publisher : Korea Software Assessment and Valuation Society
  • Research Area : Engineering > Computer Science
  • Received : June 5, 2021
  • Accepted : June 20, 2021
  • Published : June 30, 2021

Taeho Um 1, Sung-Moon Hong 1, Joon Hyuk Yang 1, Hyo Seok Jang 1, Kyung-Goo Doh 1

1 Hanyang University

Accredited

ABSTRACT

Plagiarism is the act of presenting someone else's original work as one's own without acknowledging the source. Plagiarism of source code causes a variety of problems, including legal disputes. Plagiarism between software projects is usually determined by measuring the similarity of every pair of source code across the two projects. However, blindly comparing every pair imposes a heavy computational burden, which is a major reason why more accurate tools go unused. If we compare only the pairs that are likely to be clones and eliminate the pairs that cannot be clones, we can concentrate on improving detection accuracy. In this paper, we propose a method for selecting highly probable clone-pair candidates by pre-classifying suspected source code using a machine-learning model called code2vec.
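The pre-filtering idea can be illustrated with a minimal sketch. It assumes that each code unit has already been mapped to a fixed-length vector (for instance, by a code2vec-style model) and that candidates are kept when the cosine similarity of their vectors reaches a threshold. The abstract does not say how the pre-classification is actually done, so the similarity measure, the 0.8 threshold, and the names `cosine` and `candidate_pairs` are all hypothetical choices made for illustration.

```python
import math
from itertools import product


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def candidate_pairs(project_a, project_b, threshold=0.8):
    """Return the names of cross-project pairs whose vectors are similar enough.

    project_a, project_b: dicts mapping a code-unit name to its embedding
    vector (assumed to be precomputed; not produced here).
    threshold: hypothetical cut-off, not a value taken from the paper.
    """
    return [
        (name_a, name_b)
        for (name_a, vec_a), (name_b, vec_b) in product(
            project_a.items(), project_b.items()
        )
        if cosine(vec_a, vec_b) >= threshold
    ]


# Toy vectors standing in for real embeddings.
project_a = {"a.sort": [1.0, 0.0], "a.parse": [0.0, 1.0]}
project_b = {"b.sort": [0.9, 0.1], "b.log": [-1.0, 0.0]}
survivors = candidate_pairs(project_a, project_b)
```

In this sketch every pair is still visited once, but only with an inexpensive vector comparison; the costlier, more accurate clone detector would then run only on the surviving pairs.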
