A Token Oriented Measurement Method of Source Code Similarity

Abstract:

To help teachers identify plagiarism in students' source code submissions quickly and accurately, this paper discusses a method for measuring source code similarity. In the proposed algorithm, first, both the token-oriented edit distance (TD) and the token-oriented length of the longest common subsequence (TLCSLen) are calculated; second, a similarity formula that combines TD and TLCSLen is given to measure the similarity of source code; third, a dynamic, variable similarity threshold is set to determine whether plagiarism exists between source codes, which ensures a relatively reasonable judgment of plagiarism. This method has been applied to the university's online submission system for programming coursework and to its online examination system. Practical application results show that this method can identify similar source code promptly, effectively, and accurately.
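The first two steps can be illustrated with a minimal Python sketch. The abstract gives neither the tokenization rules, the similarity formula, nor the threshold rule, so everything below is an assumption made for illustration: `tokenize` uses a hypothetical regular expression, `similarity` uses a hypothetical ratio TLCSLen / (TLCSLen + TD) rather than the paper's formula, and the dynamic threshold is omitted entirely.

```python
import re

# Hypothetical tokenizer: identifiers/keywords, integer literals, or any
# single non-whitespace symbol. The paper's real token rules are not given.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenize(source):
    """Split source text into a list of tokens, discarding whitespace."""
    return TOKEN_RE.findall(source)


def token_edit_distance(a, b):
    """Levenshtein distance between two token lists (TD), counted in tokens."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, 1):
        curr = [i]
        for j, tok_b in enumerate(b, 1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def token_lcs_length(a, b):
    """Length of the longest common subsequence of two token lists (TLCSLen)."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, 1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(src_a, src_b):
    """Hypothetical score in [0, 1]; NOT the formula from the paper."""
    a, b = tokenize(src_a), tokenize(src_b)
    if not a and not b:
        return 1.0  # two empty programs are treated as identical
    td = token_edit_distance(a, b)
    lcs = token_lcs_length(a, b)
    return lcs / (lcs + td)
```

Under these assumptions, two programs that differ only in one renamed variable, such as `int a = 1;` and `int b = 1;`, share four of five tokens and need one token substitution, so the illustrative score is 4 / (4 + 1). Working on tokens rather than characters means that a long identifier and a short one each count as a single edit.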

Info:

Pages: 899-902

Online since: October 2014

Copyright: © 2014 Trans Tech Publications Ltd. All Rights Reserved
