Software Source Code Plagiarism and Direction Detection Based on PDG

Abstract:

Because software development is complex, some developers plagiarize source code from other projects or from open source software in order to shorten the development cycle. The copyist usually modifies and disguises the copied code to escape plagiarism detection. So far, most algorithms cannot fully detect source code that has been disguised in this way and, in particular, cannot reliably distinguish the original code from the plagiarized code. In this paper, we summarize and analyze how disguised source code affects the detection process, design a strategy to remove that effect, and propose a software source code plagiarism detection algorithm based on the program dependence graph (PDG). The algorithm can detect disguised source code and thereby identify source code plagiarism. We also propose a heuristic rule that enables the detection algorithm to report the plagiarism direction, a capability that existing algorithms lack. We demonstrate the effectiveness of the algorithm experimentally.
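The abstract does not describe the algorithm itself, so the following is only a minimal illustrative sketch of why a dependence graph can resist common disguises, not the authors' method. It assumes straight-line code given as hypothetical `(kind, defined_variable, used_variables)` tuples, models data dependences only (a full PDG also has control dependences), and compares graphs with a Weisfeiler-Lehman style signature. Equal signatures are a necessary condition for graph isomorphism, not a sufficient one, and the sketch says nothing about plagiarism direction. The names `build_dependence_graph` and `signature` are invented for this example.

```python
from collections import Counter


def build_dependence_graph(stmts):
    """Build a data-dependence graph from straight-line statements.

    stmts: list of (kind, defined_var_or_None, used_vars) tuples.
    Returns (labels, edges); node labels keep only the statement kind,
    so variable names do not influence the graph.
    """
    last_def = {}  # variable name -> index of the statement that last defined it
    labels, edges = [], set()
    for i, (kind, target, uses) in enumerate(stmts):
        labels.append(kind)
        for var in uses:
            if var in last_def:
                edges.add((last_def[var], i))  # edge: definition -> use
        if target is not None:
            last_def[target] = i
    return labels, edges


def signature(labels, edges, rounds=3):
    """Refine each node label with its neighbours' labels a few times.

    The result is a multiset of labels, so it does not depend on
    statement order or on variable names.
    """
    current = list(labels)
    for _ in range(rounds):
        refined = []
        for n in range(len(current)):
            preds = sorted(current[a] for a, b in edges if b == n)
            succs = sorted(current[b] for a, b in edges if a == n)
            refined.append(f"{current[n]}|in:{preds}|out:{succs}")
        current = refined
    return Counter(current)


# a = 2; b = 3; s = a + b; print(s)
original = [("const", "a", []), ("const", "b", []),
            ("add", "s", ["a", "b"]), ("print", None, ["s"])]

# Same program with variables renamed and the two constants swapped.
disguised = [("const", "y", []), ("const", "x", []),
             ("add", "total", ["x", "y"]), ("print", None, ["total"])]

# A different program: a = 2; s = a * a; print(s)
unrelated = [("const", "a", []), ("mul", "s", ["a", "a"]),
             ("print", None, ["s"])]

same = signature(*build_dependence_graph(original)) == \
    signature(*build_dependence_graph(disguised))
different = signature(*build_dependence_graph(original)) != \
    signature(*build_dependence_graph(unrelated))
```

In this toy case `same` and `different` both evaluate to `True`: renaming and reordering independent statements leave the dependence structure unchanged, while a genuinely different computation changes it. A practical detector would need a real parser, control dependences, and a proper (sub)graph isomorphism test.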

Info:

Pages:

1172-1177

Online since:

August 2013

Copyright:

© 2013 Trans Tech Publications Ltd. All Rights Reserved
