A Graph Clustering Algorithm for the Homology Detection

Abstract:

To detect homologous files (files containing plagiarism) among a large number of source program samples, a new graph-based cluster detection algorithm is proposed. The algorithm has two phases. In the first phase, a keyword-based method calculates the pairwise similarity of the sample program files under test. In the second phase, a graph clustering algorithm is applied to the results of the first phase, so that homologous files form a cluster. Simulation results show that the algorithm improves the detection rate compared with traditional homologous file detection algorithms and can determine which files are homologous.
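
The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not specify the keyword set, the similarity measure, or the clustering rule, so the `KEYWORDS` set, the cosine similarity over keyword counts, the `threshold` value, and the use of connected components as clusters are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter
from itertools import combinations

# Hypothetical keyword set; the abstract does not list the keywords used.
KEYWORDS = {"if", "else", "for", "while", "return", "int", "float", "break"}


def keyword_profile(source):
    """Count how often each keyword occurs in one source file."""
    tokens = re.findall(r"[A-Za-z_]\w*", source)
    return Counter(t for t in tokens if t in KEYWORDS)


def cosine(a, b):
    """Cosine similarity of two keyword-count profiles (assumed measure)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_homologous(files, threshold=0.9):
    """Return groups of file names whose keyword profiles are similar.

    files: dict mapping file name to source text.
    threshold: assumed similarity cut-off for drawing a graph edge.
    """
    profiles = {name: keyword_profile(src) for name, src in files.items()}

    # Phase 1: pairwise similarity; similar pairs become graph edges.
    adjacency = {name: set() for name in files}
    for a, b in combinations(files, 2):
        if cosine(profiles[a], profiles[b]) >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Phase 2: each connected component with more than one file is a cluster.
    seen = set()
    clusters = []
    for start in files:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        if len(component) > 1:
            clusters.append(sorted(component))
    return clusters
```

Files that share no edge with any other file are left out of the result, so the returned clusters name only the suspected homologous groups. A stricter clustering rule (for example, requiring every pair in a cluster to exceed the threshold) would reduce chaining effects at the cost of missing partially copied files.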

Info:

Pages:

1981-1986

Online since:

March 2011

Copyright:

© 2011 Trans Tech Publications Ltd. All Rights Reserved
