A Practical Algorithm for Plagiarism Detection Based on Search Engine
Under China's current quantitative system of academic evaluation, many scholars and graduate students tend to plagiarize from the web. Detecting plagiarism efficiently requires a massive text collection that can be accessed easily, cheaply, and quickly. Some algorithms rely on rapidly growing online databases, such as the Chinese CNKI database. We introduce an algorithm that detects plagiarism quantitatively, based on natural-language segmentation and the precise (exact-phrase) retrieval function of a search engine. The source text is segmented into sentences at punctuation marks. Each sentence is submitted to the search engine as a single quoted keyword. The similarity between the source file and web content is computed as the ratio of sentences for which the search engine returns a match. Experiments show that the algorithm is practical and feasible.
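The three steps in the abstract (segment, search each sentence in quotes, take the ratio of matches) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the delimiter set, the `min_length` filter, and the names `split_sentences`, `similarity`, and `has_exact_match` are all assumptions, and the ratio is assumed to be matched sentences over total sentences. The search engine is passed in as a callable so that no particular search API is implied.

```python
import re
from typing import Callable, List

# Assumed delimiter set: common Western and Chinese sentence-ending marks.
# The abstract does not specify which punctuation marks are used.
_DELIMITERS = r"[.!?;。！？；]+"


def split_sentences(text: str, min_length: int = 5) -> List[str]:
    """Split text at punctuation and drop very short fragments.

    min_length is a hypothetical filter: very short fragments would
    match almost any page and would inflate the score.
    """
    parts = re.split(_DELIMITERS, text)
    return [p.strip() for p in parts if len(p.strip()) >= min_length]


def similarity(text: str, has_exact_match: Callable[[str], bool]) -> float:
    """Return the fraction of sentences that the search backend matches.

    has_exact_match receives the sentence wrapped in double quotes (the
    usual exact-phrase syntax) and returns True if any result comes back.
    """
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    matched = sum(1 for s in sentences if has_exact_match('"' + s + '"'))
    return matched / len(sentences)


if __name__ == "__main__":
    # Stand-in for a real search engine: a small set of "indexed" sentences.
    indexed = {
        "Plagiarism is a serious problem",
        "Search engines index the web",
    }

    def fake_search(query: str) -> bool:
        return query.strip('"') in indexed

    sample = (
        "Plagiarism is a serious problem. "
        "This sentence is original! "
        "Search engines index the web."
    )
    print(similarity(sample, fake_search))
```

In practice `has_exact_match` would wrap a real search service, with rate limiting and caching, since each sentence costs one query.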
Z. L. Qiu and D. L. Xu, "A Practical Algorithm for Plagiarism Detection Based on Search Engine", Applied Mechanics and Materials, Vols. 66-68, pp. 2287-2290, 2011