Multi Queries Methods of the Chinese-English Bilingual Plagiarism Detection

Hong Ye Chen; Phil Vines

doi:10.4028/www.scientific.net/AMM.462-463.1158

Abstract:

Cross-language plagiarism detection identifies and extracts plagiarized text in a multilingual environment. In recent years, a significant amount of work has addressed English and other European languages, but Asian languages have received somewhat less attention. We compare a number of strategies for Chinese-English bilingual plagiarism detection. We present methods for candidate document retrieval and compare four query formulation methods: (i) document keyword based, (ii) intrinsic plagiarism based, (iii) header based, and (iv) machine translation queries. Our evaluation indicates that keyword-based queries, the simplest and most efficient approach, give acceptable results for newspaper articles. We also compared keyword-based queries built from different percentages of a document's keywords; the results indicate that putting 50% of the keywords into the query is enough to retrieve a satisfactory set of candidate documents.
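The abstract does not say how keywords are selected or ranked, so the following is only a minimal sketch of the general idea of a keyword-based query that uses a fixed share of a document's keywords. It assumes the text is already tokenized, that keywords are ranked by raw term frequency, and that a small stopword list is available; the function name `keyword_query`, the `fraction` parameter, and the stopword list are all hypothetical and not taken from the paper.

```python
from collections import Counter
import math

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "of", "and", "in", "to", "is"}

def keyword_query(tokens, fraction=0.5):
    """Return the top `fraction` of distinct non-stopword terms, ranked by frequency.

    Assumption: term frequency stands in for whatever keyword weighting
    the paper actually uses.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    counts = Counter(t.lower() for t in tokens if t.lower() not in STOPWORDS)
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ranked = sorted(counts, key=lambda t: (-counts[t], t))
    k = math.ceil(len(ranked) * fraction)
    return ranked[:k]

tokens = "the detection of plagiarism in bilingual text and plagiarism detection queries".split()
query = keyword_query(tokens, fraction=0.5)
print(" ".join(query))
```

In a bilingual setting the selected terms would still have to be translated, or the document translated first, before being sent to a retrieval system for the other language; that step is outside this sketch.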

Info:

Periodical:

Applied Mechanics and Materials (Volumes 462-463)

Pages:

1158-1162

DOI:

https://doi.org/10.4028/www.scientific.net/AMM.462-463.1158

Online since:

November 2013

Authors:

Hong Ye Chen*, Phil Vines

Keywords:

Cross Language, Evaluation, Plagiarism Detection, Queries Method

* - Corresponding Author
