The Detection of Similar Instance Based on Fingerprint

Abstract:

Detecting similar instances is a research hot spot in Example-Based Machine Translation. The Vector Space Model is one of the mainstream detection methods. However, it has two disadvantages: detection is slow, and synonym substitution is not handled accurately. To address these problems, a fingerprint retrieval algorithm is introduced to improve detection speed, and a concept of replacement cost is put forward to measure the accuracy of substitution between synonyms. The results show that this method not only improves detection speed but also yields some improvement in the accuracy of the similarity calculation.
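
The abstract does not say which fingerprint algorithm is used, so the following is only an illustrative sketch that assumes a SimHash-style fingerprint. It shows why fingerprint comparison can be faster than comparing full term vectors: each sentence is reduced to a fixed-width integer, and similarity is then estimated from the Hamming distance between two integers. The function names (`token_hash`, `simhash`, `hamming_distance`), the 64-bit width, and the use of MD5 as the per-token hash are all assumptions made for this example, not details taken from the paper.

```python
import hashlib


def token_hash(token: str, bits: int = 64) -> int:
    """Hash a token to a fixed-width integer (MD5 is an arbitrary choice here)."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[: bits // 8], "big")


def simhash(tokens, bits: int = 64) -> int:
    """Build a SimHash-style fingerprint from a list of tokens.

    Each token votes +1 or -1 on every bit position; the fingerprint
    keeps the bits whose total vote is positive.
    """
    votes = [0] * bits
    for tok in tokens:
        h = token_hash(tok, bits)
        for i in range(bits):
            votes[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, vote in enumerate(votes):
        if vote > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


# Hypothetical usage: a smaller distance suggests more similar instances.
s1 = "the cat sat on the mat".split()
s2 = "the cat sat on a mat".split()
distance = hamming_distance(simhash(s1), simhash(s2))
```

In a scheme like this, candidate instances would be those whose distance to the query fingerprint falls under a chosen threshold. The replacement cost for synonyms mentioned in the abstract is not modeled here, because its definition is not given in this preview.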

Pages: 711-714

Online since: March 2015

Copyright: © 2015 Trans Tech Publications Ltd. All Rights Reserved
