An Improved Text Similarity Calculation Algorithm Based on VSM

Lian Li; Ai Hong Zhu; Tao Su

doi:10.4028/www.scientific.net/AMR.225-226.1105

Abstract:

Text similarity calculation is a key technology in text clustering, intelligent Web retrieval, and natural language processing. Because the traditional text similarity algorithm does not consider the effect of the feature words that two texts share, it can produce inaccurate results. To solve this problem, this paper presents an improved text similarity calculation algorithm. Since the number of shared feature words reflects, to some extent, how similar two texts are, the improved algorithm adds a coverage parameter, which effectively reduces interference from texts with lower similarity. Simulation and experimental results verify the improved algorithm's correctness and effectiveness.
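The abstract does not give the formulas, so the following is only a rough sketch of the general idea: cosine similarity over term-frequency vectors in a vector space model, scaled by a coverage factor based on shared feature words. The function names, the definition of coverage (shared feature words divided by all distinct feature words), and the choice to multiply the two values are illustrative assumptions, not the authors' published method.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Standard VSM cosine similarity between two term-frequency vectors."""
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def coverage(a: Counter, b: Counter) -> float:
    """ASSUMPTION: coverage = shared feature words / all distinct feature words.

    The paper's actual coverage parameter may be defined differently.
    """
    union = set(a) | set(b)
    if not union:
        return 0.0
    return len(set(a) & set(b)) / len(union)


def coverage_weighted_similarity(a: Counter, b: Counter) -> float:
    """ASSUMPTION: combine the two measures by simple multiplication."""
    return cosine_similarity(a, b) * coverage(a, b)


# Hypothetical pre-segmented documents (tokenization is out of scope here).
doc_a = Counter(["text", "similarity", "vector", "space", "model"])
doc_b = Counter(["text", "similarity", "cosine", "measure"])

print(cosine_similarity(doc_a, doc_b))
print(coverage_weighted_similarity(doc_a, doc_b))
```

Under these assumptions, two texts that share only a few feature words receive a lower score than plain cosine similarity would give them, which is the kind of effect the abstract attributes to the coverage parameter.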

Info:

Periodical:

Advanced Materials Research (Volumes 225-226)

Pages:

1105-1108

DOI:

https://doi.org/10.4028/www.scientific.net/AMR.225-226.1105

Online since:

April 2011

Authors:

Lian Li, Ai Hong Zhu, Tao Su

Keywords:

Cosine, Coverage Degree, Text Similarity, Vector Space Model
