A Text Hybrid Clustering Algorithm Based on HowNet Semantics

Abstract:

Many existing text clustering algorithms ignore the semantic relationships between words, which lowers the accuracy of their text similarity computation. This paper proposes a new hybrid text clustering algorithm (HCA) based on HowNet semantics. It computes the semantic similarity of words from their semantic concept descriptions in HowNet, then combines these word similarities with maximum weight matching on a bipartite graph to obtain a semantic-based text similarity. HCA is built on this new text similarity and combines an improved genetic algorithm with the k-medoids algorithm. Comparative experiments show that 1) HCA achieves better clustering quality than two existing traditional clustering algorithms, and 2) when text cosine similarity is replaced with the new semantic-based text similarity, the quality of all three clustering algorithms improves significantly.
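To make the matching step concrete, the following is a minimal sketch of a text similarity built on maximum weight bipartite matching. It is not the paper's implementation. The word-similarity table `TOY_SIM` is a hypothetical stand-in for HowNet-based scores. The exhaustive search is only practical for very short texts, and normalizing by the length of the longer text is an assumption, since the abstract does not state the formula.

```python
from itertools import permutations

# Hypothetical word-similarity table standing in for HowNet-based scores.
TOY_SIM = {
    frozenset(("car", "automobile")): 0.9,
    frozenset(("fast", "quick")): 0.8,
    frozenset(("car", "quick")): 0.1,
}


def word_sim(a, b):
    """Return a similarity in [0, 1]; identical words score 1.0."""
    if a == b:
        return 1.0
    return TOY_SIM.get(frozenset((a, b)), 0.0)


def max_weight_matching(short, long):
    """Exhaustively find the one-to-one pairing with the largest total weight.

    Every word in `short` is paired with a distinct word in `long`.
    Because all weights are non-negative, this yields a maximum weight matching.
    """
    best_score, best_pairs = 0.0, []
    for perm in permutations(range(len(long)), len(short)):
        score = sum(word_sim(short[i], long[j]) for i, j in enumerate(perm))
        if score > best_score:
            best_score = score
            best_pairs = [(short[i], long[j]) for i, j in enumerate(perm)]
    return best_score, best_pairs


def text_similarity(words_a, words_b):
    """Matching weight divided by the longer text's length (assumed normalization)."""
    if not words_a or not words_b:
        return 0.0
    short, long = sorted((words_a, words_b), key=len)
    score, _ = max_weight_matching(short, long)
    return score / len(long)


if __name__ == "__main__":
    print(text_similarity(["fast", "car"], ["quick", "automobile"]))
```

Unlike cosine similarity over raw term counts, this score is nonzero for texts that share no surface words, provided their words are semantically close. For realistic text lengths, a polynomial-time assignment solver such as the Hungarian algorithm would replace the exhaustive search.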

Info:

Periodical:

Key Engineering Materials (Volumes 474-476)

Edited by:

Garry Zhu

Pages:

2071-2078

DOI:

10.4028/www.scientific.net/KEM.474-476.2071

Citation:

Z. Y. Zhu et al., "A Text Hybrid Clustering Algorithm Based on HowNet Semantics", Key Engineering Materials, Vols. 474-476, pp. 2071-2078, 2011

Online since:

April 2011
