A Text Hybrid Clustering Algorithm Based on HowNet Semantics
Abstract:
Many existing text clustering algorithms ignore the semantic relations between words, which lowers the accuracy of their text similarity computation. This paper proposes a new hybrid text clustering algorithm (HCA) based on HowNet semantics. It computes the semantic similarity of words from their semantic concept descriptions in HowNet, then combines these word similarities with maximum weight matching on a bipartite graph to obtain a semantic-based text similarity. HCA is built on this text similarity by combining an improved genetic algorithm with the k-medoids algorithm. Comparative experiments show that: 1) HCA achieves better clustering quality than two existing traditional clustering algorithms, and 2) when text cosine similarity is replaced with the new semantic-based text similarity, the quality of all three clustering algorithms improves significantly.
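The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the word-similarity table `TOY_SIM` is a hypothetical stand-in for a HowNet-based measure, the brute-force search replaces a proper maximum weight matching algorithm and is only practical for very short texts, and dividing by the length of the longer text is an assumed normalization that the abstract does not specify.

```python
from itertools import permutations

# Hypothetical word-similarity table; a real system would derive these
# values from HowNet concept descriptions instead.
TOY_SIM = {
    frozenset(("car", "automobile")): 0.9,
    frozenset(("road", "street")): 0.8,
    frozenset(("car", "street")): 0.1,
}


def word_sim(w1, w2):
    """Return a similarity in [0, 1]; identical words score 1.0."""
    if w1 == w2:
        return 1.0
    return TOY_SIM.get(frozenset((w1, w2)), 0.0)


def max_weight_matching(short, long, sim):
    """Weight of the best one-to-one assignment of `short` words to `long` words.

    Brute force over all injective assignments (exponential cost), which is
    enough for illustration. Requires len(short) <= len(long).
    """
    best = 0.0
    for perm in permutations(range(len(long)), len(short)):
        total = sum(sim(short[i], long[j]) for i, j in enumerate(perm))
        best = max(best, total)
    return best


def text_similarity(a, b, sim=word_sim):
    """Semantic text similarity from a maximum weight bipartite matching.

    Assumed normalization: matching weight divided by the longer text's length,
    so unmatched words lower the score.
    """
    if not a or not b:
        return 0.0
    short, long = (a, b) if len(a) <= len(b) else (b, a)
    return max_weight_matching(short, long, sim) / len(long)


if __name__ == "__main__":
    print(text_similarity(["car", "road"], ["street", "automobile"]))
```

In this toy example a cosine similarity over raw terms would be zero because the two texts share no words, whereas the matching pairs "car" with "automobile" and "road" with "street" and so yields a high score.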
Info:
Periodical:
Pages:
2071-2078
Online since:
April 2011
Authors:
Copyright:
© 2011 Trans Tech Publications Ltd. All Rights Reserved