Paper Title:
A Text Hybrid Clustering Algorithm Based on HowNet Semantics
  Abstract

Many existing text clustering algorithms overlook the semantic relationships between words and therefore compute text similarity with low accuracy. This paper proposes a new text hybrid clustering algorithm (HCA) based on HowNet semantics. HCA calculates the semantic similarity of words from their semantic concept descriptions in HowNet, then combines this with maximum weight matching on a bipartite graph to obtain a semantic-based text similarity. HCA is built on this new text similarity by combining an improved genetic algorithm with the k-medoids algorithm. Comparative experiments show that 1) HCA achieves better clustering quality than two existing traditional clustering algorithms, and 2) when text cosine similarity is replaced with the new semantic-based text similarity, the quality of all three clustering algorithms improves significantly.
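
The text-similarity step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The `TOY_SIM` table is a hypothetical stand-in for HowNet-derived word similarities, the matching is found by exhaustive search rather than an efficient assignment algorithm, and normalizing by the longer text's length is an assumption, since the abstract does not give the exact formula.

```python
from itertools import permutations

# Hypothetical word-pair similarities standing in for HowNet-derived scores.
TOY_SIM = {
    frozenset(["car", "automobile"]): 0.9,
    frozenset(["fast", "quick"]): 0.8,
    frozenset(["car", "quick"]): 0.1,
}


def word_sim(w1, w2):
    """Return a similarity in [0, 1]; identical words score 1.0."""
    if w1 == w2:
        return 1.0
    return TOY_SIM.get(frozenset([w1, w2]), 0.0)


def max_weight_matching(words_a, words_b, sim=word_sim):
    """Total weight of a maximum weight bipartite matching.

    Exhaustive search over assignments: only practical for short word
    lists. A polynomial-time assignment algorithm would replace this
    in real use.
    """
    short, long_ = sorted([words_a, words_b], key=len)
    best = 0.0
    # Try every way of assigning each word of the shorter text to a
    # distinct word of the longer text, keeping the heaviest assignment.
    for perm in permutations(long_, len(short)):
        total = sum(sim(s, l) for s, l in zip(short, perm))
        best = max(best, total)
    return best


def text_similarity(words_a, words_b, sim=word_sim):
    """Matching weight normalized by the longer text's length (an assumption)."""
    if not words_a or not words_b:
        return 0.0
    weight = max_weight_matching(words_a, words_b, sim)
    return weight / max(len(words_a), len(words_b))
```

For example, `text_similarity(["car", "fast"], ["automobile", "quick"])` pairs "car" with "automobile" and "fast" with "quick", whereas a cosine similarity over raw terms would treat the two texts as sharing no words at all.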

  Info
Periodical: Key Engineering Materials (Volumes 474-476)
Edited by: Garry Zhu
Pages: 2071-2078
DOI: 10.4028/www.scientific.net/KEM.474-476.2071
Citation: Z. Y. Zhu, S. J. Dong, C. L. Yu, J. He, "A Text Hybrid Clustering Algorithm Based on HowNet Semantics", Key Engineering Materials, Vols. 474-476, pp. 2071-2078, 2011
Online since: April 2011
