Paper Title:
XML Document Clustering Based on Spectral Analysis Method
  Abstract

The K-Means algorithm usually converges to a locally optimal solution, whereas spectral clustering can obtain satisfactory results by embedding the data points into a new space in which clusters are tighter. Because traditional spectral clustering uses a Gaussian kernel function to compute the similarity between two points, selecting the scale parameter σ usually requires domain knowledge. This paper applies spectral clustering to XML documents. To account for both the elements and the structure of XML documents, it represents each document by its path features; to avoid selecting the scale parameter σ, it computes the similarity between two XML documents with the Jaccard coefficient. Experiments show that computing similarity with the Jaccard coefficient is effective and that the resulting clustering is correct.
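
As a rough illustration of the similarity step described in the abstract, the sketch below represents each XML document as a set of tag paths and compares documents with the Jaccard coefficient |A ∩ B| / |A ∪ B|. The abstract does not define "path feature" precisely, so taking every root-to-element tag path as a feature is an assumption, and the function names (`path_features`, `jaccard`, `similarity_matrix`) and the sample documents are hypothetical. The resulting matrix could then serve as the affinity matrix for a spectral clustering step, which is not shown here.

```python
import xml.etree.ElementTree as ET


def path_features(xml_text):
    """Return the set of root-to-element tag paths of an XML document.

    Assumption: a "path feature" is the slash-joined sequence of tags
    from the root down to an element; the paper may define it differently.
    """
    root = ET.fromstring(xml_text)
    paths = set()
    stack = [(root, root.tag)]
    while stack:
        node, path = stack.pop()
        paths.add(path)
        for child in node:
            stack.append((child, path + "/" + child.tag))
    return paths


def jaccard(a, b):
    """Jaccard coefficient of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0  # convention for two empty sets
    return len(a & b) / len(a | b)


def similarity_matrix(docs):
    """Pairwise Jaccard similarities between the documents' path sets."""
    feats = [path_features(d) for d in docs]
    return [[jaccard(fi, fj) for fj in feats] for fi in feats]


# Hypothetical sample documents, for illustration only.
docs = [
    "<book><title/><author><name/></author></book>",
    "<book><title/><author><name/></author><year/></book>",
    "<movie><title/><director/></movie>",
]
S = similarity_matrix(docs)
```

With these sample documents, the two `book` documents share four of five distinct paths, so their similarity is 0.8, while the `movie` document shares no path with either and scores 0. Unlike the Gaussian kernel, this measure has no scale parameter to tune.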

  Info
Periodical
Advanced Materials Research (Volumes 219-220)
Edited by
Helen Zhang, Gang Shen and David Jin
Pages
304-307
DOI
10.4028/www.scientific.net/AMR.219-220.304
Citation
X. Y. Li, "XML Document Clustering Based on Spectral Analysis Method", Advanced Materials Research, Vols. 219-220, pp. 304-307, 2011
Online since
March 2011

Authors: Ming Wei Leng, Xiao Yun Chen, Jian Jun Cheng, Long Jie Li
Chapter 8: System Modeling and Simulation
Abstract: In many data mining domains, labeled data is very expensive to generate, how to make the best use of labeled data to guide the process of...
4675
Authors: Yin Sheng Zhang, Hui Lin Shan, Jia Qiang Li, Jie Zhou
Chapter 8: Nanomaterials and Nanomanufacturing
Abstract: The traditional K-means clustering algorithm prematurely plunges into a local optimum because of sensitive selection of the initial cluster...
1977
Authors: Bing Chen, Xue Qin Hu, Bei Zhan Wang, Yin Huan Zheng
Chapter 5: Biotechnology, Chemical and Materials Engineering
Abstract: This paper proposed a new hybrid spectral clustering algorithm in which Mean Impact Value (MIV) was used in the cost dimension reduction. The...
695
Authors: Chun Xia Jin, Hai Yan Zhou, Qiu Chan Bai
Chapter 6: Algorithm Design
Abstract: To solve the problem of sparse keywords and similarity drift in short text segments, this paper proposes short text clustering algorithm with...
1716
Authors: Yang Pan, An Hua Chen, Ling Li Jiang
Chapter 3: Mechanical Transmission, Vibration and Noise
Abstract: According to the selection difficulties of initial clustering center of k-means clustering algorithm, this paper proposes a method that is to...
250