Research on Clustering Analysis Based on SOM

Abstract:

In this paper, we present an improved text clustering algorithm. It preserves the self-organizing property of the SOM network while compensating for the poor clustering that results from inadequate selection in the K-means algorithm. First, the data are preprocessed into a vector space model for subsequent processing. We then analyze the characteristics of the original clustering algorithm and of the SOM algorithm, and design an improved SOM clustering algorithm that overcomes the low stability and poor quality of the original algorithm. Experimental results indicate that the improved algorithm achieves higher accuracy and better stability than the original algorithm.
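
The abstract does not give the details of the improved algorithm, so the following is only a minimal sketch of one common way to combine the two methods it names, not the authors' procedure. It assumes a whitespace-tokenized TF-IDF vector space model, a small one-dimensional SOM, and K-means started from the trained SOM weights rather than from random centers. All function names and parameters (`tfidf_vectors`, `train_som`, `som_seeded_kmeans`, `epochs`, `lr0`, `seed`) are hypothetical and chosen for illustration.

```python
import math
import random
from collections import Counter

def tfidf_vectors(docs):
    """Build a simple vector space model: one L2-normalised TF-IDF vector per document."""
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({t for toks in tokenised for t in toks})
    n = len(docs)
    df = Counter(t for toks in tokenised for t in set(toks))
    vectors = []
    for toks in tokenised:
        tf = Counter(toks)
        # Smoothed IDF (an assumption; the abstract does not name a weighting scheme).
        vec = [tf[t] * (math.log((1 + n) / (1 + df[t])) + 1.0) for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, centers):
    """Index of the center closest to vec (the best-matching unit for a SOM)."""
    return min(range(len(centers)), key=lambda i: sq_dist(vec, centers[i]))

def train_som(data, n_units, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D SOM (units arranged on a line) and return its weight vectors."""
    rng = random.Random(seed)
    weights = [list(rng.choice(data)) for _ in range(n_units)]
    sigma0 = max(n_units / 2.0, 1.0)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood radius shrinks
        for vec in rng.sample(data, len(data)):
            bmu = nearest(vec, weights)
            for j, w in enumerate(weights):
                # Gaussian neighbourhood around the best-matching unit.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for d in range(len(w)):
                    w[d] += lr * h * (vec[d] - w[d])
    return weights

def kmeans(data, centers, iters=20):
    """Lloyd's K-means starting from the given centers; returns (labels, centers)."""
    centers = [list(c) for c in centers]
    for _ in range(iters):
        labels = [nearest(v, centers) for v in data]
        for k in range(len(centers)):
            members = [v for v, l in zip(data, labels) if l == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    labels = [nearest(v, centers) for v in data]  # final assignment
    return labels, centers

def som_seeded_kmeans(docs, k, seed=0):
    """Vectorize docs, train a k-unit SOM, then refine its weights with K-means."""
    data = tfidf_vectors(docs)
    som_weights = train_som(data, n_units=k, seed=seed)
    return kmeans(data, som_weights)
```

Calling `som_seeded_kmeans(docs, k=2)` on a list of strings returns one cluster label per document together with the final centers. Seeding K-means from SOM weights is one plausible reading of the hybrid the abstract describes; the paper itself may differ in map topology, preprocessing, and how the two stages are coupled.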

Info:

Pages:

968-971

Online since:

December 2013

Copyright:

© 2014 Trans Tech Publications Ltd. All Rights Reserved
