Cluster Center Initialization Parallel Algorithm for K-Means Algorithm

Abstract:

K-Means is one of the best-known unsupervised clustering algorithms. It has several disadvantages, such as sensitivity to the initial cluster centers and repeated computation over all data points, which becomes costly as data volume grows. To overcome these limitations, this paper proposes a density-based approach that finds k cluster centers by adjusting a threshold. We then design and implement the K-Means algorithm on a modern Graphics Processing Unit (GPU). The ratio of between-class distance to within-class distance and the speedup are used as evaluation criteria. The experiments indicate that the proposed algorithm significantly improves the stability and efficiency of the K-Means algorithm.
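The abstract does not spell out the initialization procedure or the exact evaluation formula, so the following is only a minimal CPU sketch of one way such a scheme could look, not the authors' method. It assumes that density means the number of neighbors within a fixed radius, that the adjusted threshold is a minimum separation between chosen centers, and that the evaluation ratio is the mean centroid-to-centroid distance divided by the mean point-to-centroid distance. The names density_init, separation_ratio, radius, shrink, and max_rounds are hypothetical.

```python
import math


def density_init(points, k, radius, shrink=0.9, max_rounds=50):
    """Pick up to k initial centers from dense, mutually distant points.

    Assumptions (not taken from the paper): density is the neighbor count
    within `radius`, and the threshold is a minimum center separation that
    is relaxed until k centers are found.
    """
    # Density of each point: number of points within `radius` (itself included).
    density = [sum(1 for q in points if math.dist(p, q) <= radius) for p in points]
    # Visit points from densest to sparsest (ties keep input order).
    order = sorted(range(len(points)), key=lambda i: -density[i])
    # Start with the largest pairwise distance as the separation threshold.
    threshold = max(math.dist(p, q) for p in points for q in points)
    centers = []
    for _ in range(max_rounds):
        centers = []
        for i in order:
            # Accept a point only if it is far enough from every chosen center.
            if all(math.dist(points[i], c) >= threshold for c in centers):
                centers.append(points[i])
            if len(centers) == k:
                return centers
        threshold *= shrink  # too few centers: relax the threshold and retry
    return centers  # may hold fewer than k centers if max_rounds is exhausted


def separation_ratio(points, labels):
    """Mean between-centroid distance divided by mean point-to-centroid distance.

    Needs at least two clusters and a nonzero within-cluster spread.
    """
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    # Centroid of each cluster, coordinate by coordinate.
    centroids = {
        lab: tuple(sum(xs) / len(g) for xs in zip(*g)) for lab, g in groups.items()
    }
    within = sum(math.dist(p, centroids[lab]) for p, lab in zip(points, labels))
    within /= len(points)
    cs = list(centroids.values())
    pairs = [(a, b) for i, a in enumerate(cs) for b in cs[i + 1:]]
    between = sum(math.dist(a, b) for a, b in pairs) / len(pairs)
    return between / within


# Toy data: three small, well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 0), (20, 1), (21, 0)]
centers = density_init(points, k=3, radius=1.5)
ratio = separation_ratio(points, [0, 0, 0, 1, 1, 1, 2, 2, 2])
```

Under these assumptions, a higher ratio indicates tighter, better-separated clusters. The sketch is serial and quadratic in the number of points; the pairwise distance computations are the part a GPU implementation would parallelize.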

Info:

Periodical:

Advanced Materials Research (Volumes 989-994)

Pages:

2169-2172

Online since:

July 2014

Copyright:

© 2014 Trans Tech Publications Ltd. All Rights Reserved
