Stochastic Gradient Descent Based K-Means Algorithm on Large Scale Data Clustering

Abstract:

With the rapid advance of data collection and storage techniques, it has become easy to acquire data sets containing tens of millions or even billions of instances. How to extract useful or interesting information from such data has become a pressing issue. The traditional k-means clustering algorithm is widely used in the data mining community. First, k clustering centres are randomly initialized. Then, every instance is assigned to one of the k classes according to its distances to the clustering centres. Finally, each clustering centre is updated to the mean of its constituent instances. This process is iterated until convergence. At every iteration, the distance matrix from all instances to the k clustering centres must be computed, which is very time-consuming on large scale data sets. To address this issue, in this paper we propose a fast optimization algorithm based on stochastic gradient descent (SGD). At each iteration, the algorithm randomly selects an instance, finds its nearest clustering centre, and updates that centre immediately. Experimental results show that the proposed method achieves competitive clustering results at a lower time cost.
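
The per-instance update described above is essentially an online (SGD-style) k-means step. The following Python sketch illustrates the idea; the 1/count learning-rate schedule and all names (sgd_kmeans, n_iters) are illustrative assumptions on our part, not the authors' exact formulation.

    import numpy as np

    def sgd_kmeans(X, k, n_iters=100000, seed=0):
        # Online (SGD-style) k-means: one randomly drawn instance per update.
        rng = np.random.default_rng(seed)
        n = X.shape[0]
        # Randomly initialize the k clustering centres from the data itself.
        centres = X[rng.choice(n, size=k, replace=False)].astype(float)
        counts = np.zeros(k)  # per-centre update counts, used as a step-size schedule
        for _ in range(n_iters):
            x = X[rng.integers(n)]                            # randomly choose an instance
            j = np.argmin(((centres - x) ** 2).sum(axis=1))   # find its nearest centre
            counts[j] += 1
            eta = 1.0 / counts[j]                             # decaying learning rate (assumed schedule)
            centres[j] += eta * (x - centres[j])              # update that centre immediately
        return centres

Unlike batch k-means, which recomputes the full n-by-k distance matrix over all instances at every iteration, each step here touches a single instance and a single centre, which is where the time saving comes from.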

Info:

Pages: 1342-1345

Online since: November 2014

Copyright: © 2014 Trans Tech Publications Ltd. All Rights Reserved
