Outlier Detection Clustering Algorithm Based on Density

Abstract:

K-means is a classic clustering algorithm that is widely applied across data mining fields. The traditional K-means algorithm selects its initial centroids randomly, so the clustering result is affected by noise points and is unstable. To address this problem, this paper proposes a K-means algorithm based on density-based outlier detection. The algorithm first detects outliers with a density model and avoids selecting them as initial cluster centers. After clustering the non-outlier points, it assigns each outlier to the corresponding cluster according to its distance to each centroid. The algorithm effectively reduces the influence of outliers on K-means and improves the accuracy of the clustering result. Experimental results demonstrate that the algorithm effectively improves the accuracy and stability of the clustering.
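The three steps named in the abstract (density-based outlier detection, K-means on the non-outliers, then assignment of the outliers by centroid distance) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the density model, so the sketch assumes the mean distance to the k nearest neighbors as an inverse-density score with a quantile threshold; it also assumes initial centers are drawn at random from the non-outliers and that each outlier goes to its nearest centroid. All function and parameter names (`knn_density_outliers`, `outlier_aware_kmeans`, `k`, `quantile`, `seed`) are hypothetical.

```python
import numpy as np


def knn_density_outliers(X, k=5, quantile=0.9):
    """Flag low-density points (assumed density model, not the paper's).

    A point's score is its mean distance to its k nearest neighbors;
    points scoring above the given quantile are treated as outliers.
    """
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Column 0 of each sorted row is the zero self-distance, so skip it.
    knn = np.sort(dists, axis=1)[:, 1:k + 1]
    score = knn.mean(axis=1)
    return score > np.quantile(score, quantile)


def assign(X, centers):
    """Return the index of the nearest center for every row of X."""
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.argmin(dists, axis=1)


def kmeans(X, centers, n_iter=100):
    """Plain Lloyd iterations starting from the given centers."""
    for _ in range(n_iter):
        labels = assign(X, centers)
        new_centers = np.array([
            # Keep the old center if a cluster happens to be empty.
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers


def outlier_aware_kmeans(X, n_clusters, k=5, quantile=0.9, seed=0):
    """Cluster the non-outliers, then attach outliers to the nearest centroid."""
    rng = np.random.default_rng(seed)
    is_outlier = knn_density_outliers(X, k=k, quantile=quantile)
    inliers = X[~is_outlier]
    # Initial centers are drawn only from non-outliers (assumed: uniformly).
    init = inliers[rng.choice(len(inliers), size=n_clusters, replace=False)]
    centers = kmeans(inliers, init)
    # Centroids were fit on non-outliers only; every point, outliers
    # included, is then labeled by its nearest centroid.
    labels = assign(X, centers)
    return centers, labels, is_outlier
```

Because the centroids are computed from the non-outliers alone, a distant noise point can neither be picked as an initial center nor pull a centroid toward itself; it only receives a label at the end.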

Info:

Pages:

1808-1812

Online since:

January 2015

Copyright:

© 2015 Trans Tech Publications Ltd. All Rights Reserved
