Knowledge-Based Genetic Algorithm for Multidimensional Data Clustering

Abstract:

This paper proposes a new genetic-algorithm approach, the knowledge-based genetic algorithm (KBGA-Clustering), for multidimensional data clustering. The method incorporates knowledge of what constitutes an appropriate cluster centre for a fixed number of clusters, k. A chromosome that contains inappropriate genes is penalised with the maximum value so that it is excluded from the next generation. Experimental results are provided for both KBGA-Clustering and conventional Genetic Algorithm-Clustering (GA-Clustering) to compare their performance. In these experiments, KBGA-Clustering performs better and reaches a solution closer to the optimum than conventional GA-Clustering.
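The penalty idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not define what makes a cluster centre "appropriate", so the two rules below (every centre must lie inside the range of the data, and no centre may end up with an empty cluster) are assumptions chosen for illustration only, as are the names `penalised_fitness`, `nearest_centre`, and `PENALTY`. The fitness is assumed to be a minimised sum of Euclidean distances from each point to its nearest centre, so the "maximum value" is modelled as infinity.

```python
import math

# Assumption: fitness is minimised, so the "maximum value" penalty is infinity.
PENALTY = math.inf


def nearest_centre(point, centres):
    """Return the index of the centre closest to `point` (Euclidean distance)."""
    return min(range(len(centres)), key=lambda i: math.dist(point, centres[i]))


def penalised_fitness(chromosome, data, k):
    """Sum of point-to-nearest-centre distances, or PENALTY for a bad chromosome.

    chromosome: flat list of k * d floats encoding k centres of d dimensions.
    data: list of d-dimensional points.
    """
    d = len(data[0])
    centres = [chromosome[i * d:(i + 1) * d] for i in range(k)]

    # Assumed knowledge rule 1: each coordinate must lie within the data range.
    lows = [min(p[j] for p in data) for j in range(d)]
    highs = [max(p[j] for p in data) for j in range(d)]
    for centre in centres:
        if any(not (lows[j] <= centre[j] <= highs[j]) for j in range(d)):
            return PENALTY

    # Assumed knowledge rule 2: every centre must attract at least one point.
    labels = [nearest_centre(p, centres) for p in data]
    if len(set(labels)) < k:
        return PENALTY

    # Otherwise score the chromosome by total within-cluster distance.
    return sum(math.dist(p, centres[label]) for p, label in zip(data, labels))
```

With a minimising selection scheme, a chromosome scored as `PENALTY` can never outrank a feasible one, which is one way to keep it out of the next generation. The paper's actual appropriateness criterion and fitness measure may differ from these assumptions.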

Info:

Pages:

277-280

Online since:

August 2014

Copyright:

© 2014 Trans Tech Publications Ltd. All Rights Reserved

