A Dynamic Genetic Algorithm for Clustering Problems

Abstract:

Because many clustering algorithms based on genetic algorithms (GAs) suffer from degeneracy and are prone to falling into local optima, a novel dynamic genetic algorithm for clustering problems (DGA) is proposed. The algorithm adopts variable-length coding to represent individuals and performs parallel crossover within subpopulations of individuals of the same length, which allows DGA to explore the search space more effectively and to obtain automatically the proper number of clusters and the proper partition of a given data set. The algorithm also uses a dynamic crossover probability and an adaptive mutation probability, which keep it from getting stuck at a locally optimal solution. Clustering experiments on three artificial data sets and two real-life data sets show that DGA achieves better performance and higher accuracy on clustering problems.
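The abstract names the main ingredients of DGA but gives no formulas, so the following is only a rough sketch of how such ingredients might fit together, not the authors' implementation. It assumes an individual is a variable-length list of cluster centres, that crossover is single-point and restricted to same-length subpopulations, that the crossover probability decays linearly over generations, and that the mutation probability is lowered for above-average individuals. All function names, parameter values, and both probability schedules are hypothetical.

```python
import random
from collections import defaultdict


def random_individual(data, k_min, k_max, rng):
    """Variable-length coding: a list of k cluster centres, k chosen at random."""
    k = rng.randint(k_min, k_max)
    return [list(p) for p in rng.sample(data, k)]


def group_by_length(population):
    """Split the population into subpopulations of equal chromosome length."""
    groups = defaultdict(list)
    for ind in population:
        groups[len(ind)].append(ind)
    return groups


def crossover_same_length(a, b, rng):
    """Single-point crossover over centres; children keep the parents' length."""
    if len(a) < 2:
        return [c[:] for c in a], [c[:] for c in b]
    cut = rng.randint(1, len(a) - 1)
    child1 = [c[:] for c in a[:cut] + b[cut:]]
    child2 = [c[:] for c in b[:cut] + a[cut:]]
    return child1, child2


def crossover_probability(gen, max_gen, p_start=0.9, p_end=0.5):
    # ASSUMPTION: linear decay; the abstract only says the probability is dynamic.
    return p_start - (p_start - p_end) * gen / max_gen


def mutation_probability(fit, f_avg, f_max, p_low=0.01, p_high=0.1):
    # ASSUMPTION: fitter-than-average individuals are mutated less often.
    if f_max == f_avg or fit < f_avg:
        return p_high
    return p_high - (p_high - p_low) * (fit - f_avg) / (f_max - f_avg)


def crossover_step(population, gen, max_gen, rng):
    """Pair individuals inside each same-length subpopulation and cross them."""
    pc = crossover_probability(gen, max_gen)
    offspring = []
    for group in group_by_length(population).values():
        rng.shuffle(group)
        for i in range(0, len(group) - 1, 2):
            a, b = group[i], group[i + 1]
            if rng.random() < pc:
                a, b = crossover_same_length(a, b, rng)
            offspring.extend([a, b])
        if len(group) % 2:  # an unpaired individual passes through unchanged
            offspring.append(group[-1])
    return offspring
```

Because crossover only happens between individuals of equal length, one call to `crossover_step` preserves both the population size and the distribution of cluster counts; in this sketch, changing the number of clusters would have to come from mutation or selection, which are not shown.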

Info:

Pages:

1884-1893

Online since:

September 2013

Copyright:

© 2013 Trans Tech Publications Ltd. All Rights Reserved
