A Modified Version of the K-Means Algorithm Based on the Shape Similarity Distance

Abstract:

The K-means algorithm is a popular method in cluster analysis and is most often based on the Euclidean distance. This paper presents a modified version of the K-means algorithm based on the shape similarity distance (SSD-K-means). The shape similarity distance is a non-metric distance measure for similarity estimation based on the characteristic of differences. To demonstrate the effectiveness of the proposed method, the new algorithm was tested on three shape datasets. Experimental results show that SSD-K-means performs better than the classical K-means algorithm based on the traditional Euclidean and Manhattan distances.
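The idea of swapping the distance measure inside K-means can be illustrated with a minimal sketch in which the distance is a pluggable function. This is not the authors' implementation: the names `kmeans`, `euclidean`, and `manhattan` are hypothetical, and the shape similarity distance itself is not defined in this abstract, so it is deliberately left out. The sketch also assumes that centers are updated as the mean of their members, which is the classical rule and may differ from what the paper does for a non-metric distance.

```python
import numpy as np


def euclidean(points, center):
    # Distance from every row of `points` to a single center.
    return np.sqrt(((points - center) ** 2).sum(axis=1))


def manhattan(points, center):
    # Sum of absolute coordinate differences (city-block distance).
    return np.abs(points - center).sum(axis=1)


def assign(data, centers, distance):
    # dists[i, j] is the distance from sample i to center j.
    dists = np.stack([distance(data, c) for c in centers], axis=1)
    return dists.argmin(axis=1)


def kmeans(data, k, distance, n_iter=100, seed=0):
    """Classical K-means loop with a pluggable distance function.

    Assumption: centers are updated as the member mean, whatever
    distance is used for the assignment step.
    """
    rng = np.random.default_rng(seed)
    # Start from k distinct samples chosen at random.
    centers = data[rng.choice(len(data), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        labels = assign(data, centers, distance)
        new_centers = centers.copy()
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment against the final centers.
    return assign(data, centers, distance), centers


# Usage: the same loop run with two different distance measures.
toy = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
for dist in (euclidean, manhattan):
    labels, centers = kmeans(toy, k=2, distance=dist)
    print(dist.__name__, labels)
```

A variant such as SSD-K-means would, under the same assumptions, amount to passing a third function with the same signature in place of `euclidean` or `manhattan`.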

Info:

Pages: 1064-1068

Online since: October 2013

Copyright: © 2014 Trans Tech Publications Ltd. All Rights Reserved
