A Note on Cluster Validity Indices SV and OS

Abstract:

Krista Rizman Zalik and Borut Zalik proposed the indices SV and OS, which are associated with separation and with compactness or overlap. Compactness and overlap were calculated from a few data points of a cluster, which enables the indices to identify the number of clusters underlying data sets of different sizes and densities. However, the measure of overlap depends on two undefined variables, and the measure of compactness is built on a randomly selected ten percent of the data points of a cluster, which makes the indices difficult to compute and impractical. This paper proposes measuring compactness using the ten percent of a cluster's data points that are farthest from the cluster center, and revises the measure of overlap so that the two undefined variables are eliminated. Experiments show that the modified index SV can identify the optimal number of clusters underlying data sets of different sizes and is preferable to the index OS.
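The deterministic selection described in the abstract can be illustrated with a minimal sketch. The abstract does not state the formula, so the choices here are assumptions and not the authors' definition: the center is taken as the coordinate-wise mean, the distance is Euclidean, the selected distances are averaged, and the count is rounded down with a minimum of one point. The function name `farthest_fraction_spread` and its `percent` parameter are hypothetical.

```python
import math


def farthest_fraction_spread(points, percent=10):
    """Mean distance to the cluster center over the farthest `percent` of points.

    Hypothetical helper: the averaging, the Euclidean distance, and the
    rounding rule are assumptions, not taken from the paper.
    """
    if not points:
        raise ValueError("cluster must contain at least one point")
    n = len(points)
    dim = len(points[0])
    # Center: coordinate-wise mean of the cluster's points (assumed).
    center = [sum(p[d] for p in points) / n for d in range(dim)]
    # Euclidean distance of every point to the center, largest first.
    dists = sorted(
        (math.sqrt(sum((a - b) ** 2 for a, b in zip(p, center))) for p in points),
        reverse=True,
    )
    # Number of points kept: percent of n, rounded down, but at least one.
    k = max(1, (n * percent) // 100)
    return sum(dists[:k]) / k


# Ten points, so ten percent selects the single farthest point, (1.0, 4.0).
cluster = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (1.0, 1.0),
           (1.0, 0.0), (0.0, 1.0), (2.0, 1.0), (1.0, 2.0), (1.0, 4.0)]
spread = farthest_fraction_spread(cluster)
```

Because the farthest points are chosen by sorting instead of by random sampling, repeated calls on the same cluster return the same value, which is the practical advantage the abstract attributes to the modification.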

Pages: 2199-2202

Online since: July 2013

Copyright: © 2013 Trans Tech Publications Ltd. All Rights Reserved

References:

[1] R.C. Dubes, A.K. Jain, Clustering techniques: the user's dilemma, Pattern Recognition 8 (1976) p.247–260.

DOI: 10.1016/0031-3203(76)90045-5

[2] Krista Rizman Zalik, Borut Zalik, Validity index for clusters of different sizes and densities, Pattern Recognition Letters 32 (2011) p.221–234.

DOI: 10.1016/j.patrec.2010.08.007

[3] J.C. Dunn, A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters, J. Cybernet. 3 (1973) p.32–57.

DOI: 10.1080/01969727308546046

[4] J.C. Dunn, Well separated clusters and optimal fuzzy partitions, J. Cybernet. 4 (1974) p.95–104.

[5] J.B. MacQueen, Some methods for classification and analysis of multivariate observations, Berkeley Symposium on Mathematical Statistics and Probability (1967) p.281–297.

[6] Stephen J. Redmond, Conor Heneghan, A method for initialising the K-means clustering algorithm using kd-trees, Pattern Recognition Letters 28 (8) (2007) p.965–973.

DOI: 10.1016/j.patrec.2007.01.001

[7] http://www.ics.uci.edu/~mlearn/MLRepository.html (1993).

[8] Chih-Chung Chang, Chih-Jen Lin, LIBSVM, http://www.csie.ntu.edu.tw/~cjlin/libsvm (2001).