Semi-Supervised Clustering Algorithm Based on Small Size of Labeled Data

Abstract:

In many data mining domains, labeled data is very expensive to generate. How to make the best use of a small amount of labeled data to guide the clustering of unlabeled data is therefore the core problem of semi-supervised clustering. Most semi-supervised clustering algorithms require a fairly large amount of labeled data and the setting of several parameters, and different parameter values may lead to different results. In view of this, a new algorithm, called the semi-supervised clustering algorithm based on a small size of labeled data, is presented. It uses a small labeled dataset to expand the labeled set by labeling the k-nearest neighbors of the labeled points, and it requires only one parameter. We demonstrate the algorithm on three UCI datasets and compare it with SSDBSCAN [4] and KNN; the experimental results confirm that the accuracy of our clustering algorithm is close to that of the KNN classification algorithm.
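The full text of the paper is not included in this preview, so the exact procedure is not reproduced here. As a rough illustration of the expansion step described in the abstract, the Python sketch below propagates each seed label to its k nearest unlabeled neighbors using Euclidean distance; the function name expand_labels, the distance choice, and the synthetic data are assumptions made for illustration, not the authors' implementation.

import numpy as np

def expand_labels(X, y, k=3):
    """Propagate each known label to its k nearest unlabeled points.

    X : (n, d) array of feature vectors
    y : (n,) array of class labels, with -1 marking unlabeled points
    k : the single neighborhood-size parameter (illustrative assumption)
    """
    y = y.copy()
    for i in np.where(y != -1)[0]:                 # iterate over the seed (labeled) points
        unlabeled = np.where(y == -1)[0]
        if unlabeled.size == 0:
            break
        # Euclidean distances from seed point i to all still-unlabeled points
        d = np.linalg.norm(X[unlabeled] - X[i], axis=1)
        for j in unlabeled[np.argsort(d)[:k]]:
            y[j] = y[i]                            # label the k nearest neighbors
    return y

# Tiny usage example: two synthetic Gaussian clusters, one seed label each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(6, 1, (50, 2))])
y = np.full(100, -1)
y[0], y[50] = 0, 1
y_expanded = expand_labels(X, y, k=5)
print("labeled points after one expansion pass:", int(np.sum(y_expanded != -1)))

In the paper's setting, the expanded labeled set would then guide the subsequent clustering of the remaining unlabeled points; the sketch above stops at the expansion step.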

Info:

Pages: 4675-4679

Online since: October 2011

Copyright:

© 2012 Trans Tech Publications Ltd. All Rights Reserved

References:

[1] Wagstaff K, Cardie C, Rogers S, Schroedl S. Constrained k-means clustering with background knowledge[C]. Proceedings of the 18th international conference on machine learning (ICML 2001), pp. 577-584.

[2] Leng M, Chen X, Li L. K-means Clustering Algorithm Based on Semi-supervised Learning[J]. Journal of Computational Information Systems, 4(5): 2007-2013, (2008).

[3] Dang Y, Xuan Z, Rong L, Liu M. A novel initialization method for semi-supervised clustering[C]. Proceedings of the 4th international conference on Knowledge science, engineering and management, LNCS 6291: 317-328.

DOI: 10.1007/978-3-642-15280-1_30

[4] Lelis L, Sander J. Semi-Supervised Density-Based Clustering[C]. Proceedings of the 9th IEEE international conference on Data Mining (ICDM 2009), pp. 842-847.

DOI: 10.1109/icdm.2009.143

[5] Ruiz C, Spiliopoulou M, Menasalvas E. Density-based semi-supervised clustering[J]. Data Mining and Knowledge Discovery, 21(3): 345-370, (2010).

DOI: 10.1007/s10618-009-0157-y

[6] Zhao W, He Q, Ma H, Shi Z. Effective semi-supervised document clustering via active learning with instance-level constraints[J]. Knowledge and Information Systems, in press, (2011).

DOI: 10.1007/s10115-011-0389-1

[7] Huang R, Lam W. An active learning framework for semi-supervised document clustering with language modeling[J]. Data and Knowledge Engineering, 68(1): 49-67, (2009).

DOI: 10.1016/j.datak.2008.08.008

[8] Grira N, Crucianu M, Boujemaa N. Active semi-supervised fuzzy clustering[J]. Pattern Recognition, 41(5): 1834-1844, (2008).

DOI: 10.1016/j.patcog.2007.10.004

[9] Kulis B, Basu S, Dhillon I, Mooney R. Semi-supervised graph clustering: a kernel approach[J]. Machine Learning, 74(1): 1-22, (2009).

DOI: 10.1007/s10994-008-5084-4

[10] Yin X, Chen S, Hu E, Zhang D. Semi-supervised clustering with metric learning: An adaptive kernel method[J]. Pattern Recognition, 43(4): 1320-1333, (2010).

DOI: 10.1016/j.patcog.2009.11.005

[11] Baghshah M S, Shouraki S B. Kernel-based metric learning for semi-supervised clustering[J]. Neurocomputing, 73(7-9): 1352-1361, (2010).

DOI: 10.1016/j.neucom.2009.12.009

[12] Chen Y, Rege M, Dong M, Hua J. Non-negative matrix factorization for semi-supervised data clustering[J]. Knowledge and Information Systems, 17(3): 355-379, (2008).

DOI: 10.1007/s10115-008-0134-6

[13] Asuncion A, Newman D. UCI machine learning repository. Available online at http://archive.ics.uci.edu/ml/datasets.htm.
