Consensus Function Based on Matrix Factorization

Abstract:

Clustering ensembles are known to be an effective way to improve the robustness and stability of cluster analysis. A clustering ensemble works in two steps: first, a large set of clustering partitions is generated using base clustering algorithms; second, the partitions are combined by a consensus function to obtain the final clustering result. The key component of a clustering ensemble is therefore a suitable consensus function. Recent research has proposed using matrix factorization to solve the clustering ensemble problem. In this paper, we first analyze several traditional matrix factorization algorithms; second, we propose a new consensus function based on binary nonnegative matrix factorization (BMF) and give an optimization algorithm for BMF; last, we propose a new framework for the clustering ensemble algorithm and report experiments on data sets from the UCI Machine Learning Repository. The experiments show that the new algorithm is effective and that clustering performance can be significantly improved.
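The two-step scheme described in the abstract can be illustrated with a minimal sketch. It is not the paper's BMF algorithm, whose update rules are not given in this preview. It assumes the base partitions are already available as label vectors, builds a co-association matrix from them, and uses a symmetric nonnegative factorization with multiplicative updates followed by a row-wise argmax as a stand-in consensus function. The function names, the toy partitions, and the parameters `k`, `n_iter`, and `seed` are all hypothetical.

```python
import numpy as np

def co_association(partitions):
    """Entry (i, j) is the fraction of base partitions that place
    samples i and j in the same cluster."""
    partitions = np.asarray(partitions)
    n = partitions.shape[1]
    M = np.zeros((n, n))
    for labels in partitions:
        M += (labels[:, None] == labels[None, :]).astype(float)
    return M / len(partitions)

def symmetric_nmf_consensus(M, k, n_iter=300, seed=0, eps=1e-9):
    """Approximate M by H @ H.T with H >= 0, then assign each sample to
    the column with the largest weight. This hard assignment is a simple
    stand-in for a binary factor, not the BMF optimization of the paper."""
    rng = np.random.default_rng(seed)
    H = rng.random((M.shape[0], k)) + 0.1  # strictly positive start
    for _ in range(n_iter):
        numer = M @ H
        denom = H @ (H.T @ H) + eps  # eps avoids division by zero
        H *= 0.5 + 0.5 * numer / denom  # damped multiplicative update
    return H.argmax(axis=1)

# Step 1 (assumed done elsewhere): four base partitions of six samples.
base_partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same grouping, different label names
    [0, 0, 1, 1, 1, 1],  # one sample assigned differently
    [0, 0, 0, 0, 1, 1],  # one sample assigned differently
]

# Step 2: combine the partitions with a consensus function.
M = co_association(base_partitions)
consensus = symmetric_nmf_consensus(M, k=2)
print(consensus)
```

Working on the co-association matrix makes the result independent of how each base partition names its clusters, which is why the second toy partition counts as agreeing with the first.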

Info:

Pages:

15-19

Online since:

November 2012

Copyright:

© 2012 Trans Tech Publications Ltd. All Rights Reserved
