Data Stream Clustering Algorithm Based on Affinity Propagation and Density

Abstract:

Data stream clustering is an important issue in data stream mining. Conventional methods are not efficient enough for data stream analysis, because they cannot adapt to the dynamic environment of a data stream, and their mining models and results do not meet users' needs. A clustering method based on affinity propagation (AP) and grids is proposed to address this problem. The algorithm applies AP clustering to each partition of the data stream to generate a set of reference points, and density-based clustering is then applied to these reference points to obtain the clustering result for each period. Theoretical analysis and experimental results show that the method is effective and efficient.
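
As a rough illustration of the two-stage idea described in the abstract, the following Python sketch runs a plain affinity propagation on each partition, keeps the exemplars as reference points, and then groups those reference points with a DBSCAN-style density clustering. It is not the authors' implementation. The function names (`ap_reference_points`, `density_cluster`, `cluster_stream`), the parameters `eps` and `min_pts`, the damping factor, the median preference, the fallback when no exemplar emerges, and the choice to cluster each period's reference points on their own are all assumptions made for illustration.

```python
from statistics import median


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def ap_reference_points(points, damping=0.5, iterations=100):
    """Plain affinity propagation; returns the indices of the exemplars."""
    n = len(points)
    if n <= 1:
        return list(range(n))
    # Similarity = negative squared distance; preference = median similarity.
    s = [[-sq_dist(points[i], points[k]) for k in range(n)] for i in range(n)]
    pref = median(s[i][k] for i in range(n) for k in range(n) if i != k)
    for i in range(n):
        s[i][i] = pref
    r = [[0.0] * n for _ in range(n)]  # responsibilities
    a = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        # r(i,k) <- s(i,k) - max over k' != k of [a(i,k') + s(i,k')]
        for i in range(n):
            for k in range(n):
                best = max(a[i][j] + s[i][j] for j in range(n) if j != k)
                r[i][k] = damping * r[i][k] + (1 - damping) * (s[i][k] - best)
        # a(i,k) <- min(0, r(k,k) + sum of positive r(i',k), i' not in {i,k})
        # a(k,k) <- sum of positive r(i',k), i' != k
        for k in range(n):
            pos = [max(0.0, r[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, r[k][k] + total - pos[i])
                a[i][k] = damping * a[i][k] + (1 - damping) * new
    score = [r[k][k] + a[k][k] for k in range(n)]
    exemplars = [k for k in range(n) if score[k] > 0]
    # Assumed fallback: if no exemplar emerged, keep the best-scoring point.
    return exemplars or [max(range(n), key=score.__getitem__)]


def density_cluster(points, eps, min_pts):
    """DBSCAN-style labels: cluster ids 0, 1, ... and -1 for noise."""
    n = len(points)
    eps2 = eps * eps
    neigh = [[j for j in range(n) if sq_dist(points[i], points[j]) <= eps2]
             for i in range(n)]
    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neigh[i]) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        labels[i] = cluster
        frontier = list(neigh[i])
        while frontier:
            j = frontier.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neigh[j]) >= min_pts:
                frontier.extend(neigh[j])  # expand only from core points
        cluster += 1
    return labels


def cluster_stream(partitions, eps, min_pts):
    """Per partition (period): AP -> reference points -> density clustering."""
    results = []
    for part in partitions:
        refs = [part[k] for k in ap_reference_points(part)]
        results.append((refs, density_cluster(refs, eps, min_pts)))
    return results
```

Under these assumptions, a call such as `cluster_stream([partition_1, partition_2], eps=1.0, min_pts=1)` would return one `(reference_points, labels)` pair per partition. The grid component mentioned in the abstract is not modelled in this sketch.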

Pages: 444-449

Online since: June 2011

Copyright: © 2011 Trans Tech Publications Ltd. All Rights Reserved
