Research on Dynamic Data Streams Clustering Algorithm –Pdstream Based on PCA and Density

Mei Zheng; Chun Hua Ju; Zhang Rui

doi:10.4028/www.scientific.net/AMM.26-28.108

Abstract:

Data stream clustering has become a focus of research in data stream mining. Because the volume of data in a stream is very large and a computer has limited memory and processing time, clustering is difficult to carry out quickly and effectively. To address this problem, we design an improved clustering algorithm for dynamic data streams, PDStream, based on principal component analysis and density. PDStream overcomes two shortcomings of earlier methods: the STREAM algorithm is dominated by historical data, and the CluStream algorithm has difficulty describing non-spherical clusters and phasing out "old data", which leads to a huge accumulation of data. In experiments comparing PDStream with the STREAM algorithm, PDStream shows superior handling of massive data and high clustering quality.
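
The abstract does not give the steps of PDStream, so the following is only a rough illustration of the three ingredients it and the keywords name: a sliding window that discards old data, PCA to reduce dimensionality, and density-based grouping that can follow non-spherical shapes. It is not the authors' algorithm. The class `SlidingWindowClusterer`, the helper functions, the grid-cell density rule, and all parameters (`window`, `k`, `cell`, `min_pts`) are assumptions made for this sketch.

```python
from collections import deque
from itertools import product

import numpy as np


def pca_project(X, k):
    """Project the rows of X onto their top-k principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T


def dense_cell_clusters(Z, cell, min_pts):
    """Label points by connected groups of dense grid cells; -1 marks noise."""
    keys = [tuple(int(v) for v in row) for row in np.floor(Z / cell)]
    counts = {}
    for key in keys:
        counts[key] = counts.get(key, 0) + 1
    # A cell is "dense" if it holds at least min_pts points (assumed rule).
    dense = {key for key, n in counts.items() if n >= min_pts}
    offsets = list(product((-1, 0, 1), repeat=Z.shape[1]))
    cell_label, next_label = {}, 0
    for start in dense:
        if start in cell_label:
            continue
        # Flood fill over neighbouring dense cells to grow one cluster.
        cell_label[start] = next_label
        stack = [start]
        while stack:
            cur = stack.pop()
            for off in offsets:
                nb = tuple(c + o for c, o in zip(cur, off))
                if nb in dense and nb not in cell_label:
                    cell_label[nb] = next_label
                    stack.append(nb)
        next_label += 1
    return np.array([cell_label.get(key, -1) for key in keys])


class SlidingWindowClusterer:
    """Hypothetical helper: keeps only the newest `window` points."""

    def __init__(self, window=500, k=2, cell=1.0, min_pts=5):
        self.buf = deque(maxlen=window)  # old points fall out automatically
        self.k, self.cell, self.min_pts = k, cell, min_pts

    def add(self, x):
        self.buf.append(np.asarray(x, dtype=float))

    def cluster(self):
        X = np.vstack(self.buf)
        Z = pca_project(X, self.k)
        return dense_cell_clusters(Z, self.cell, self.min_pts)
```

In use, each arriving point is passed to `add`, and `cluster` is called whenever labels for the current window are needed. It returns one label per point in the window, with -1 for points in sparse cells. Recomputing PCA on every call is the simplest choice, not an efficient one. A real stream algorithm would update its summaries incrementally.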

Info:

Periodical:

Applied Mechanics and Materials (Volumes 26-28)

Pages:

108-112

DOI:

https://doi.org/10.4028/www.scientific.net/AMM.26-28.108

Online since:

June 2010

Authors:

Mei Zheng, Chun Hua Ju, Zhang Rui

Keywords:

Data Stream, Density, Principal Component Analysis (PCA), Sliding Window

References

[1] Muthukrishnan S. Data streams: algorithms and applications. In: Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms. Philadelphia: Society for Industrial and Applied Mathematics, 2003. 413-423.

[2] Guha S, Koudas N. Approximating a data stream for querying and estimation: algorithms and performance evaluation. In: Proceedings of the 18th International Conference on Data Engineering (ICDE). San Jose, California, USA: IEEE Press, 2002. 567-576.

DOI: 10.1109/icde.2002.994775

[3] Domingos P, Hulten G. Mining high-speed data streams. In: Proc. of the KDD. 2000. http://citeseer.ist.psu.edu/domingos00mining.

[4] Aggarwal CC, Han J, Wang J, Yu PS. A framework for projected clustering of high dimensional data streams. In: Nascimento MA, Özsu MT, Kossmann D, Miller RJ, Blakeley JA, Schiefer KB, eds. Proc. of the VLDB. Toronto: Morgan Kaufmann Publishers, 2004. 852-863.

DOI: 10.1016/b978-012088469-8.50075-9

[5] Chang Jian-Long, Cao Feng, Zhou Ao-Ying. Clustering evolving data streams over sliding windows. Journal of Software, 2007, 18(4): 905-918.

DOI: 10.1360/jos180905

[6] Guha S, Mishra N, Motwani R, O'Callaghan L. Clustering data streams. In: FOCS 2000. 359-366.

[7] Aggarwal C, Han J, Wang J, et al. A framework for clustering evolving data streams. In: Proceedings of the 29th International Conference on Very Large Databases. Berlin, Germany: Morgan Kaufmann Publishers, 2003. 81-92.