An Algorithm for Mining Frequent Stream Data Items Using Hash Function and Fading Factor


Abstract:

A new algorithm for mining frequent items in a data stream is presented. The algorithm adopts a time fading factor to emphasize the importance of relatively newer data, and records the densities of the data items in hash tables. For a given density threshold S and an integer k, the algorithm can mine the top k frequent items. The computation time for processing each data item is O(1). Experimental results show that the algorithm outperforms other methods in terms of accuracy, memory requirement, and processing speed.
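The abstract does not give the update rule, so the following is only a minimal sketch of how a fading factor and a hash table could combine to give O(1) work per item. It is not the authors' algorithm. The class name `FadingCounter`, the parameter `fading_factor`, the rule "multiply the stored density by the fading factor once per elapsed time step, then add 1", and the use of a single Python dict in place of the paper's hash tables are all assumptions made for illustration. The sketch also keeps one entry per distinct item, so it does not reproduce whatever memory bound the paper achieves.

```python
import heapq


class FadingCounter:
    """Decayed-density counter (illustrative; not the paper's data structure)."""

    def __init__(self, fading_factor=0.5):
        if not 0.0 < fading_factor < 1.0:
            raise ValueError("fading_factor must be in (0, 1)")
        self.fading_factor = fading_factor  # assumed decay per time step
        self.now = 0                        # number of items seen so far
        self.table = {}                     # item -> (density, last update time)

    def add(self, item):
        """Process one stream item with a single hash lookup and update."""
        self.now += 1
        density, last = self.table.get(item, (0.0, self.now))
        # Lazy decay: apply all the fading owed since the last update at once.
        decayed = density * self.fading_factor ** (self.now - last)
        self.table[item] = (decayed + 1.0, self.now)

    def density(self, item):
        """Return the item's density decayed to the current time."""
        density, last = self.table.get(item, (0.0, self.now))
        return density * self.fading_factor ** (self.now - last)

    def top_k(self, k, threshold):
        """Return up to k items with density >= threshold, densest first."""
        scored = [(self.density(x), x) for x in self.table]
        frequent = [pair for pair in scored if pair[0] >= threshold]
        best = heapq.nlargest(k, frequent, key=lambda pair: pair[0])
        return [x for _, x in best]
```

Under these assumptions, feeding the stream `a, a, b` with a fading factor of 0.5 leaves `b` with density 1.0 and `a` with density 0.75, so the more recent item ranks first even though `a` occurred more often. The lazy decay is what keeps `add` constant time: no other entry is touched when an item arrives.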


Pages: 2661-2665

Online since: October 2011

Copyright: © 2012 Trans Tech Publications Ltd. All Rights Reserved
