Mining Approximate Frequent Itemsets over Data Streams

Abstract:

This paper proposes a method based on Lossy Counting for mining approximate frequent itemsets over data streams. A logarithmic tilted-time window is adopted to emphasize the importance of recent data. A multilayer count queue framework is used to prevent counter overflow and, together with an index table, to answer top-K itemset queries quickly.
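
For readers unfamiliar with the Lossy Counting technique the abstract builds on, the following is a minimal sketch of the basic algorithm, applied to single items rather than itemsets. It is not the paper's method: the tilted-time window, the multilayer count queue, and the index table are not modeled, and the names `LossyCounter`, `add`, and `frequent` are hypothetical and chosen only for illustration.

```python
import math


class LossyCounter:
    """Illustrative Lossy Counting over single items (hypothetical names)."""

    def __init__(self, epsilon):
        self.epsilon = epsilon                 # maximum allowed error rate
        self.width = math.ceil(1 / epsilon)    # number of items per bucket
        self.n = 0                             # items seen so far
        self.entries = {}                      # item -> [count, max_undercount]

    def add(self, item):
        self.n += 1
        bucket = math.ceil(self.n / self.width)  # id of the current bucket
        if item in self.entries:
            self.entries[item][0] += 1
        else:
            # A new item may have been seen and pruned in earlier buckets,
            # so its count may be too low by at most bucket - 1.
            self.entries[item] = [1, bucket - 1]
        if self.n % self.width == 0:
            # Bucket boundary: drop entries that cannot be frequent.
            self.entries = {
                key: value
                for key, value in self.entries.items()
                if value[0] + value[1] > bucket
            }

    def frequent(self, support):
        # Report items whose stored count reaches (support - epsilon) * n.
        threshold = (support - self.epsilon) * self.n
        return {
            key: value[0]
            for key, value in self.entries.items()
            if value[0] >= threshold
        }


counter = LossyCounter(epsilon=0.1)
for symbol in ["a", "b", "a", "c", "a", "b", "a", "d", "a", "b"]:
    counter.add(symbol)
print(counter.frequent(support=0.3))
```

Each stored count underestimates the true count by at most `epsilon * n`, which is why the query lowers the support threshold by `epsilon` before filtering.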

Pages: 536-539

Online since: October 2014

Copyright: © 2014 Trans Tech Publications Ltd. All Rights Reserved
