A Comparative Study of Tree-Based and Apriori-Based Approaches for Incremental Data Mining

Abstract:

Association rule mining is an iterative and interactive process of discovering valid, novel, useful, understandable, and hidden associations in massive databases. Such colossal databases require powerful and intelligent tools for analyzing and discovering frequent patterns and association rules. Researchers have proposed many algorithms for generating item sets, discovering frequent patterns, and mining association rules. These proposals are validated on static data. A dynamic database, however, may introduce new association rules that are interesting and helpful for making better business decisions. The performance and cost of the existing association rule mining algorithms on incremental data remain less explored. Hence, there is a strong need for a comprehensive study and in-depth analysis of the existing proposals for association rule mining. In this paper, the existing tree-based algorithms for incremental data mining are presented and compared on the basis of number of scans, structure, size, and type of database. It is concluded that the Can-Tree approach dominates the other algorithms, such as FP-Tree, FUFP-Tree, and the FELINE algorithm with CATS-Tree. This study also highlights some hot issues and future research directions, and points out that there is a strong need to devise a new and efficient algorithm for incremental data mining.

Pages: 120-130

Online since: April 2016

Copyright: © 2016 Trans Tech Publications Ltd. All Rights Reserved
