Improved BVBUC Algorithm to Discover Closed Itemsets in Long Biological Datasets

Abstract:

Mining closed frequent itemsets requires an algorithm to mine the frequent itemsets and then determine their closures. The efficiency of closure computation is very important because it determines the total mining time and the memory required. Over the years, many closure computation methods have been proposed to improve both. However, to the best of our knowledge, there is no suitable method that can be adapted for algorithms that enumerate the rowset lattice, an approach that is effective for biological datasets. Therefore, this paper proposes a method for computing closures and compares it with the method used in the BVBUC algorithm. The resulting algorithm, BVBUC_I, is proposed, and the performance of both algorithms was evaluated using two synthetic datasets and three real datasets. The results of these tests demonstrated the efficiency of the proposed method.
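
For readers unfamiliar with the terms, the notion of closure referred to above can be sketched in a few lines of Python. This is a generic textbook illustration only, not the BVBUC or BVBUC_I closure method, which this abstract does not describe. The function names (`rowset_to_itemset`, `itemset_to_rowset`, `closure`) and the toy dataset are assumptions introduced for the example.

```python
from functools import reduce

def rowset_to_itemset(rows, rowset):
    """Items shared by every row in the (non-empty) rowset."""
    return reduce(lambda a, b: a & b, (rows[r] for r in rowset))

def itemset_to_rowset(rows, itemset):
    """Ids of all rows that contain every item of the itemset."""
    return frozenset(r for r, items in rows.items() if itemset <= items)

def closure(rows, itemset):
    """Closure of an itemset: the items common to all rows supporting it."""
    support = itemset_to_rowset(rows, itemset)
    if not support:
        # Simplification for this sketch: an unsupported itemset is returned as-is.
        return frozenset(itemset)
    return rowset_to_itemset(rows, support)

def is_closed(rows, itemset):
    """An itemset is closed when it equals its own closure."""
    return closure(rows, itemset) == frozenset(itemset)

# Hypothetical toy dataset: row id -> set of items.
rows = {
    1: frozenset({"a", "b", "c"}),
    2: frozenset({"a", "b", "d"}),
    3: frozenset({"a", "b"}),
    4: frozenset({"c", "d"}),
}

closure_of_a = closure(rows, frozenset({"a"}))
```

In this toy dataset, every row that contains `a` also contains `b`, so `{a}` is not closed and its closure is `{a, b}`. A rowset-enumeration miner works in the opposite direction: it picks a set of rows and intersects them to obtain the itemset directly, which is attractive when a dataset has few rows and very many items.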

Pages: 157-167

Online since: June 2019

Copyright: © 2019 Trans Tech Publications Ltd. All Rights Reserved
