Maximal Frequent Item Sequences Mining of Datasets with few Attributes and Large Instances

Abstract:

This work proposes a fast algorithm, MFISS-FG (maximal frequent item sequence sets fast generating), for finding maximal frequent item sequences in a relational database. It is suited to datasets with few attributes and many instances. For mining purposes, an itemset is defined as an item sequence (IS). A single scan of the database creates two lists, the Item Sequence List (ISL) and the Frequent Item Sequence List (FISL), which divide the n-ISs into two categories according to whether an IS reaches the minimum support count (n is the number of attributes). Sub item sequences (SISs) whose n-superset is in ISL are generated recursively so that each k-SIS appears before its (k+1)-superset (k ranges from 1 to n-1). When a k-SIS is added to FISL, its (k-1)-SISs are pruned (k ranges from 2 to n-1). Finally, all SISs whose n-superset is in FISL are pruned from FISL, leaving FISL holding all maximal frequent item sequences. We compare MFISS-FG with FP-Growth in a set of running-time experiments, which show that MFISS-FG outperforms FP-Growth both as the dataset grows and as the minimum support changes.
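The abstract gives only an outline of the procedure, so the following Python sketch is an illustration of the general idea and not the authors' MFISS-FG implementation. It assumes each row is a tuple with one value per attribute, represents an item as an (attribute index, value) pair, and enumerates sub item sequences by brute force instead of reproducing the paper's recursion order and incremental pruning. The function name `maximal_frequent_sequences` and all variable names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def maximal_frequent_sequences(rows, min_support):
    """Brute-force sketch: maximal frequent item sequences of a small table."""
    # Single scan: count every distinct full row (n-IS), tagging each value
    # with its attribute index so equal values in different columns differ.
    n_counts = Counter(tuple(enumerate(row)) for row in rows)

    # Split the n-ISs by the minimum support count (FISL vs. ISL).
    fisl = {frozenset(seq) for seq, c in n_counts.items() if c >= min_support}
    isl = {seq: c for seq, c in n_counts.items() if c < min_support}

    # Accumulate support for every proper sub item sequence of the ISL rows.
    support = Counter()
    for seq, count in isl.items():
        for k in range(1, len(seq)):
            for sub in combinations(seq, k):
                support[frozenset(sub)] += count

    # Keep frequent sub-sequences that are not contained in a frequent n-IS
    # (those would be frequent but never maximal).
    frequent = {
        sub for sub, c in support.items()
        if c >= min_support and not any(sub < full for full in fisl)
    }

    # Drop any sub-sequence that is a proper subset of another frequent one.
    maximal_subs = {s for s in frequent if not any(s < t for t in frequent)}
    return fisl | maximal_subs


if __name__ == "__main__":
    table = [("a", "x"), ("a", "x"), ("a", "y"), ("a", "z"), ("b", "y")]
    for seq in maximal_frequent_sequences(table, min_support=2):
        print(sorted(seq))
```

Enumerating every sub-sequence costs on the order of 2^n per distinct row, which is only tolerable because n, the number of attributes, is assumed to be small; the number of instances affects only the initial counting pass.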

Pages: 3304-3308

Online since: December 2010

Copyright: © 2011 Trans Tech Publications Ltd. All Rights Reserved
