Maximal Frequent Item Sequences Mining of Datasets with few Attributes and Large Instances

Li Juan Zhou; Zhang Zhang; Shuang Li

doi:10.4028/www.scientific.net/AMM.44-47.3304

Abstract:

This work proposes a fast algorithm, MFISS-FG (maximal frequent item sequence sets fast generating), for finding maximal frequent item sequences in a relational database. It is suited to datasets with few attributes and many instances. For mining purposes, an itemset is defined as an item sequence (IS). A single scan of the database builds two lists, ISL (Item Sequence List) and FISL (Frequent Item Sequence List), dividing the n-IS into two categories according to whether each IS reaches the minimum support count, where n is the number of attributes. Sub item sequences (SIS) whose n-superset is in ISL are generated recursively so that each k-SIS appears before its (k+1)-supersets (k ranges from 1 to n-1). When a k-SIS is added to FISL, its (k-1)-SIS are pruned (k ranges from 2 to n-1). Finally, all SIS whose n-superset is in FISL are pruned from FISL, so that FISL holds only the maximal frequent item sequences. We compare the running time of MFISS-FG and FP-Growth in a set of experiments, which show MFISS-FG to be faster both as the dataset grows and as the minimum support changes.
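The idea in the abstract can be illustrated with a small Python sketch. This is not the authors' MFISS-FG implementation, whose details are not given here; it is a naive illustration under stated assumptions. An item sequence is assumed to be a tuple with one slot per attribute, `None` marks an attribute left out of a sub item sequence (so real data is assumed to contain no `None` values), and the names `matches` and `maximal_frequent_sequences` are hypothetical. The sketch counts full n-sequences in one scan, splits them into FISL and ISL, generates sub-sequences only from ISL, and keeps the frequent sequences that have no frequent proper supersequence.

```python
from collections import Counter
from itertools import combinations


def matches(sub, full):
    """True if `sub` agrees with `full` on every attribute that `sub` keeps."""
    return all(s is None or s == f for s, f in zip(sub, full))


def maximal_frequent_sequences(rows, min_count):
    """Naive sketch: return the maximal frequent item sequences of `rows`.

    Assumes `rows` is non-empty, every row has the same number of
    attributes, and no attribute value is None.
    """
    n = len(rows[0])
    counts = Counter(map(tuple, rows))  # the single scan of the database

    # Split the n-sequences by whether they reach the minimum support count.
    fisl = {seq for seq, c in counts.items() if c >= min_count}
    isl = [seq for seq, c in counts.items() if c < min_count]

    # Generate k-sub-sequences (1 <= k <= n-1) only from infrequent n-sequences.
    candidates = set()
    for seq in isl:
        for k in range(1, n):
            for keep in combinations(range(n), k):
                candidates.add(
                    tuple(seq[i] if i in keep else None for i in range(n))
                )

    # A sub-sequence's support is the total count of n-sequences it matches.
    frequent = set(fisl)
    for sub in candidates:
        support = sum(c for full, c in counts.items() if matches(sub, full))
        if support >= min_count:
            frequent.add(sub)

    # Keep only sequences with no frequent proper supersequence.
    return {
        s
        for s in frequent
        if not any(s != t and matches(s, t) for t in frequent)
    }
```

For example, with three rows `("a", "x")`, one row `("a", "y")`, one row `("b", "y")` and a minimum support count of 2, the sub-sequence `("a", None)` is frequent but is dropped because its supersequence `("a", "x")` is already frequent. The quadratic maximality check is chosen for readability; the pruning order described in the abstract is intended to avoid that cost.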

Info:

Periodical: Applied Mechanics and Materials (Volumes 44-47)

Pages: 3304-3308

DOI: https://doi.org/10.4028/www.scientific.net/AMM.44-47.3304

Online since: December 2010

Authors: Li Juan Zhou, Zhang Zhang, Shuang Li

Keywords: Data Mining (DM), Maximal Frequent Item Sequence, Relational Database

References

[1] Jin R, Agrawal G. An algorithm for in-core frequent itemset mining on streaming data. In: Proceedings of the 2005 International Conference on Data Mining (ICDM'05), Houston, TX, 2005, p.210–217.

DOI: 10.1109/icdm.2005.21

[2] Wu Xindong, Vipin Kumar, J. Ross Quinlan, Joydeep Ghosh, Yang Qiang, Hiroshi Motoda, Geoffrey J. McLachlan, Angus Ng, Liu Bing, Philip S. Yu, Zhou Zhihua, Michael Steinbach, David J. Hand, Dan Steinberg. Top 10 algorithms in data mining, Knowl Inf Syst (2008).

DOI: 10.1007/s10115-007-0114-2

[3] Agrawal R et al. Fast algorithms for mining association rules. In: Proc the 20th International Conference on Very Large Data Bases. Santiago de Chile, 1994. 478-499.

[4] Nicolas Pasquier et al. Efficient mining of association rules using closed itemset lattices. Information Systems, 1999, 24(1): 25-46.

[5] Agrawal R et al. Mining association rules between sets of items in large databases. In: Proc the ACM SIGMOD International Conference on Management of Data, Washington, 1993. 207-216.

DOI: 10.1145/170036.170072

[6] Cheung D et al. Efficient mining of association rules in distributed databases. IEEE Trans Knowledge and Data Engineering, 1996, 8(6): 911-922.

DOI: 10.1109/69.553158

[7] Chen M S et al. Data mining: An overview from a database perspective. IEEE Trans Knowledge and Data Engineering, 1996, 8(6): 866-883.

[8] Han J W, Yin Y W. Mining frequent patterns without candidate generation. In: Proc SIGMOD Conference, 2000, 1-12.

DOI: 10.1145/335191.335372

[9] Cheng J H et al. Multi-strategy approach to mining interesting rules. Chinese Journal of Computers, 2000, 23(1): 47-51 (in Chinese).

[10] Zhu Jiaxian. A Mining Algorithm of Association Rule Based on Linked List. Journal of Shaoxing University, 2004, 8(24): 19-22, 59.

[11] Bao Zhengyi, Wang Zhoujing. MLCI Algorithm for Mining Lower Closed Itemsets, Computing Technology and Automation, 2005. 12, 4(24): 73-76.

[12] Yang Qiang, Wu Xindong. 10 Challenging Problems in Data Mining Research, International Journal of Information Technology & Decision Making, 2006, 4(5): 597-604.

DOI: 10.1142/s0219622006002258

[13] Han Wang, Lingfu Kong. A Constrained Maximum Frequent Itemsets Incremental Mining Algorithm. In: 2007 IFIP International Conference on Network and Parallel Computing Workshops (NPC 2007), 2007, pp.743-747.

DOI: 10.1109/npc.2007.110

[14] Mao Guojun, Liu Chunnian. Mining of Association Rules Based on the Operators of Set of Item Sequences, Chinese Journal of Computers, 2002, 25(4): 417-422 (in Chinese).