Frequent Pattern Mining Based on Pattern Space Division in Map/Reduce Cluster

Qian Liu; Ming Chen

doi:10.4028/www.scientific.net/AMR.588-589.2038



Abstract:

Using pattern space division on top of Map/Reduce, the problem of processing the many-to-many correspondence between the data set and the pattern set is converted into the problem of processing the many-to-many correspondence between data subsets and the pattern subspaces associated with the frequent 1-itemsets. This reduces the set of intermediate key/value pairs so sharply that the single-Map-node bottleneck caused by the combinatorial explosion of the candidate pattern space is avoided. Over three rounds of Map/Reduce tasks, the pattern space is constructed and divided, the filtering rules are established and applied, and frequent patterns are mined in each pattern subspace independently. By exploiting both the properties common to the entire pattern space and the characteristics specific to each pattern subspace, an optimized non-recursive algorithm is designed and implemented to improve the efficiency of the mining phase.
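The three rounds can be illustrated with a small single-process sketch. It is not the authors' implementation: the abstract does not give the item ordering, the filtering rules, or the optimized non-recursive algorithm, so the choices below are assumptions. Items are ordered by ascending support, each pattern is assigned to the subspace of its first item under that ordering, and the mining step is a plain iterative enumeration. All function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def frequent_items(transactions, min_support):
    """Round 1: count single items and keep the frequent 1-itemsets."""
    counts = Counter(item for t in transactions for item in set(t))
    return {item: c for item, c in counts.items() if c >= min_support}


def divide(transactions, freq):
    """Round 2: drop infrequent items, then route each suffix of the
    ordered transaction to the subspace keyed by the suffix's first item."""
    # Assumed ordering: ascending support, ties broken by item name.
    ordered = sorted(freq, key=lambda item: (freq[item], item))
    rank = {item: i for i, item in enumerate(ordered)}
    subsets = defaultdict(list)
    for t in transactions:
        kept = sorted((i for i in set(t) if i in freq), key=rank.__getitem__)
        for pos, item in enumerate(kept):
            # Only items ranked after `item` can extend its patterns.
            subsets[item].append(tuple(kept[pos + 1:]))
    return subsets


def mine_subspace(anchor, projected, min_support):
    """Round 3: count every pattern starting with `anchor`, without recursion.

    Enumeration is exponential in the projected transaction length, so this
    is only suitable for illustration on small inputs.
    """
    counts = Counter()
    for rest in projected:
        for k in range(len(rest) + 1):
            for combo in combinations(rest, k):
                counts[(anchor,) + combo] += 1
    return {p: c for p, c in counts.items() if c >= min_support}


def mine(transactions, min_support):
    """Run the three rounds in sequence; each subspace is mined independently."""
    freq = frequent_items(transactions, min_support)
    subsets = divide(transactions, freq)
    result = {}
    for anchor, projected in subsets.items():
        result.update(mine_subspace(anchor, projected, min_support))
    return result


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c", "d"]]
    for pattern, support in sorted(mine(data, 3).items()):
        print(pattern, support)
```

In this sketch the intermediate keys are the frequent 1-itemsets rather than the candidate patterns themselves, which is the reduction in intermediate key/value pairs that the abstract describes. In a real cluster each call to `mine_subspace` would correspond to one reduce group.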


Info:

Periodical: Advanced Materials Research (Volumes 588-589)

Pages: 2038-2041

DOI: https://doi.org/10.4028/www.scientific.net/AMR.588-589.2038

Online since: November 2012

Authors: Qian Liu, Ming Chen

Keywords: Cloud Computing, Data Mining (DM), Frequent Pattern, MapReduce, Pattern Space Division

