Equivalence Class Based Parallel Algorithm for Mining MFI

Hui Wang

doi:10.4028/www.scientific.net/AMM.713-715.1712

Paper Titles

EL-SVDD: An Improved and Localized Multi-Class Classification Algorithm
p.1693

Energy Minimization Model Based Target Tracking
p.1699

Energy-Saving Scheduling Algorithm of Public Transport Bus Based on Passenger Flow Distribution in Different Time
p.1703

Environmental Quality Assessment Based on a New Kind of Neural Network Using Rational Spline Weight Functions
p.1708

Equivalence Class Based Parallel Algorithm for Mining MFI
p.1712

Error Analysis of Neural Network with Multiply Neurons Using Rational Spline Weight Functions
p.1716

Fast Inpainting Algorithm of Woodblock New Year Pictures Damaged Area Based on Combination of Texture Distribution Analysis and Anisotropic Diffusion Technique
p.1721

Global Stability of an Age Structured SEIRS Epidemic Model with Screening
p.1725

High-Rise Residential Complex Wind Aerodynamics Simulation
p.1729

HomeApplied Mechanics and MaterialsApplied Mechanics and Materials Vols. 713-715Equivalence Class Based Parallel Algorithm for...

Equivalence Class Based Parallel Algorithm for Mining MFI

Abstract:

We present a novel and powerful parallel algorithm, PMFI, for mining all the maximal frequent itemsets from a big database. PMFI utilizes novel technologies to make the I/O overhead down drastically. The key principle is to utilize prefix-based equivalence classes to decompose the search space. It distributes the work among the processors by equivalence class weights. It re-represents the database with vertical format, so the frequency counting can be done by simple tid-list intersection operations. It bases a novel serial algorithm MaxMining which utilizes multiple-level backtrack pruning strategy, so that each processor can count the maximal frequent itemsets independently by selectively duplicating the pieces of database. These techniques eliminate the need for synchronization. The dynamic load balance schema is applied in PMFI, it would be hopeful to achieve better performance.

You might also be interested in these eBooks

View Preview

Info:

Periodical:

Applied Mechanics and Materials (Volumes 713-715)

Pages:

1712-1715

DOI:

https://doi.org/10.4028/www.scientific.net/AMM.713-715.1712

Citation:

Cite this paper

Online since:

January 2015

Authors:

Hui Wang*

Keywords:

Big Data, Data Mining, Equivalence Class, MFI, Parallel Algorithm

Export:

RIS, BibTeX

Price:

Permissions CCC:

Request Permissions

Permissions PLS:

Request Permissions

Сopyright:

Citation:

* - Corresponding Author

References

[1] Xingdong Wu, Gong-Qing Wu, Wei Ding. Data Mining with Big Data[J]. IEEE Transactions on Knowledge and Data Engineering, Vol. 26, No. 1, January 2014. pp.97-107.

DOI: 10.1109/tkde.2013.109

Google Scholar

[2] A. Rajaraman and J. Ullman, Mining of Massive Data Sets. Cambridge University Press, (2011).

Google Scholar

[3] A. Labrinidis and H. Jagadish, Challenges and Opportunities with Big Data, Proc. VLDB Endowment, vol. 5, no. 12, 2012, p.2032-(2033).

DOI: 10.14778/2367502.2367572

Google Scholar

[4] E. Schadt, The Changing Privacy Landscape in the Era of Big Data, Molecular Systems, Vol. 8, (2012).

Google Scholar

[5] C. Wang, S.S.M. Chow, Q. Wang, K. Ren, and W. Lou, Privacy-preserving Public Auditing for Secure Cloud Storage, IEEE Trans. Computers, vol. 62, no. 2 , Feb. 2013: pp.362-375.

DOI: 10.1109/tc.2011.245

Google Scholar

[6] Zdanowicz, J. Detecting money laundering and terrorist financing via data mining. Communications of the ACM, 2004, pp.53-55.

DOI: 10.1145/986213.986239

Google Scholar

[7] Stephen Lavelle, Fabricating Synthetic Data in Support of Training for Domestic Terrorist Activity Data Mining Research[D]. Naval Postgraduate School, Monterey, CA, (2010).

Google Scholar

[8] Office of the Director of National Intelligence. Data Mining Report. Office of the Director of National Intelligence, (2009).

Google Scholar

[9] L. Zeng, L. Li, L. Duan, K. Lu, Z. Shi, M. Wang, Distributed data mining: A survey. Information Technology and Management, 2012, pp.403-409.

Google Scholar

[10] A. Ghoting, P. Kambadur, E. Pednault, and R. Kannan. NIMBLE: A toolkit for the implementation of parallel data mining and machine learning algorithms on mapreduce. in: Proc. of 2011 ACM SIGKDD, pp.334-342.

DOI: 10.1145/2020408.2020464

Google Scholar

[11] Fumarola, Fabio, A parallel algorithm for approximate frequent itemset mining using MapReduce, in: proceedings of 2014 International Conference on High Performance Computing & Simulation (H PCS), Bologna, Italy 21-25 , July 2014 , pp.335-342.

DOI: 10.1109/hpcsim.2014.6903705

Google Scholar

[12] Farzanyar Z. , Cercone N. , Accelerating Frequent Itemsets Mining on the Cloud: A MapReduce -Based Approach, in: proceedings of 2013 IEEE 13th International Conference on Data Mining Workshops (ICDMW), pp.592-598.

DOI: 10.1109/icdmw.2013.106

Google Scholar