Research on Parallel Association Rules Mining Algorithm Based on Hadoop

Abstract:

The purpose of association rules mining is to find rules that meet the minimum support and minimum confidence thresholds in a large quantity of data. To find valid association rules efficiently, we comprehensively analyze several well-known parallel association rules mining algorithms and propose a new parallel association rules mining algorithm, ABH (Array Based on Hadoop), built on the cloud computing platform. ABH scans the database only once; it uses a 0/1 array to represent each transaction and records the frequency of identical transactions. Moreover, by exploiting the random-access property of arrays and the special nature of frequent itemsets, ABH effectively reduces the number of candidate frequent itemsets and finds the frequent itemsets quickly. We compared ABH experimentally with two classical algorithms, CD and DD; the results show that ABH outperforms both.
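
The single-scan 0/1 encoding described in the abstract can be illustrated with a small single-machine sketch. This is not the authors' ABH implementation and involves no Hadoop or MapReduce; the function names (`encode`, `support`, `confidence`) and the choice of a tuple-keyed counter are assumptions made purely for illustration. It shows only how one pass over the data can turn each transaction into a 0/1 array, merge identical transactions into a frequency count, and then answer support and confidence queries from those counts.

```python
from collections import Counter

def encode(transactions, items):
    """One pass over the data: map each transaction to a 0/1 tuple
    (fixed item order) and count how often each distinct tuple occurs."""
    index = {item: i for i, item in enumerate(items)}
    counts = Counter()
    for transaction in transactions:
        row = [0] * len(items)
        for item in transaction:
            row[index[item]] = 1
        counts[tuple(row)] += 1
    return index, counts

def support(itemset, index, counts):
    """Fraction of transactions containing every item in itemset,
    computed from the merged 0/1 rows instead of the raw database."""
    positions = [index[item] for item in itemset]
    total = sum(counts.values())
    hits = sum(n for row, n in counts.items()
               if all(row[p] for p in positions))
    return hits / total if total else 0.0

def confidence(antecedent, consequent, index, counts):
    """Confidence of the rule antecedent -> consequent."""
    base = support(antecedent, index, counts)
    if base == 0:
        return 0.0
    joint = support(set(antecedent) | set(consequent), index, counts)
    return joint / base
```

With hypothetical transactions `[{"a", "b"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]` over items `["a", "b", "c"]`, the two identical transactions collapse into one row with frequency 2, and the rule a -> b would be kept only if its support and confidence both meet the chosen minimum thresholds.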

Info:

Periodical:

Applied Mechanics and Materials (Volumes 543-547)

Pages:

3625-3631

DOI:

https://doi.org/10.4028/www.scientific.net/AMM.543-547.3625

Online since:

March 2014

Authors:

Shao Rong Feng*, Lin Bao Ye, Zi Yu Lin

Keywords:

Association Rule, Cloud Computing Platform, Data Mining (DM), Parallel Algorithm, Transaction Array

Copyright:

© 2014 Trans Tech Publications Ltd. All Rights Reserved

* - Corresponding Author
