Treatment and Research of Massive Data Mining Based on Cloud Computing

Abstract:

This paper introduces a version of the SPRINT algorithm optimized for the Hadoop core framework. Following the data mining process, we study the MapReduce programming model used in cloud computing, improve and optimize the SPRINT algorithm to fit that model, and port the optimized algorithm to the Hadoop platform for distributed data processing.
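The abstract does not describe the optimization itself, so the following is only a minimal sketch of how SPRINT's core step, finding the lowest-Gini split over sorted attribute lists, might be phrased as map and reduce functions. It runs in plain Python rather than on Hadoop, and every name in it (`map_record`, `reduce_attribute`, `best_split`) is a hypothetical helper chosen for illustration, not taken from the paper.

```python
from collections import Counter, defaultdict


def gini(counts, total):
    """Gini index of a class histogram: 1 - sum(p_c ** 2)."""
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def map_record(record, label):
    """Map step (assumed): emit one (attribute, (value, label)) pair per attribute."""
    for attribute, value in record.items():
        yield attribute, (value, label)


def reduce_attribute(attribute, entries):
    """Reduce step (assumed): scan one sorted attribute list for its best split."""
    entries = sorted(entries)
    total = len(entries)
    below = Counter()                                # class histogram left of the split
    above = Counter(label for _, label in entries)   # class histogram right of the split
    best_score, best_threshold = float("inf"), None
    for i in range(total - 1):
        value, label = entries[i]
        below[label] += 1
        above[label] -= 1
        next_value = entries[i + 1][0]
        if value == next_value:
            continue  # no split point between equal values
        n_below, n_above = i + 1, total - (i + 1)
        score = (n_below * gini(below, n_below)
                 + n_above * gini(above, n_above)) / total
        if score < best_score:
            best_score, best_threshold = score, (value + next_value) / 2
    return attribute, best_threshold, best_score


def best_split(records, labels):
    """Local stand-in for the shuffle phase: group map output by attribute."""
    grouped = defaultdict(list)
    for record, label in zip(records, labels):
        for attribute, pair in map_record(record, label):
            grouped[attribute].append(pair)
    results = [reduce_attribute(a, e) for a, e in grouped.items()]
    return min(results, key=lambda r: r[2])
```

Calling `best_split` on a list of numeric attribute dictionaries and a parallel list of class labels returns the attribute, threshold, and weighted Gini score of the best split. Keying the map output by attribute means each reducer sees one complete attribute list, which mirrors SPRINT's per-attribute lists; a real Hadoop job would also need to carry record identifiers so that the chosen split can partition the other lists.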

Info:

Periodical:

Advanced Materials Research (Volumes 765-767)

Pages:

941-944

Online since:

September 2013

Copyright:

© 2013 Trans Tech Publications Ltd. All Rights Reserved
