Design and Implementation of Massive Data Retrieving Based on Cloud Computing Platform

Abstract:

Massive data retrieval is inefficient under traditional parallel processing. To address this, we take advantage of the wide availability of the cloud computing paradigm and propose a hybrid solution based on the MapReduce model and the distributed computing framework Spark. We design and implement this solution in our lab. The results show that the solution can effectively improve the performance of massive data retrieval.

Info:

Pages: 2235-2240

Online since: February 2013

Copyright: © 2013 Trans Tech Publications Ltd. All Rights Reserved
