Massive Data Analysis Based on Hadoop Distributed Parallel Computing Framework

Abstract:

With the development of the Internet, the mobile Internet, and the Internet of Things, the era of massive data has arrived. The Hadoop distributed system provides reliable parallel computing services on clusters of inexpensive PCs. It offers advantages in scalability, robustness, computing performance, and cost that are hard to match, and it has become a mainstream platform for big data analysis. This paper enumerates the classification characteristics of big data analysis and focuses on the system architecture of the Pig platform. The functions of the front end and back end of the Pig engine are described in detail. When the Pig platform processes a Pig Latin script, it produces a logical plan, a physical plan, and a MapReduce plan. The concrete process is illustrated with examples.

Info:

Pages: 2765-2769

Online since: August 2013

Copyright: © 2013 Trans Tech Publications Ltd. All Rights Reserved
