Massive Data Analysis Based on Hadoop Distributed Parallel Computing Framework
Abstract:
With the growth of the Internet, the mobile Internet, and the Internet of Things, the era of massive data has arrived. Hadoop is a distributed system that provides reliable parallel computing on clusters of inexpensive PCs. It offers strong advantages in scalability, robustness, computing performance, and cost, and it has become a mainstream platform for big data analysis. This paper enumerates the characteristics used to classify big data analysis and focuses on the system architecture of the Pig platform. It describes the functions of the front end and back end of the Pig engine. When the Pig platform processes a Pig Latin script, it produces a logical plan, a physical plan, and a MapReduce plan in turn. The paper illustrates this process with examples.
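As a rough illustration of the final stage mentioned in the abstract, the sketch below shows how a simple group-and-count operation can be expressed as map, shuffle, and reduce steps. It is a toy model in plain Python, not Pig's actual compiler or the paper's own example; the sample records and the function names (map_phase, shuffle_phase, reduce_phase, run) are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical input records (user, url), standing in for loaded data.
records = [("alice", "a.com"), ("bob", "b.com"), ("alice", "c.com")]


def map_phase(rows):
    # Emit (key, 1) for each record; the key is the grouping field.
    return [(user, 1) for user, _url in rows]


def shuffle_phase(pairs):
    # Collect values by key, as happens between the map and reduce steps.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped):
    # Aggregate each key's values; here the aggregate is a count.
    return {key: sum(values) for key, values in grouped.items()}


def run(rows):
    # Chain the three steps, mirroring a single MapReduce job.
    return reduce_phase(shuffle_phase(map_phase(rows)))


counts = run(records)
print(counts)  # {'alice': 2, 'bob': 1}
```

In this simplified view, a grouping operator supplies the map key and an aggregate such as a count becomes the reduce function; a real plan may span several chained jobs.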
Info:
Pages:
2765-2769
Online since:
August 2013
Copyright:
© 2013 Trans Tech Publications Ltd. All Rights Reserved