Research of Hadoop Parameters Tuning Based on Function Monitoring

Abstract:

Hadoop is a popular software framework that supports distributed processing of large data sets. However, because Hadoop is a relatively new technology, practitioners and administrators often lack the expertise to tune it for better performance. Parameter configuration is one of the key factors that influence Hadoop's performance. In this article, we present a novel Hadoop parameter tuning method based on function monitoring. The method monitors function call information while tasks run in order to analyze why Hadoop's performance changes when parameters are tuned, which helps practitioners and administrators choose parameter settings that yield better performance.
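The idea of relating a parameter change to a change in function call behaviour can be illustrated with a small sketch. It is not the method from the article, whose instrumentation details the abstract does not give. Hadoop itself is written in Java; Python is used here only for brevity. The names `CallMonitor`, `run_task`, `spill`, and `buffer_size` are hypothetical, with `buffer_size` standing in for a tunable parameter in the spirit of a map-side sort buffer size.

```python
import time
from collections import defaultdict
from functools import wraps


class CallMonitor:
    """Collects call counts and cumulative wall-clock time per function name."""

    def __init__(self):
        self.calls = defaultdict(int)
        self.seconds = defaultdict(float)

    def watch(self, func):
        """Wrap func so that every call is counted and timed."""
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                self.calls[func.__name__] += 1
                self.seconds[func.__name__] += time.perf_counter() - start
        return wrapper


def run_task(records, buffer_size, monitor):
    """Toy task (an assumption, not Hadoop code): buffer records and
    'spill' a sorted run whenever the buffer reaches buffer_size."""
    spilled = []

    @monitor.watch
    def spill(buffer):
        spilled.append(sorted(buffer))

    buffer = []
    for record in records:
        buffer.append(record)
        if len(buffer) >= buffer_size:
            spill(buffer)
            buffer = []
    if buffer:  # flush whatever is left at the end of the task
        spill(buffer)
    return spilled


def profile(buffer_size, records):
    """Run the toy task once and return the observed call counts."""
    monitor = CallMonitor()
    run_task(records, buffer_size, monitor)
    return dict(monitor.calls)


records = list(range(1000, 0, -1))
small = profile(100, records)   # small buffer: many spill calls
large = profile(500, records)   # large buffer: few spill calls
print(small, large)
```

Comparing the two profiles shows the kind of evidence function monitoring can provide: the larger buffer leads to fewer `spill` calls, which suggests why a run with that setting might finish sooner. The cumulative times in `monitor.seconds` could be compared in the same way.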

Pages:

264-270

Online since:

August 2014

Copyright:

© 2014 Trans Tech Publications Ltd. All Rights Reserved
