New Solution for Small File Storage of Hadoop Based on Prefetch Mechanism

Hui Xiang Zhou; Qiao Yan Wen

doi:10.4028/www.scientific.net/AMR.981.205

Abstract:

Hadoop has a significant performance advantage when dealing with large files, but it is ineffective at handling a large number of small files, because the physical address of every Hadoop file is stored in a single Namenode. Suppose each small file is 100 bytes: with a very large number of such files, the utilization of Namenode memory drops sharply, and the growth of the index directory caused by so many small files also slows user access to files. To solve these problems, this paper proposes a new solution for small file storage in Hadoop based on a prefetch mechanism. Experiments show that this solution effectively improves the memory utilization of the Namenode and significantly improves the speed of user access.
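The memory argument in the abstract can be made concrete with a rough back-of-the-envelope sketch. It is not taken from the paper: the figure of about 150 bytes of Namenode memory per namespace object (file or block) is a commonly quoted rule of thumb and is an assumption here, as are the function names. Merging small files into larger container files is used only as a generic point of comparison; the abstract does not say how the proposed solution stores files.

```python
# Illustrative estimate of Namenode memory used by file metadata.
# ASSUMPTION: ~150 bytes per namespace object (one per file, one per block).
# This is a rule of thumb, not a number reported in the paper.
BYTES_PER_OBJECT = 150


def namenode_metadata_bytes(num_files: int, blocks_per_file: int = 1) -> int:
    """Each file costs one file object plus one object per block."""
    return num_files * (1 + blocks_per_file) * BYTES_PER_OBJECT


def merged_metadata_bytes(num_small_files: int, files_per_container: int) -> int:
    """Metadata cost if small files are packed into container files.

    ASSUMPTION: each container file fits in a single HDFS block.
    """
    containers = -(-num_small_files // files_per_container)  # ceiling division
    return namenode_metadata_bytes(containers)


# Hypothetical workload: ten million files of 100 bytes each.
num_files = 10_000_000
payload_bytes = num_files * 100
separate = namenode_metadata_bytes(num_files)
merged = merged_metadata_bytes(num_files, files_per_container=10_000)

print(f"payload:            {payload_bytes:,} bytes")
print(f"metadata, separate: {separate:,} bytes")
print(f"metadata, merged:   {merged:,} bytes")
```

Under these assumptions the metadata for separately stored 100-byte files is larger than the data itself, which is the inefficiency the abstract describes.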

Info:

Periodical: Advanced Materials Research (Volume 981)

Pages: 205-208

DOI: https://doi.org/10.4028/www.scientific.net/AMR.981.205

Online since: July 2014

Authors: Hui Xiang Zhou*, Qiao Yan Wen

Keywords: Cloud Storage, Hadoop, HDFS, Namenode, Prefetch, Small File

* - Corresponding Author
